As shown in #23671, these functions can return the raw error without any
useful context to the user, which makes it hard to understand where
things went wrong. Simply add some context to these error paths.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
These flags can affect the output of the HealthCheck log. Currently, when a container is configured with a HealthCheck, the output from the HealthCheck command is only logged to the container status file, which is accessible via `podman inspect`.
It is also limited to the last five executions and the first 500 characters per execution.
This makes debugging past problems very difficult, since the only information available about the failure of the HealthCheck command is the generic `healthcheck service failed` record.
- The `--health-log-destination` flag sets the destination of the HealthCheck log.
  - `none`: (default behavior) `HealthCheckResults` are stored in overlay containers. (For example: `$runroot/healthcheck.log`)
  - `directory`: creates a log file named `<container-ID>-healthcheck.log` with JSON `HealthCheckResults` in the specified directory.
  - `events_logger`: the log is written with the logging mechanism set by `events_logger`. It also saves the log to a default directory, for performance on a system with a large number of logs.
- The `--health-max-log-count` flag sets the maximum number of attempts in the HealthCheck log file.
  - A value of `0` indicates an infinite number of attempts in the log file.
  - The default value is `5` attempts in the log file.
- The `--health-max-log-size` flag sets the maximum length of the stored log.
  - A value of `0` indicates an infinite log length.
  - The default value is `500` characters.
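A hedged usage sketch combining the flags described above; the image name, health command, and log directory are placeholders, the inspect field name may differ between versions, and passing a directory path to select the directory mode is an assumption:

    podman run -d --name web \
        --health-cmd "curl -fsS http://localhost:8080/ || exit 1" \
        --health-log-destination /var/log/podman-health \
        --health-max-log-count 0 \
        --health-max-log-size 1024 \
        my-image:latest

    # The in-status results remain available as before:
    podman inspect --format '{{json .State.Health}}' web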
Add --health-max-log-count flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Add --health-max-log-size flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Add --health-log-destination flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Convert the owner UID and GID into the user namespace only when an
":idmap" mount is used.
This changes the behaviour of :idmap with an empty volume: the
existing directory ownership is now copied up, as in the other case.
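A minimal illustration of the user-facing behaviour; the volume name, ID mappings, and image are placeholders, and idmapped mounts require kernel and runtime support:

    podman volume create data
    # With :idmap the owner UID/GID are mapped into the container's user
    # namespace; an empty volume now copies up the existing directory
    # ownership as described above.
    podman run --rm --uidmap 0:100000:65536 --gidmap 0:100000:65536 \
        -v data:/srv:idmap alpine stat -c '%u:%g' /srv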
Closes: https://github.com/containers/podman/issues/23347
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
I am seeing a weird flake in my parallel system test PR. The issue is
that system units generated by podman generate systemd leave a container
in the Removing state behind.
As far as I can tell the problem seems to be that the cleanup process
is killed while it tries to remove the container from the db. Because
the cidfile was removed before that, the ExecStopPost=podman rm ... process
no longer had access to the cidfile and reported no error because it
runs with --ignore.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We already know the status of the healthcheck in the caller so calling
healthCheckStatus() just makes the event code sync the container state
and reread the healthcheck file for no reason.
It is much better to directly pass the status down to the event call.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
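For reference, the rewrite was roughly of the following shape; the exact gomove invocation is an assumption and not verified here:

    # assumed positional usage: gomove <old module path> <new module path>
    gomove github.com/containers/podman/v4 github.com/containers/podman/v5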
Signed-off-by: Matt Heon <mheon@redhat.com>
When the interface_name attribute in the containers.conf file is set to "device", the interface names inside containers are set to match the network_interface names of the respective network.
The change applies to macvlan and ipvlan networks only. The interface_name attribute value has no impact on any other types of networks.
If the interface name is set in the user request, then that takes precedence.
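A sketch of how this could be exercised; the containers.conf table placement, the parent option, and the host NIC name are assumptions:

    # containers.conf (placement under [network] is assumed):
    #   [network]
    #   interface_name = "device"

    # Create a macvlan network whose network_interface is the host NIC:
    podman network create -d macvlan -o parent=eth0 macnet

    # The interface inside the container should now be named after the
    # network_interface (eth0) rather than the default ethX naming:
    podman run --rm --network macnet alpine ip link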
Fixes: #21313
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
When containers are created with a named volume it can deadlock because
the create logic tried to lock all volumes in a loop. This is fine if it
only ever creates a single container at any given time. However, because
multiple containers can be created at the same time, they can cause a
deadlock between the volumes. This is because the order of the loop is
not stable; in fact it is based on the order in which the volumes were
specified on the cli.
So if you create two containers at the same time, one with
`-v vol1:/dir1 -v vol2:/dir2` and the other one with
`-v vol2:/dir2 -v vol1:/dir1`, then there is a chance of a deadlock.
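A rough reproduction sketch of the racing creates described above; the image is a placeholder and the deadlock is timing-dependent, so it may not trigger reliably:

    podman volume create vol1
    podman volume create vol2
    # Two simultaneous creates locking the same volumes in opposite order:
    podman create -v vol1:/dir1 -v vol2:/dir2 alpine true &
    podman create -v vol2:/dir2 -v vol1:/dir1 alpine true &
    wait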
Now one solution could be to order the volumes to prevent the issue but
the reason for holding the lock is dubious. The goal was to prevent the
volume from being removed in the meantime. However that could still
have happened before we acquired the lock, so it didn't protect against
that.
Both boltdb and sqlite already prevent us from adding a container with
volumes that do not exist due to their internal consistency checks.
Sqlite even uses FOREIGN KEY relationships so the schema will prevent us
from doing anything wrong.
The create code currently first checks if the volume exists and if not
creates it. I have checked that the db will guarantee that this will not
work:
Boltdb: `no volume with name test2 found in database when adding container xxx: no such volume`
Sqlite: `adding container volume test2 to database: FOREIGN KEY constraint failed`
Keep in mind that this error is normally not seen; only if the volume is
removed between the volume exists check and adding the container in the
db will these messages be seen, which is an acceptable race and a
pre-existing condition anyway.
[NO NEW TESTS NEEDED] Race condition, hard to test in CI.
Fixes #20313
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When removing a container's dependency, getting an error that the
container has already been removed (ErrNoSuchCtr and
ErrCtrRemoved) should not be fatal. We wanted the container gone,
it's gone, no need to error out.
[NO NEW TESTS NEEDED] This is a race and thus hard to test for.
Fixes #18874
Signed-off-by: Matt Heon <mheon@redhat.com>
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Linux
containers running on FreeBSD hosts continue to use /run for compatibility.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
We use the name as an alias but using the hostname also makes sense and
this is what docker does. We have to keep the short id as well for
docker compat.
While adding some tests I removed some duplicated tests that were
executed twice for nv for no reason.
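A hedged sketch of the resulting name resolution; the network, names, and image are placeholders, and name resolution needs a DNS-enabled custom network:

    podman network create testnet
    podman run -d --network testnet --name c1 --hostname web1 alpine sleep 300
    # The container name, its hostname, and the short id should all
    # resolve from another container on the same network:
    podman run --rm --network testnet alpine nslookup web1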
Fixes #17370
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Since we have sqlite there is no point in duplicating this across two db
backends. Just set it earlier when we validate the networks anyway.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
If the first container to get the pod lock is the infra container
it's going to want to remove the entire pod, which will also
remove every other container in the pod. Subsequent containers
will get the pod lock and try to access the pod, only to realize
it no longer exists - and that, actually, the container being
removed also no longer exists.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This was causing some CI flakes. I'm pretty sure that the pods
being removed already isn't a bug, but just the result of another
container in the pod removing it first - so no reason not to
ignore the errors.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We had something like 6 different boolean options (removing a
container turns out to be rather complicated, because there are a
million-odd things that want to do it), and the function
signature was getting unreasonably large. Change to a struct to
clean things up.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The infra container would try to remove the pod, despite the pod
already being in the process of being removed - oops. Add a check
to ensure we don't try and remove the pod when called by the
`podman pod rm` command.
Also, wire up noLockPod - it wasn't previously wired in, which is
concerning, and could be related?
Finally, make a few minor fixes to un-break lint.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This probably should have been in the API since the beginning,
but it's not too late to start now.
The extra information is returned (both via the REST API, and to
the CLI handler for `podman rm`) but is not yet printed - it
feels like adding it to the output could be a breaking change?
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This allows for accurate reporting of dependency removal, but the
work is still incomplete: pods can be removed, but do not report
the containers they removed as part of said removal. Will add
this in a subsequent commit.
Major note: I made ignoring no-such-container errors automatic
once it has been determined that a container did exist in the
first place. I can't think of any case where this would not be a
TOCTOU - i.e., no reason not to ignore them. The `--ignore` option
to `podman rm` should still retain meaning as it will ignore
errors from containers that didn't exist in the first place.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This is the initial stage of implementation. The current API
functions but does not report the additional containers and pods
removed. This is necessary to properly display results to the
user after `podman rm --all`.
The existing remove-dependencies code has been removed in favor
of this more native solution.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd. The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.
When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint. In that
case, before stopping the service container, Podman will lookup the exit
codes of all non-infra containers. The service will then behave
according to the following three exit-code policies:
- `none`: exit 0 and ignore containers (default)
- `any`: exit non-zero if _any_ container did
- `all`: exit non-zero if _all_ containers did
The above values can be passed via a hidden `kube play
--service-exit-code-propagation` flag which can be used by tests and
later on by Quadlet.
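A usage sketch of the hidden flag; in practice it is meant to be set by tests or by Quadlet-generated systemd units rather than typed by hand, and the YAML file is a placeholder:

    # The main PID of the kube-play service exits non-zero if any
    # container in the play exited non-zero.
    podman kube play --service-exit-code-propagation=any ./play.yaml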
In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.
Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero. The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero. Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.
Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Use a helper to handle the cleanupErr logic instead of
copy&pasting it EIGHT times.
Also modifies the returned errors to be wrapped with a context,
and changes the text of the logged errors a bit.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
It turns out the restart is _not_ a stop+start but keeps certain
resources open and is subject to some timeouts that may differ across
distributions' default settings.
[NO NEW TESTS NEEDED] as I have absolutely no idea how to reliably cause
the failure/flake/race.
Also ignore ENOENTs of the CID file when removing a container, which has
been identified as actually fixing #17607.
Fixes: #17607
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This was added as hack in commit 6b06e9b77c because the journald logs
code was not able to handle an empty journal. But since commit
767947ab88 this is no longer the case, we correctly use the sd_journal
API and know when the journal is empty.
Therefore we no longer need this hack and it should be removed because
it just adds overhead and an empty journal entry for no good reason.
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Do not save the container separately for changing the state and for removing
running exec sessions. Saving the container is expensive and avoiding
the redundant save makes `container rm` 1.2 times faster on my
workstation.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Loading container states speeds things up when listing all containers, but
it comes with a price tag for many other call paths. Hence, make
loading the state conditional to allow for keeping `podman ps` fast
without other commands regressing in performance.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Every time I look at a container-removal issue I wonder why the
container isn't locked directly here, so let's add a comment here.
I am not sure whether it would be better if callers took care of
locking but for now the comment will save the future me and probably
other readers some time.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Follow-up to 6886e80b45caae27dda81a9b44d8dd179c414580:
when "podman rm -f" is used on a container in "stopping" state, also
make sure it is terminated before removing it from the local storage.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Do not allow for removing the service container unless all associated
pods have been removed. Previously, the service container could be
removed once all pods had exited, which could lead to a number of issues.
Now, the service container is treated like an infra container and can
only be removed along with the pods.
Also make sure that a pod is unlinked from the service container once
it's being removed.
Fixes: #16964
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
When the new `events_container_create_inspect_data` option is enabled in
containers.conf, set the `ContainersInspectData` event field for each
container-create event.
The data was requested for the purpose of auditing (e.g., intrusion
detection).
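A sketch of enabling and observing the new field; the containers.conf table placement is an assumption and the Go-template field name is taken from the description above:

    # containers.conf (placement under [engine] is assumed):
    #   [engine]
    #   events_container_create_inspect_data = true

    podman events --stream=false --filter event=create \
        --format '{{.ContainersInspectData}}'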
Jira: https://issues.redhat.com/browse/RUN-1702
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Including the RawInput in the API output is meaningless.
Fixes: #16497
[NO NEW TESTS NEEDED]
Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
Remove the container/pod ID file along with the container/pod. ID files
are primarily used in the context of systemd and are neither useful nor
needed once a container/pod has ceased to exist.
Fixes: #16387
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
There is already the same check when using cgroupfs, but not when
using the systemd cgroup backend. The check is needed to avoid a
confusing error from the OCI runtime.
Closes: https://github.com/containers/podman/issues/16376
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We added the concept of image volumes in 2.2.0, to support
inspecting an image from within a container. However, this is a
strictly read-only mount, with no modification allowed.
By contrast, the new `image` volume driver creates a c/storage
container as its underlying storage, so we have a read/write
layer. This, in and of itself, is not especially interesting, but
what it will enable in the future is. If we add a new command to
allow these image volumes to be committed, we can now distribute
volumes - and changes to them - via a standard OCI image registry
(which is rather new and quite exciting).
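A hedged sketch of the resulting workflow; the option key for selecting the backing image is assumed to be `image`, and the image itself is a placeholder:

    # Create a read/write volume backed by a c/storage container:
    podman volume create --driver image --opt image=quay.io/fedora/fedora:latest imgvol
    # Changes land in the volume's read/write layer:
    podman run --rm -v imgvol:/mnt alpine touch /mnt/hello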
Future work in this area:
- Add support for `podman volume push` (commit volume changes and
push resulting image to OCI registry).
- Add support for `podman volume pull` (currently, we require
that the image a volume is created from be already pulled; it
would be simpler if we had a dedicated command that did the
pull and made a volume from it)
- Add support for scratch images (make an empty image on demand
to use as the base of the volume)
- Add UOR support to `podman volume push` and
`podman volume pull` to enable both with non-image volume
drivers
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Originally, during pod removal, we locked every container in the
pod at once, did a number of validity checks to ensure everything
was safe, and then removed all the containers in the pod.
A deadlock was recently discovered with this approach. In brief,
we cannot lock the entire pod (or much more than a single
container at a time) without causing a deadlock. As such, we
converted to an approach where we just looped over each container
in the pod, removing them individually. Unfortunately, this
removed a lot of the validity checking of the earlier approach,
allowing for a lot of unintended bad things. Infra containers
could be removed while containers in the pod still depended on
them, for example.
There's no easy way to do validity checks while in a simple loop,
so I implemented a version of our graph-traversal logic that
currently handles pod start. This version acts in the reverse
order of startup: startup starts from containers which depend on
nothing and moves outwards, while removal acts on containers which
have nothing depending on them and moves inwards. By doing graph
traversal, we can guarantee that nothing is removed while
something that depends on it still exists - so the infra
container should be the last thing in a pod that is removed, for
example.
In the (unlikely) case that a graph of the pod's containers
cannot be built (most likely impossible without database editing)
the old method of pod removal has been retained to ensure that
even misbehaving pods can be forcibly evicted from the state.
I'm fairly confident that this resolves the problem, but there
are a lot of assumptions around dependency structure built into
the original pod removal code and I am not 100% sure I have
captured all of them.
Fixes #15526
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In view of https://github.com/containers/storage/pull/1337, do this:
for f in $(git grep -l stringid.GenerateNonCryptoID | grep -v '^vendor/'); do
    sed -i 's/stringid.GenerateNonCryptoID/stringid.GenerateRandomID/g' $f;
done
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Podman adds an Error: to every error message. So starting an error
message with "error" ends up being reported to the user as
Error: error ...
This patch removes the stutter.
Also ioutil.ReadFile errors report the Path, so wrapping the err message
with the path causes a stutter.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
In case of a hard OS shutdown, containers may have a "removing"
state after a reboot, and an attempt to remove Pods with such
containers is unsuccessful:
error freeing lock for container ...: no such file or directory
[NO NEW TESTS NEEDED]
Signed-off-by: Mikhail Khachayants <tyler92@inbox.ru>
This mount has never been standard on FreeBSD, which prefers to use /tmp
or /var/tmp, optionally with tmpfs, to ensure data is lost on a reboot.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
When a kube yaml has a volume set as empty dir, podman
will create an anonymous volume with the empty dir name and
attach it to the containers running in the pod. When the pod
is removed, the empty dir volume created is also removed.
Add tests and docs for this as well.
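A minimal sketch of such a kube yaml using the standard Kubernetes emptyDir syntax (names and the image are placeholders), saved as pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: demo
    spec:
      containers:
      - name: app
        image: quay.io/libpod/alpine:latest
        command: ["sleep", "300"]
        volumeMounts:
        - name: scratch
          mountPath: /scratch
      volumes:
      - name: scratch
        emptyDir: {}

Playing and tearing it down then shows the volume lifecycle described above:

    podman kube play pod.yaml
    podman volume ls            # the anonymous empty-dir volume appears here
    podman kube down pod.yaml   # removing the pod also removes that volume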
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>