Currently podman run -d can exit 0 if we send SIGTERM during startup
even though the container was never started. That just doesn't make any
sense and is horribly confusing for an external job manager like systemd.
The original motivation was to exit 0 for the podman.service in commit
ca7376bb11. That does make sense but it should only do so for the
service and only if the server did indeed gracefully shutdown.
So we rework how the exit logic works: do not let the handler perform
the exit. Instead the shutdown package does the exit after all handlers
have run, which solves the issue of ordering. We default to exit code
1 like we did before and allow the service exit handler to overwrite the
exit code with 0 in case of a graceful shutdown.
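
A minimal sketch of the reworked flow, not the actual shutdown package: the `handlers` type and its methods are hypothetical names chosen for illustration. Handlers never exit themselves; the exit code defaults to 1 and only a handler that knows the shutdown was graceful overrides it.

```go
package main

import (
	"fmt"
	"sync"
)

// handlers is a hypothetical stand-in for the shutdown package's state.
type handlers struct {
	mu       sync.Mutex
	funcs    []func()
	exitCode int
}

// newHandlers defaults to exit code 1, so an interrupted command never
// reports success by accident.
func newHandlers() *handlers {
	return &handlers{exitCode: 1}
}

// Register adds a handler that runs when a termination signal arrives.
func (h *handlers) Register(fn func()) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.funcs = append(h.funcs, fn)
}

// SetExitCode lets a handler override the default exit code.
func (h *handlers) SetExitCode(code int) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.exitCode = code
}

// runAll runs every handler first and only then reports the exit code.
// A real signal loop would pass the result to os.Exit.
func (h *handlers) runAll() int {
	h.mu.Lock()
	funcs := append([]func(){}, h.funcs...)
	h.mu.Unlock()

	for _, fn := range funcs {
		fn()
	}

	h.mu.Lock()
	defer h.mu.Unlock()
	return h.exitCode
}

func main() {
	run := newHandlers()
	run.Register(func() { fmt.Println("cleaning up container state") })
	fmt.Println("podman run exit code:", run.runAll())

	service := newHandlers()
	service.Register(func() {
		// Assumed here: the server did shut down gracefully.
		service.SetExitCode(0)
	})
	fmt.Println("podman.service exit code:", service.runAll())
}
```

Because the exit happens in one place after the loop, handler ordering no longer decides which exit code wins.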
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When we are killed during netns setup it will leak the netns path as it
was not committed in the db. This is rather common if you run systemctl
stop on a podman systemd unit. Of course we cannot protect against
SIGKILL but in the systemd case we get SIGTERM and we really should not
exit in a critical section like this.
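
One way to picture the protection, as a hedged sketch: the `inhibitor` type and `onSignal` helper below are illustrative names, not podman's API. The critical section holds a read lock, and the signal path must take the write lock before it may exit, so SIGTERM is delayed until the netns path is committed.

```go
package main

import (
	"fmt"
	"sync"
)

// inhibitor is a hypothetical helper that delays exit-on-signal while a
// critical section is running.
type inhibitor struct {
	mu sync.RWMutex
}

// Inhibit marks the start of a critical section.
func (i *inhibitor) Inhibit() { i.mu.RLock() }

// Uninhibit marks the end of a critical section.
func (i *inhibitor) Uninhibit() { i.mu.RUnlock() }

// onSignal waits for all critical sections to finish, then runs exit.
func (i *inhibitor) onSignal(exit func()) {
	i.mu.Lock()
	defer i.mu.Unlock()
	exit()
}

// simulate delivers a "signal" in the middle of netns setup and records
// the order in which things happen.
func simulate() []string {
	var inh inhibitor
	var events []string
	done := make(chan struct{})

	inh.Inhibit()
	go func() {
		defer close(done)
		// Stands in for the SIGTERM handler; it blocks until Uninhibit.
		inh.onSignal(func() { events = append(events, "exit") })
	}()
	events = append(events, "netns created")
	events = append(events, "netns path committed to db")
	inh.Uninhibit()

	<-done
	return events
}

func main() {
	for _, e := range simulate() {
		fmt.Println(e)
	}
}
```

As the message says, this cannot help against SIGKILL, which no process can defer.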
Fixes #24044
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Document the special *host-gateway* flag introduced with #19152, mention the special `host.containers.internal` and `host.docker.internal` hostnames, and clarify the option's usage in general.
Signed-off-by: Daniel Rudolf <github.com@daniel-rudolf.de>
Yield to reality: if $XDG_RUNTIME_DIR is unset, assume a
reasonable default (rootless only). This clears up a
common failure in Fedora gating tests, and will probably
prevent future time wasters.
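
A hedged sketch of the fallback logic. The commit does not name the default; `/run/user/<uid>` is an assumption based on the usual systemd convention, and `runtimeDir` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// runtimeDir returns the runtime directory to use. envValue is the current
// value of XDG_RUNTIME_DIR. The /run/user/<uid> default is an assumption
// and is applied to rootless users only.
func runtimeDir(envValue string, uid int) (string, bool) {
	if envValue != "" {
		return envValue, true
	}
	if uid == 0 {
		// Root: no default is assumed.
		return "", false
	}
	return "/run/user/" + strconv.Itoa(uid), true
}

func main() {
	dir, ok := runtimeDir(os.Getenv("XDG_RUNTIME_DIR"), os.Getuid())
	fmt.Println(dir, ok)
}
```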
Signed-off-by: Ed Santiago <santiago@redhat.com>
Modifies the "Remove machine" test to verify the system connections are
handled properly on removal.
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Primary motivator: 'curl -v' format changes in f42
Drive-bys:
* 127.0.0.1, not localhost
* use wait_for_port, not sleep
* show curl commands and their output, to ease debugging failures
* better failure assertions
Signed-off-by: Ed Santiago <santiago@redhat.com>
These flags can affect the output of the HealthCheck log. Currently, when a container is configured with HealthCheck, the output from the HealthCheck command is only logged to the container status file, which is accessible via `podman inspect`.
It is also limited to the last five executions and the first 500 characters per execution.
This makes debugging past problems very difficult, since the only information available about the failure of the HealthCheck command is the generic `healthcheck service failed` record.
- The `--health-log-destination` flag sets the destination of the HealthCheck log.
  - `none`: (default behavior) `HealthCheckResults` are stored in overlay containers. (For example: `$runroot/healthcheck.log`)
  - `directory`: creates a log file named `<container-ID>-healthcheck.log` with JSON `HealthCheckResults` in the specified directory.
  - `events_logger`: The log is written with the logging mechanism set by events_logger. It is also saved to a default directory, for performance on a system with a large number of logs.
- The `--health-max-log-count` flag sets the maximum number of attempts in the HealthCheck log file.
  - A value of `0` indicates an infinite number of attempts in the log file.
  - The default value is `5` attempts in the log file.
- The `--health-max-log-size` flag sets the maximum length of the log stored.
  - A value of `0` indicates an infinite log length.
  - The default value is `500` log characters.
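
The two limits can be sketched as follows. This is an illustration of the described semantics only: `result` and `appendResult` are hypothetical names, and counting the size limit in characters (runes) is an assumption.

```go
package main

import "fmt"

// result is a hypothetical, reduced stand-in for one health check log entry.
type result struct {
	Output string
}

// appendResult adds r to log, truncating its output to maxSize characters
// and keeping only the newest maxCount entries. A limit of 0 means
// "unlimited", matching the flag descriptions above.
func appendResult(log []result, r result, maxCount, maxSize int) []result {
	if maxSize > 0 {
		if runes := []rune(r.Output); len(runes) > maxSize {
			r.Output = string(runes[:maxSize])
		}
	}
	log = append(log, r)
	if maxCount > 0 && len(log) > maxCount {
		log = log[len(log)-maxCount:]
	}
	return log
}

func main() {
	var log []result
	for i := 1; i <= 7; i++ {
		log = appendResult(log, result{Output: fmt.Sprintf("attempt %d failed", i)}, 5, 9)
	}
	for _, r := range log {
		fmt.Println(r.Output)
	}
}
```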
Add --health-max-log-count flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Add --health-max-log-size flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Add --health-log-destination flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
The various pasta port forwarding tests run a socat server inside a
container, then connect to it from a socat client on the host. Currently
we have the server bind to the same specific address within the container
as we connect to on the host.
That's not quite what we want. For "tap" tests where the traffic goes over
pasta's L2 link to the container it's fine, though unnecessary. For
"loopback" tests where traffic is forwarded by pasta at the L4 socket
level, however, it's not quite right. In this case the address used is
either 127.0.0.1 or ::. That's correct and as needed for the host side
address we're connecting to. However on the container side, this only
works because of an odd and arguably undesirable behaviour of pasta: we use
the fact that we have an L4 socket within the container to make such
"spliced" L4 connections appear as if they come from loopback within the
container. A container will generally expect its loopback address to be
only accessible from within the container, and this odd behaviour may be
changed in pasta in future.
In any case, the binding of the container side server is unnecessary, so
simply remove it.
Link: https://github.com/containers/podman/issues/24045
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Modify `RemoveConnections` to verify the new default system connection's
rootful state matches the rootful-ness of the podman machine it is associated
with.
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Takes the code inside the closure in the function `RemoveConnections`
and makes it a separate function to increase readability.
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Moves the `DefaultMachineName` constant out of `pkg/machine` and into
`pkg/machine/define`.
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Mostly just switch to safename. Rewrite setup() to guarantee
unique service file names, atomically created.
* IMPORTANT NOTE: enabling parallelization on these tests
triggers #24010 ("fragment file" flake), but only on my
f40 laptop. I have never seen the flake in Cirrus despite
many many runs in #23275. I am submitting this for review
and merging because even though _something_ is broken,
this breakage is unlikely to affect our CI.
Signed-off-by: Ed Santiago <santiago@redhat.com>
In order to get better debug data for cleanup flakes. The argv is
printed with 0 bytes so replace them with spaces to make the log
readable for humans.
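
The replacement itself is small. A sketch, assuming the raw argv arrives as a NUL-separated byte slice (as in `/proc/<pid>/cmdline`); `readableArgv` is a hypothetical helper, not the code from this commit.

```go
package main

import (
	"bytes"
	"fmt"
)

// readableArgv turns a NUL-separated argv buffer into a space-separated
// string. Trailing NUL bytes are dropped first so the result has no
// trailing spaces.
func readableArgv(raw []byte) string {
	raw = bytes.TrimRight(raw, "\x00")
	return string(bytes.ReplaceAll(raw, []byte{0}, []byte{' '}))
}

func main() {
	fmt.Println(readableArgv([]byte("podman\x00rm\x00-f\x00ctr\x00")))
}
```

Note that arguments containing spaces become ambiguous after this conversion, which is acceptable for a debug log.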
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add a new program based on bpftrace[1] to trace all podman processes
with arguments and exit code/signals. Additionally this captures stderr
from all podman container cleanup processes spawned by conmon which
otherwise go to /dev/null and are never seen in any CI logs.
Hopefully this allows us to debug the strange network cleanup errors seen
in CI. My plan is to add this to the cirrus setup and upload the logs so
we can check them when the flakes happen.
[1] https://github.com/bpftrace/bpftrace
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Any test that uses --events-backend=file cannot be run in parallel
due to #23750. This seems to be a hard block, unfixable.
All other tests, enable ci:parallel.
And, bring in timing fixes #23600. Thanks, @Honny1!
Signed-off-by: Ed Santiago <santiago@redhat.com>
The format test flakes when quay is down, because we've
been doing 'podman search $IMAGE', which is a quay image.
Solution: check if local registry is running, and use it.
We don't need a real image.
Signed-off-by: Ed Santiago <santiago@redhat.com>
(where possible. Not all tests are parallelizable).
And, refactor two complicated tests into one. This one
is hard to review, sorry.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The `bin/docker` command should also honor the presence of `$XDG_CONFIG_HOME/containers/nodocker` when considering whether it should print the warning message.
Signed-off-by: Nick Dimiduk <ndimiduk@gmail.com>
Use os.ReadDir recursively instead of filepath.WalkDir
Use map instead of list to easily find looped Symlinks
Update existing tests and add a more elaborate one
Update the man page
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
The netns dir has special logic to bind mount itself and make itself
shared. This code here didn't, which led to a catastrophic bug during
netns unmounting: we were unable to unmount the netns because the mount
got duplicated and had the wrong parent mount. This caused us to loop
forever trying to remove the file.
Fixes https://issues.redhat.com/browse/RHEL-59620
Fixes #23685
Signed-off-by: Paul Holzinger <pholzing@redhat.com>