We reset the failed unit so that it does not leak. However, we did so
before stopping it, which is wrong: when the stop fails we are again left
with a failed unit. The correct order is to reset after the stop, because
once the unit is stopped it cannot create new errors.
I found this using the following reproducer, and this change is enough to
fix it:
```
while :; do
cid=$(podman run -d --name foo --health-cmd /home/podman/healthcheck \
--health-startup-cmd /home/podman/healthcheck \
quay.io/libpod/testimage:20241011 /home/podman/pause)
podman healthcheck run $cid
podman rm -fa
sleep 2
systemctl --user list-units --failed | grep $cid && break
done
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The startup service is special because we have to transition from the
startup unit to the normal unit. In order to do so we kill ourselves (as
we are run as part of the service). This means we always exited 1, which
causes systemd to keep the unit in the failed state and not remove the
transient unit unless "reset-failed" is called. As there is no process
around to do that, we cannot really do this, so make us exit(0) instead,
which makes more sense.
Of course we could try to reset-failed the unit later, but the code for
that seems more complicated.
Add a new test from Ed that ensures we check for all healthcheck units
not just the timer to avoid leaks. I slightly modified it to provide a
better error on leaks.
Fixes: 0bbef4b830 ("libpod: rework shutdown handler flow")
Fixes: #24351
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Equivalent to print() + system(). Shows individual commands
being run, which may help a developer understand and replicate
actions if they fail.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Initial purpose of treadmill PR was to run buildah-bud tests
early, and not run anything else if they fail. This was to
catch vendoring problems and not be distracted by flakes.
This was done by inspecting and massaging .cirrus.yml.
As of #21639 this code was a silent NOP because the entire
CI tree was overhauled. Here we make that work again.
Also, in #20947 I enhanced this script to run rootless
bud tests but neglected to update the comments. Do so now.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Clarify, expand, fix a typo. These are the instructions
shown when the **patching** step fails, typically when
buildah's helpers.bash is changed in a way that conflicts
with our make-it-work-in-podman patches.
Signed-off-by: Ed Santiago <santiago@redhat.com>
This fixes two problems. First, if a port is both published and exposed it
should not be shown twice; it is enough to show the published one.
Second, if there is a huge range, the exposed ports were not grouped, making
the output basically unreadable. Now we group exposed ports like we do
with the normal published ports.
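The grouping idea can be sketched in shell as follows. This is only an illustration of collapsing consecutive ports into ranges, not podman's actual code; the helper name `group_ports` and the comma-separated output format are assumptions made for the example.

```shell
# group_ports PORT...: print the given ports as comma-separated ranges.
# Hypothetical helper for illustration only; not taken from podman.
group_ports() {
    # Sort numerically and drop duplicates, so a port listed twice
    # (e.g. once as published and once as exposed) appears only once.
    printf '%s\n' "$@" | sort -n -u | awk '
        # Append the current run [start, prev] to the output string.
        function emit() {
            if (start == "") return
            out = out (out == "" ? "" : ",") (start == prev ? start : start "-" prev)
        }
        # The port directly follows the previous one: extend the run.
        NR > 1 && $1 == prev + 1 { prev = $1; next }
        # Otherwise close the current run and start a new one.
        { emit(); start = $1; prev = $1 }
        END { emit(); print out }
    '
}
```

For example, `group_ports 8080 8081 8082 443` prints `443,8080-8082`.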
Fixes #23317
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This can never include a comma in the protocol, so it just complicated
things for no reason. We never needed this, and commit edc3dc5e11 already
ensures this cannot happen.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As an internal consistency check, the pasta tests check for duplicated test
cases by grepping a log file for a parsed test id. However, it uses
grep -F for this, which performs a substring match rather than an exact
match. There are some tests which generate an id which is a
substring of the id for other tests, so when test order is randomised, this
can cause a spurious failure. This can happen in practice when running
the test in parallel with very high concurrency (e.g. -j 100).
Fix this by adding the -x option to grep, which only checks for full line
exact matches.
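A minimal sketch of the difference, using made-up test ids (the ids and the temporary log file are assumptions for illustration, not the real pasta test names):

```shell
# Hypothetical log of already-seen test ids, one per line.
log=$(mktemp)
printf '%s\n' 'tcp-ipv4-port-range' > "$log"

# A new id that happens to be a substring of the logged one.
id='tcp-ipv4-port'

# grep -F: fixed-string substring match, reports a duplicate that is not one.
if grep -qF -- "$id" "$log"; then substring_match=yes; else substring_match=no; fi

# grep -Fx: the fixed string must match the whole line.
if grep -qFx -- "$id" "$log"; then exact_match=yes; else exact_match=no; fi

echo "substring=$substring_match exact=$exact_match"
rm -f "$log"
```

This prints `substring=yes exact=no`: only the `-x` variant correctly treats the new id as unseen.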
Fixes: https://github.com/containers/podman/issues/24342
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The additional image store feature assumes that images / layers
in the additional store never go away, while we do remove it after
this test. Try to repair the store.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Historically, non-schema1 images had a deterministic image ID == config digest.
With zstd:chunked, we don't want to deduplicate layers pulled by consuming the
full tarball and layers partially pulled based on TOC, because we can't cheaply
ensure equivalence; so, image IDs for images where a TOC was used differ.
To accommodate that, compare images using their config digests, not using image IDs.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
When looking up the current-store image ID, do that
from the same output where we verify that the ID is from the
current store, instead of listing images twice.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The test got the stores' RW status backwards.
Before zstd:chunked, both image IDs should be the same, so this used
to make no difference.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will appease the higher-level quota logic. Basically, to
find a free quota ID to prevent reuse, we will iterate through
the contents of the directory and check the quota IDs of all
subdirectories, then use the first free ID found that is larger
than the base ID (the one set on the base directory). Problem:
our volumes use a two-tier directory structure, where the volume
has an outer directory (with the name of the actual volume) and
an inner directory (always named _data). We were only setting the
quota on _data, meaning the outer directory did not have an ID,
and the ID-choosing logic thus never detected that any IDs had
been allocated and always chose the same ID.
Setting the ID on the outer directory with PROJINHERIT set makes
the ID allocation logic work properly, and guarantees children
inherit the ID - so _data and all contents of the volume get the
ID as we'd expect.
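The allocation problem can be modelled with a small sketch. This is a simplified stand-in for the ID-choosing logic described above, not the real implementation: the function name `next_quota_id` is hypothetical, and the used IDs are passed as arguments instead of being read from the subdirectories.

```shell
# next_quota_id BASE [USED_ID...]: print the smallest ID above BASE that
# is not among the used IDs. Hypothetical model for illustration only.
next_quota_id() {
    base=$1
    shift
    candidate=$((base + 1))
    # Walk the used IDs in ascending order; each time the candidate is
    # already taken, move on to the next number.
    for id in $(printf '%s\n' "$@" | sort -n -u); do
        if [ "$id" -eq "$candidate" ]; then
            candidate=$((candidate + 1))
        fi
    done
    echo "$candidate"
}
```

With no IDs visible on the scanned directories (the old behaviour, where only _data carried the ID), `next_quota_id 100` returns 101 on every call, so every volume gets the same ID. Once the outer directories carry their IDs, `next_quota_id 100 101 102` returns 103.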
No tests as we don't have a filesystem in our CI that supports
XFS quotas (setting it on / needs kernel flags added).
Fixes https://issues.redhat.com/browse/RHEL-18038
Signed-off-by: Matt Heon <mheon@redhat.com>
When the current soft limit is higher than the new value, ulimit fails
to set the hard limit (tested on Rawhide):
[root@rawhide ~]# ulimit -n -H 1048575
-bash: ulimit: open files: cannot modify limit: Invalid argument
To avoid the problem, also set the soft limit:
[root@rawhide ~]# ulimit -n -H
12345678
[root@rawhide ~]# ulimit -n -H 1048575
-bash: ulimit: open files: cannot modify limit: Invalid argument
[root@rawhide ~]# ulimit -n -SH 1048575
[root@rawhide ~]# ulimit -n -H
1048575
Commit 71d5ee0e04eb61802b7c59166d88eac19c563ff7 introduced the issue.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This hasn't been touched in 7 years and Vagrant is no longer
a default entrypoint for many people. We have other things
documented in CONTRIBUTING.
Signed-off-by: Colin Walters <walters@verbum.org>
`CRRuntimeSupportsPodCheckpointRestore()` is used to check if the current
container runtime (e.g., runc or crun) can restore a container into an
existing Pod. It does this by parsing the runtime's output message to check if the
`--lsm-mount-context` option is supported. This option was recently
added to crun [1], however, crun and runc have slightly different output
messages:
```
$ crun restore --lsm-mount-context
restore: option '--lsm-mount-context' requires an argument
Try `restore --help' or `restore --usage' for more information.
```
```
$ runc restore --lsm-mount-context
ERRO[0000] flag needs an argument: -lsm-mount-context
```
This patch updates the function to support both runtimes.
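As a rough illustration of accepting both wordings, here is a shell sketch. It is not the Go code in this patch; the helper name `supports_lsm_mount_context` is hypothetical, and the two patterns are simply taken from the messages quoted above.

```shell
# supports_lsm_mount_context OUTPUT: succeed if OUTPUT shows that the
# runtime recognised --lsm-mount-context and only complained about the
# missing argument. Hypothetical helper for illustration only.
supports_lsm_mount_context() {
    case "$1" in
        # runc wording
        *"flag needs an argument"*) return 0 ;;
        # crun wording
        *"option '--lsm-mount-context' requires an argument"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

The function would be fed the captured output of running the runtime's `restore --lsm-mount-context` without an argument; any other message is treated as "not supported".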
[1] https://github.com/containers/crun/pull/1578
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Update to latest main to see if everything passes in preparation for the
first 5.3 release candidate.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>