In preparation for maybe some day being able to run build tests
in parallel.
SUPER IMPORTANT NOTE! BUILD TESTS CANNOT BE PARALLELIZED YET!
buildah, when run in parallel, barfs with:
race: parallel builds: copying...committing...creating... layer not known
Until this is fixed, podman-build can never be run in parallel.
See https://github.com/containers/buildah/issues/5674
This PR is simply cleaning things up so, if/when that day comes,
the ensuing parallelize PR will be short & sweet.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The recent fedora kernel 6.11.4 has a problem with ipv6 networks [1].
This is not a podman bug at all but rather a kernel regression. I can
reproduce the issue easily by running this test.
Because many users were hit by this, add the test to the distro-level
gating, which runs in the Fedora openQA framework. That way we should
hopefully catch a bad kernel like this in the future and prevent it from
going into stable.
[1] https://github.com/containers/podman/issues/24374
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Quadlet tests and some systemd tests leak unit files, as
reported by 'systemctl list-units --failed'. Clean them up.
Signed-off-by: Ed Santiago <santiago@redhat.com>
We reset the failed unit so as not to leak it, but we did so before
stopping it. This is wrong: if the stop fails, we again have a failed
unit. The correct order is to reset after the stop, because once the
unit is stopped it cannot create new errors.
I found this using the following reproducer, and this change is enough
to fix it:
```
while :; do
    cid=$(podman run -d --name foo --health-cmd /home/podman/healthcheck \
        --health-startup-cmd /home/podman/healthcheck \
        quay.io/libpod/testimage:20241011 /home/podman/pause)
    podman healthcheck run "$cid"
    podman rm -fa
    sleep 2
    systemctl --user list-units --failed | grep "$cid" && break
done
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The startup service is special because we have to transition from
startup to the normal unit. In order to do so we kill ourselves (as we
run as part of the service). This means we always exited 1, which
causes systemd to keep the unit in a failed state and not remove the
transient unit unless "reset-failed" is called. As there is no process
around to do that, make us exit(0) instead, which makes more sense.
Of course we could try to reset-failed the unit later, but the code for
that seems more complicated than this change.
Add a new test from Ed that ensures we check for all healthcheck units
not just the timer to avoid leaks. I slightly modified it to provide a
better error on leaks.
Fixes: 0bbef4b830 ("libpod: rework shutdown handler flow")
Fixes: #24351
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Equivalent to print() + system(). Shows individual commands
being run, which may help a developer understand and replicate
actions if they fail.
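A minimal shell sketch of the same idea, for illustration only. The name show_and_run is hypothetical; the helper's real name and implementation language are not stated here.

```shell
# show_and_run is a hypothetical name, not the helper added by this commit.
show_and_run() {
    # Print the command first, so a developer can copy and rerun it by hand.
    printf '+ %s\n' "$*"
    # Then run it; the command's exit status becomes the function's status.
    "$@"
}

show_and_run echo hello
```

Because the command's exit status is passed through, callers can still write `show_and_run some-command || handle-failure`.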
Signed-off-by: Ed Santiago <santiago@redhat.com>
Initial purpose of treadmill PR was to run buildah-bud tests
early, and not run anything else if they fail. This was to
catch vendoring problems and not be distracted by flakes.
This was done by inspecting and massaging .cirrus.yml.
As of #21639 this code was a silent NOP because the entire
CI tree was overhauled. Here we make that work again.
Also, in #20947 I enhanced this script to run rootless
bud tests but neglected to update the comments. Do so now.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Clarify, expand, fix a typo. These are the instructions
shown when the **patching** step fails, typically when
buildah's helpers.bash is changed in a way that conflicts
with our make-it-work-in-podman patches.
Signed-off-by: Ed Santiago <santiago@redhat.com>
This fixes two problems. First, if a port is both published and exposed
it should not be shown twice; it is enough to show the published one.
Second, if there is a huge range, the ports were not grouped, making the
output basically unreadable. Now we group exposed ports like we do
with the normal published ports.
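The grouping idea can be sketched as follows. This is an illustration, not podman's code: group_ports and format_range are hypothetical helpers, the input is assumed to be already sorted and de-duplicated, and the output format is made up.

```shell
# Print a single port as "N" and a run of ports as "START-END".
format_range() {
    if [ "$1" -eq "$2" ]; then printf '%s' "$1"; else printf '%s-%s' "$1" "$2"; fi
}

# Collapse a sorted list of port numbers into comma-separated ranges.
# The body runs in a subshell "( ... )" so its variables stay local.
group_ports() (
    start="" prev="" out=""
    for port in "$@"; do
        if [ -z "$start" ]; then
            start=$port
        elif [ "$port" -ne $((prev + 1)) ]; then
            # Gap found: close the current range and start a new one.
            out="${out}$(format_range "$start" "$prev"), "
            start=$port
        fi
        prev=$port
    done
    if [ -n "$start" ]; then
        out="${out}$(format_range "$start" "$prev")"
    fi
    printf '%s\n' "$out"
)

grouped=$(group_ports 80 8000 8001 8002 8003 9000)
echo "$grouped"
```

Consecutive ports collapse into one entry, so a range of thousands of exposed ports takes a single field instead of thousands.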
Fixes #23317
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This can never include a comma in the protocol, so it just complicated
things for no reason. We never needed this, and commit edc3dc5e11 already
ensures this cannot happen.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As an internal consistency check, the pasta tests check for duplicated test
cases by grepping a log file for a parsed test id. However, it uses
grep -F for this, which performs a substring match rather than an exact
match. There are some tests which generate an id which is a
substring of the id for other tests, so when test order is randomised, this
can cause a spurious failure. This can happen in practice when running
the test in parallel with very high concurrency (e.g. -j 100).
Fix this by adding the -x option to grep, which only checks for full line
exact matches.
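The difference can be seen with two made-up test ids, one a prefix of the other. The ids and the log file below are hypothetical, not taken from the pasta tests.

```shell
# Hypothetical log of already-seen test ids; the real ids differ.
log=$(mktemp)
printf '%s\n' 'tcp-port-forward-ipv4-loopback' > "$log"

id='tcp-port-forward-ipv4'

# -F alone: fixed-string *substring* match, so the shorter id is
# falsely reported as already present.
if grep -qF -- "$id" "$log"; then substring_match=yes; else substring_match=no; fi

# -F plus -x: the whole line must equal the id, so no false duplicate.
if grep -qFx -- "$id" "$log"; then exact_match=yes; else exact_match=no; fi

echo "substring=$substring_match exact=$exact_match"
rm -f "$log"
```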
Fixes: https://github.com/containers/podman/issues/24342
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The additional image store feature assumes that images / layers
in the additional store never go away, but we do remove that store
after this test. Try to repair the store.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Historically, non-schema1 images had a deterministic image ID == config digest.
With zstd:chunked, we don't want to deduplicate layers pulled by consuming the
full tarball and layers partially pulled based on TOC, because we can't cheaply
ensure equivalence; so, image IDs for images where a TOC was used differ.
To accommodate that, compare images using their config digests, not using image IDs.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
When looking up the current-store image ID, do that
from the same output where we verify that the ID is from the
current store, instead of listing images twice.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The test got the stores' RW status backwards.
Before zstd:chunked, both image IDs should be the same, so this used
to make no difference.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will appease the higher-level quota logic. Basically, to
find a free quota ID to prevent reuse, we will iterate through
the contents of the directory and check the quota IDs of all
subdirectories, then use the first free ID found that is larger
than the base ID (the one set on the base directory). Problem:
our volumes use a two-tier directory structure, where the volume
has an outer directory (with the name of the actual volume) and
an inner directory (always named _data). We were only setting the
quota on _data, meaning the outer directory did not have an ID,
and the ID-choosing logic thus never detected that any IDs had
been allocated and always chose the same ID.
Setting the ID on the outer directory with PROJINHERIT set makes
the ID allocation logic work properly, and guarantees children
inherit the ID - so _data and all contents of the volume get the
ID as we'd expect.
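The allocation problem can be simulated with a small sketch. This is not the real quota code: next_free_id is a hypothetical helper, the IDs are passed as plain arguments instead of being read from the filesystem, and an ID of 0 stands for "no ID set".

```shell
# next_free_id BASE ID...: print the first ID above BASE not in the list.
# The body runs in a subshell "( ... )" so its variables stay local.
next_free_id() (
    base=$1
    shift
    candidate=$((base + 1))
    while :; do
        used=no
        for id in "$@"; do
            if [ "$id" -eq "$candidate" ]; then used=yes; fi
        done
        if [ "$used" = no ]; then break; fi
        candidate=$((candidate + 1))
    done
    echo "$candidate"
)

# Before the fix: the outer volume directories carry no ID (0), so the
# scan sees nothing in use and every volume gets the same ID.
before=$(next_free_id 100 0 0)

# After the fix: the outer directories carry the allocated IDs, so the
# scan skips them and returns a fresh one.
after=$(next_free_id 100 101 102)

echo "before=$before after=$after"
```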
No tests as we don't have a filesystem in our CI that supports
XFS quotas (setting it on / needs kernel flags added).
Fixes https://issues.redhat.com/browse/RHEL-18038
Signed-off-by: Matt Heon <mheon@redhat.com>