23615 Commits

Author SHA1 Message Date
6d4006b123 Update module github.com/docker/docker to v27.3.1+incompatible
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-09-20 19:56:39 +00:00
2f44b166e7 Merge pull request #24024 from Luap99/netns-dir
libpod: setupNetNS() correctly mount netns
2024-09-20 14:41:59 +00:00
792796183f libpod: setupNetNS() correctly mount netns
The netns dir has special logic to bind mount itself and make itself
shared. This code here didn't, which led to a catastrophic bug during
netns unmounting: we were unable to unmount the netns because the mount got
duplicated and had the wrong parent mount. This caused us to loop forever
trying to remove the file.

Fixes https://issues.redhat.com/browse/RHEL-59620
Fixes #23685

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-20 15:19:22 +02:00
f6bda786ed vendor latest c/common
To include the pkg/netns changes.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-20 15:18:35 +02:00
f7be7a365a Merge pull request #24019 from edsantiago/quadlet-rootfs-fix
CI: Quadlet rootfs test: use container image as rootfs
2024-09-20 10:55:12 +00:00
e38f86c024 Merge pull request #24020 from containers/renovate/github.com-docker-docker-27.x
Update module github.com/docker/docker to v27.3.0+incompatible
2024-09-20 10:22:14 +00:00
597773464c Update module github.com/docker/docker to v27.3.0+incompatible
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-09-19 22:52:25 +00:00
a08ae98161 CI: Quadlet rootfs test: use container image as rootfs
Test was written to use / (root). This is not parallel-safe.

Fixes: #23909

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-19 15:19:14 -06:00
217ecac740 Merge pull request #23996 from edsantiago/safename-200
CI: make 200-pod parallel-safe
2024-09-19 14:27:38 +00:00
80776fa5bb Merge pull request #24007 from edsantiago/systest-cleanup
CI: system tests: various small cleanups
2024-09-19 14:05:36 +00:00
eb18c41835 Merge pull request #24002 from edsantiago/systest-registry
CI: system test registry: use --net=host
2024-09-19 12:48:35 +00:00
9c51eead06 CI: system test registry: use --net=host
This removes the need for a tricky/fragile namespace workaround.

Huge thanks to Paul for discovering documentation on the
Registry container, and how to override config.yml settings:

   https://distribution.github.io/distribution/about/configuration/#override-specific-configuration-options

Drive-by: consistentize quotes in -eVAR="value". Minor, but
makes them all easier to read with emacs/vi syntax highlighting.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-19 05:17:15 -06:00
327c26af2f Merge pull request #24008 from stilwelb/fix-typo
Fix typo in error message
2024-09-18 20:02:59 +00:00
bb235fb9cc Merge pull request #24006 from Luap99/vendor-common
vendor latest c/common
2024-09-18 19:46:28 +00:00
e3af5a38d3 CI: rm system test: bump grace period
The "rm on stopping containers" test is flaking under high load,
probably because I bumped up two timeouts in the healthcheck
container that it relies on. Bump up this test's timeout as well.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:35:00 -06:00
3396dabdf3 CI: system tests: minor documentation on parallel
Only in 000-TEMPLATE. I know I need to write more thorough
documentation. I choose to defer that.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:32:36 -06:00
31cdf1197b fix typo in error message
Fixes: containers/podman#24001

Signed-off-by: Brad Stilwell <stilwelb@us.ibm.com>
2024-09-18 13:24:34 -04:00
1d5c8ac18e CI: system tests: always create pause image
...not just when running parallel Bats, because Bats
does not provide any way to know if we're parallel.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:23:12 -06:00
5e5c68ffbe CI: quadlet system test: be more forgiving
...of high system load (such as when running parallel tests).
Allow time for services to reach desired state, by retrying
a few times in a loop.
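
The retry idea can be illustrated outside of Bats with a small Python sketch. It is not the code from this commit: `wait_for_state` and its parameters are hypothetical names, and the state getter stands in for whatever query the test uses to read the service state.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str], desired: str,
                   retries: int = 5, delay: float = 1.0) -> bool:
    """Poll get_state() until it returns `desired` or retries run out.

    Hypothetical helper for illustration; not part of the podman test suite.
    """
    for attempt in range(retries):
        if get_state() == desired:
            return True
        # Do not sleep after the final attempt; nothing is left to wait for.
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

Under load, a service may report a transitional state for a few polls before settling, so the caller only fails once all retries are used up.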

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:22:48 -06:00
6dcda2196a vendor latest c/common
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 19:21:50 +02:00
04d193daa9 Merge pull request #23987 from edsantiago/safename-090
CI: make 090-events parallel-safe
2024-09-18 16:06:31 +00:00
bef0aabbdd Merge pull request #23995 from Luap99/netns-leak
CI: netns leak checks for system and e2e
2024-09-18 15:49:59 +00:00
7fee222d52 Merge pull request #23997 from Luap99/expose-sctp
allow exposed sctp ports
2024-09-18 15:08:45 +00:00
f580ae0d19 Merge pull request #23985 from Luap99/wait-hang
wait: fix handling of multiple conditions with exited
2024-09-18 12:26:28 +00:00
6fe832d5d6 CI: make 200-pod parallel-safe
...as much as possible. Not all tests can be parallelized.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 06:25:18 -06:00
d7335855d7 allow exposed sctp ports
There is no reason to disallow exposed sctp ports at all. As root we can
publish them fine and as rootless it should error later anyway.

And for the case mentioned in the issue it doesn't make sense as the
port is not even published, thus it is just part of the metadata, which is
totally fine in all cases.

Fixes #23911

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 14:24:45 +02:00
755a06aa44 test/e2e: add netns leak check
Like we do in system tests, now check for netns leaks in e2e as well.
Because things run in parallel and this dir is shared, we cannot check
after each test, only once per suite. This will be a PITA to debug if
leaks happen, as the netns files do not contain the container ID and are
just random bytes (maybe we should change this?)

Fixes #23715

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 14:05:26 +02:00
2d469e517d test/system: netns leak check for rootless as well
This fixes the problem where even as root we check the netns files from
root. But in order to catch any rootless bugs we must check the rootless
files from $XDG_RUNTIME_DIR/netns.
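
A leak check of this kind boils down to comparing directory listings taken before and after a run. The Python sketch below shows that comparison only; `list_netns` and `leaked_netns` are hypothetical helper names, not functions from the test suite, and the directory to inspect is passed in by the caller.

```python
import os


def list_netns(netns_dir: str) -> set[str]:
    """Return the file names in netns_dir, or an empty set if it is missing."""
    try:
        return set(os.listdir(netns_dir))
    except FileNotFoundError:
        return set()


def leaked_netns(before: set[str], after: set[str]) -> list[str]:
    """Names present after the run that were not there before it."""
    return sorted(after - before)
```

For the rootless case, the directory would be built as `os.path.join(os.environ["XDG_RUNTIME_DIR"], "netns")`.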

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 12:07:11 +02:00
5468718f22 CI: make 090-events parallel-safe
...or at least as much as possible. Some tests cannot
be run in parallel due to #23750: "--events-backend=file"
does not actually work the way a naïve user would intuit.
Stop/die events are asynchronous, and can be gathered
by *ANY OTHER* podman process running after it, and if
that process has the default events-backend=journal,
that's where the event will be logged. See #23987 for
further discussion.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 18:21:58 -06:00
62c101651f Merge pull request #23857 from rhatdan/run
Remove containers/common/pkg/config from pkg/util
2024-09-17 20:31:28 +00:00
1e9464c9b4 Merge pull request #23937 from edsantiago/test-crun-17
New VMs: test crun 1.17
2024-09-17 20:28:43 +00:00
4dfff40840 Merge pull request #23989 from edsantiago/enable-bats-parallel
CI: system tests: enable parallel tests
2024-09-17 19:30:57 +00:00
75369fd283 Merge pull request #23986 from mheon/fix_23981
Match output of Compat Top API to Docker
2024-09-17 19:06:13 +00:00
f29901ef1b Merge pull request #23983 from nalind/manifest-remove-docs
podman-manifest-remove: update docs and help output
2024-09-17 18:52:30 +00:00
d0642ca913 Merge pull request #23988 from edsantiago/safename-012
CI: make 012-manifest parallel-safe
2024-09-17 18:00:13 +00:00
8402b6535f Misc minor test fixes
...for dealing with flakes in parallel mode

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
7fcf94d7b5 Add network namespace leak check
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
b3da5be2b1 Add workaround for buildah parallel bug
Need --layers=false in podman build, otherwise a buildah race
can trigger "layer not known" failures:

   https://github.com/containers/buildah/issues/5674

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
5fc3de5583 registry: lock start attempts
When running parallel, multiple tests could be trying to start
the registry at once. Make this parallel-safe.
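
One common way to serialize start attempts is an exclusive file lock plus a marker file. The Python sketch below assumes that approach; the commit may do it differently, and `start_once`, the lock path, and the marker path are all hypothetical.

```python
import fcntl
import os
from typing import Callable


def start_once(lock_path: str, marker_path: str,
               start: Callable[[], None]) -> bool:
    """Run start() only if no earlier caller has already done so.

    Returns True if this call performed the start, False otherwise.
    Hypothetical helper; Unix only because it relies on flock.
    """
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if os.path.exists(marker_path):
                return False
            start()
            # Record success so later callers skip the start.
            with open(marker_path, "w"):
                pass
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Checking the marker only while holding the lock is what closes the window in which two tests could both decide to start the registry.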

Also, use a safer port range for the registry. Something
outside of /proc/sys/net/ipv4/ip_local_port_range
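
The kernel file holds two integers, the low and high end of the ephemeral port range. A rough Python sketch of choosing a port outside that range follows; the function names and the 1024 to 65535 bounds are assumptions made for illustration, not values taken from the commit.

```python
import random


def parse_port_range(text: str) -> tuple[int, int]:
    """Parse the two integers in ip_local_port_range, e.g. '32768\t60999'."""
    low, high = text.split()
    return int(low), int(high)


def pick_port_outside(low: int, high: int,
                      floor: int = 1024, ceiling: int = 65535) -> int:
    """Pick a random non-privileged port that is not in [low, high]."""
    candidates = [p for p in range(floor, ceiling + 1) if p < low or p > high]
    if not candidates:
        raise ValueError("no port available outside the local port range")
    return random.choice(candidates)
```

A port outside the ephemeral range cannot be handed out by the kernel to an unrelated outgoing connection, which removes one source of "address already in use" flakes.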

Sorry, I'm including a FIXME section that I haven't investigated
deeply enough.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
bf6131780a Update system test template and README
Add a few best-practices examples, and add a whole section
describing the dos and don'ts of writing parallel-safe tests.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
6502e30cfd bats log: differentiate parallel tests from sequential
For tests run in parallel, show file number as |nnn| (vs [nnn])

Teach logformatter to distinguish the two, adding 'p' to anchors
in parallel tests. Necessary because in this scheme we run bats
twice, thus see 'ok 1' twice, and we want to differentiate them.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
6b621d9571 ci: bump system tests to fastvm
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:36 -06:00
bcffa9ce30 clean_setup: create pause image
Workaround for #23292, where simultaneous 'pod create' commands
will all start a podman-build of the pause image, but only
one of them will be tagged, and the others will leak <none>
images.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:36 -06:00
812c7e9436 CI: make 012-manifest parallel-safe
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 10:35:01 -06:00
00c13afcb9 podman-manifest-remove: update docs and help output
* podman manifest remove doesn't accept references as descriptions of
  what to remove from a list or index; only use digests in the man page
* podman manifest remove only removes one thing at a time; correct the
  man page examples

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2024-09-17 11:36:12 -04:00
aa108924ea test/system: remove wait workaround
The issue is closed and I recently fixed a number of races (bf74797c69)
in the remote attach API that sound exactly like the same error
that was mentioned in issue #9597.

As such I think this works; if it starts flaking again we can revert this
or better fix the actual bug.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-17 17:35:18 +02:00
fbed3a01d2 wait: fix handling of multiple conditions with exited
As it turns out, things are not so simple after all...
In podman-py it was reported[1] that waiting might hang. Per our docs, wait
on multiple conditions should exit once the first one is hit, not all
of them. However, because the new wait logic never checked if the context
was cancelled, the goroutine kept running until conmon exited, and because
we used a waitgroup to wait for all of them to finish, it blocked until
that happened.

First we can remove the waitgroup as we only need to wait for one of
them anyway via the channel. While this alone fixes the hang, it would
still leak the other goroutine. As there is no way to cancel a goroutine,
all the code must check for a cancelled context in the wait loop to not
leak.
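
The same pattern can be sketched in Python as an analogy, with threads in place of goroutines and a `threading.Event` in place of the cancelled context. This is not the libpod code; `wait_first` and its arguments are hypothetical.

```python
import queue
import threading
import time
from typing import Callable


def wait_first(conditions: dict[str, Callable[[], bool]],
               poll: float = 0.01) -> str:
    """Return the name of the first condition that becomes true."""
    cancel = threading.Event()
    results: queue.Queue = queue.Queue()

    def waiter(name: str, check: Callable[[], bool]) -> None:
        # Each waiter checks the cancel flag on every pass of its loop;
        # without that check it would keep running after the caller returned.
        while not cancel.is_set():
            if check():
                results.put(name)
                return
            time.sleep(poll)

    threads = [threading.Thread(target=waiter, args=(name, check))
               for name, check in conditions.items()]
    for t in threads:
        t.start()
    first = results.get()   # wait for one result only, not for all waiters
    cancel.set()            # tell the remaining waiters to stop
    for t in threads:
        t.join()            # returns promptly because waiters honor cancel
    return first
```

The caller blocks on a single result rather than on every waiter, and the cancel flag is what lets the remaining waiters exit instead of leaking.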

Fixes 8a943311db ("libpod: simplify WaitForExit()")
[1] https://github.com/containers/podman-py/issues/425

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-17 17:35:17 +02:00
e04668c8ca Match output of Compat Top API to Docker
We were only splitting on tabs, not spaces, so we returned just a
single line most of the time, not an array of the fields in the
output of `ps`. Unfortunately, some of these fields are allowed
to contain spaces themselves, which makes things complicated, but
we got lucky in that Docker took the simplest possible solution
and just assumed that only one field would contain spaces and it
would always be the last one, which is easy enough to duplicate
on our end.
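
The splitting rule described here (split on any whitespace, let only the last field keep its spaces) can be illustrated with a short Python sketch. It is not the Go code from this commit, and `parse_ps_output` is a hypothetical name.

```python
def parse_ps_output(output: str) -> tuple[list[str], list[list[str]]]:
    """Split ps-style output into header titles and per-process fields.

    The number of header titles fixes the number of fields per row, so
    any extra spaces end up in the last field (typically the command).
    """
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if not lines:
        return [], []
    titles = lines[0].split()
    # maxsplit = len(titles) - 1 leaves the remainder of the line intact.
    rows = [ln.split(None, len(titles) - 1) for ln in lines[1:]]
    return titles, rows
```

For the input `"PID USER COMMAND\n1 root sleep 100\n"` this yields the titles `PID`, `USER`, `COMMAND` and one row whose last field is `sleep 100`.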

Fixes #23981

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-09-17 11:34:22 -04:00
d571ca6536 system test parallelization: enable two-pass approach
For the past two months we've been splitting system tests
into two categories: those that CAN be run in parallel,
and those that CANNOT. Much work has been done to replace
hardcoded names (mycontainer, mypod) with safename().
Hundreds of test runs, in CI and on Ed's laptop, have
proven this approach viable.

make {local,remote}system now runs in two steps: first
the serial ones, then the parallel ones. hack/bats will
now recognize the 'ci:parallel' tag and add --jobs (nprocs).

This requires some tweaking of leak_check, because there
can be umpteen tests running (affecting image/container/pod/etc
state) when any given test completes.

Rules for enabling parallelization in tests:

   * use unique container/pod/volume/network names (safename)
   * do not run 'podman rm -a' or 'rmi -a'
   * never use the -l (--latest) option
   * do not run 'podman ps/images' and expect precise output
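
The first rule is the one that replaces hardcoded names. As a rough illustration of the idea only, a unique-name helper could look like the Python sketch below; the real safename() lives in the Bats helpers, and the name format used here is an assumption.

```python
import secrets


def safename(file_number: str, suffix: str = "") -> str:
    """Return a name that is unique per call, prefixed with the test file number.

    Hypothetical format: t<file number>-<random hex>[-<suffix>].
    """
    name = f"t{file_number}-{secrets.token_hex(4)}"
    return f"{name}-{suffix}" if suffix else name
```

Because every call yields a different name, two tests running at the same time never fight over a container or pod called `mycontainer` or `mypod`.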

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 09:25:02 -06:00
f4a08f46b7 Merge pull request #23959 from auyer/hide-secrets-from-container-inspect
Hide secrets from container inspect command
2024-09-17 13:00:18 +00:00