1745 Commits

Author SHA1 Message Date
d571ca6536 system test parallelization: enable two-pass approach
For the past two months we've been splitting system tests
into two categories: those that CAN be run in parallel,
and those that CANNOT. Much work has been done to replace
hardcoded names (mycontainer, mypod) with safename().
Hundreds of test runs, in CI and on Ed's laptop, have
proven this approach viable.

make {local,remote}system now runs in two steps: first
the serial ones, then the parallel ones. hack/bats will
now recognize the 'ci:parallel' tag and add --jobs (nprocs).

This requires some tweaking of leak_check, because there
can be umpteen tests running (affecting image/container/pod/etc
state) when any given test completes.

Rules for enabling parallelization in tests:

   * use unique container/pod/volume/network names (safename)
   * do not run 'podman rm -a' or 'rmi -a'
   * never use the -l (--latest) option
   * do not run 'podman ps/images' and expect precise output

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 09:25:02 -06:00
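The naming rule in d571ca6536 can be illustrated with a small sketch. This is not the suite's safename helper, whose implementation is not shown in this log; unique_name and its name format are assumptions made up for illustration, and the podman commands are only echoed, not run.

```shell
# Hypothetical stand-in for the suite's safename helper; the real one may differ.
unique_name() {
    # $1: short prefix for the kind of object, e.g. "c" for a container
    # 4 random bytes from /dev/urandom, rendered as 8 hex digits
    suffix=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
    printf '%s-%s\n' "$1" "$suffix"
}

cname=$(unique_name c)
pname=$(unique_name pod)

# Each test creates and removes only the objects it named itself,
# instead of relying on 'rm -a' or '--latest'.
echo "podman run -d --name $cname ..."
echo "podman rm -f $cname"
echo "podman pod create --name $pname"
```

Because every name is unique to one test, cleanup can target exactly that name, which is what makes 'rm -a' and '-l' unnecessary.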
a4794bc9c6 Merge pull request #23977 from giuseppe/fix-permissions-copyup-volume-userns
libpod: convert owner IDs only with :idmap
2024-09-17 12:46:32 +00:00
432325236b libpod: convert owner IDs only with :idmap
convert the owner UID and GID into the user namespace only when
":idmap" mount is used.

This changes the behaviour of :idmap with an empty volume.  Now the
existing directory ownership is copied up as in the other case.

Closes: https://github.com/containers/podman/issues/23347

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-17 12:38:53 +02:00
c6616004f1 CI: make 260-sdnotify parallel-safe
Use safename. Add ci:parallel tags. Do not remove pause image
nor kube network.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-16 05:04:24 -06:00
d4cda112f1 Merge pull request #23921 from edsantiago/safename-710
CI: make 710-kube parallel-safe
2024-09-13 12:41:54 +00:00
421a80bcb7 Merge pull request #23908 from edsantiago/safename-505
CI: make 505-pasta parallel safe
2024-09-13 12:39:11 +00:00
29f75000dd Merge pull request #23916 from edsantiago/safename-320
CI: mark 320-system-df *NOT* parallel safe
2024-09-13 12:33:41 +00:00
7764bea981 Merge pull request #23819 from l0rd/kube-play-image-type-volumes
Add `kube play` support for volumes of type image
2024-09-11 18:32:24 +00:00
e61682f50e CI: make 710-kube parallel-safe
Use safename. Add ci:parallel tags. Use a random port, not
hardcoded 9999. Do not remove pause image. And especially
do not "rm -a" anything.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-10 14:49:14 -06:00
c38c197c71 Merge pull request #23907 from edsantiago/safename-020
CI: make 020-tag parallel-safe
2024-09-10 19:09:45 +00:00
0ff89a00af CI: mark 320-system-df *NOT* parallel safe
...because it requires 100% control and knowledge of the
state of all images, containers, and volumes.

Use safename anyway, just in case we ever have a leak from here.
I'm finding safename sooooooo helpful when reading journal.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-10 08:48:44 -06:00
db12343e27 Add kube play support for image volume source
Signed-off-by: Mario Loriedo <mario.loriedo@gmail.com>
2024-09-10 12:37:06 +00:00
22ec8ea06d CI: make 505-pasta parallel safe
Add ci:parallel tags; move one non-parallel-safe test to
another networking-test file; and a few drive-by fixes

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-09 14:02:48 -06:00
18932e0339 CI: make 020-tag parallel-safe
Use safename, with guaranteed-adjacent image names

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-09 13:37:57 -06:00
a165289574 CI: make 410-selinux parallel-safe
Use safename for containers and pods. Add ci:parallel tags.
And reenable distro-integration tests that had been skipped
due to a container-selinux bug that is now fixed.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-09 13:09:37 -06:00
649730c9a1 Merge pull request #23887 from Luap99/sort-tags
podman images: sort repository with tags
2024-09-09 16:39:15 +00:00
a1e6603133 libpod: make use of new pasta option from c/common
pasta added a new --map-guest-addr option that maps an address to the actual
host IP. This is exactly what we need for the host.containers.internal
entry. So we now make use of this option by default but still have to
keep the exclude fallback because the option is very new and some
users/distros will not have it yet.

This also fixes an issue where the --dns-forward IP was not used in
bridge network mode. That only matters when not using aardvark-dns,
which already used the proper IPs from the rootless netns
resolv.conf file.

Fixes #19213

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-06 14:43:18 +02:00
0abbcfa50a podman images: sort repository with tags
When sorting by repository, a user most likely wants the tags to be
sorted as well, at the very least to get stable output, as the order
could be changed by podman tag/pull even if they keep using the same
tag name.

Fixes #23803

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-06 14:17:17 +02:00
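The ordering described in 0abbcfa50a (repository first, tag as tie-breaker) can be sketched on plain text. Podman implements this in Go; the rows and the sort_repo_then_tag function below are made up purely to show the two-key ordering.

```shell
# Made-up "REPOSITORY TAG" rows, deliberately out of order.
rows='quay.io/b/img v2
quay.io/a/img latest
quay.io/b/img v1
quay.io/a/img 1.0'

sort_repo_then_tag() {
    # Primary key: field 1 (repository); tie-break: field 2 (tag).
    # LC_ALL=C makes the byte ordering independent of the user's locale.
    LC_ALL=C sort -k1,1 -k2,2
}

printf '%s\n' "$rows" | sort_repo_then_tag
# quay.io/a/img 1.0
# quay.io/a/img latest
# quay.io/b/img v1
# quay.io/b/img v2
```

With only the first key, the relative order of the two quay.io/b/img rows would depend on input order, which is the instability the commit message describes.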
bdf96e7df2 Add support for Job to kube generate & play
The kube generate command can now generate a yaml for
the Job kind and the kube play command can create a pod
and containers with podman when passed in a Job yaml.
Add relevant tests and docs for this.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2024-09-05 11:36:38 -04:00
296385459a Merge pull request #23856 from edsantiago/safename-055
CI: make 055-rm parallel-safe
2024-09-04 11:31:54 +00:00
5b6fe4454b Merge pull request #23854 from edsantiago/safename-125
CI: make 125-import parallel-safe
2024-09-04 09:55:39 +00:00
958ee481c1 Merge pull request #23851 from edsantiago/parallelize-low-hanging-fruit
CI: system tests: parallelize low-hanging fruit
2024-09-04 09:47:23 +00:00
a9532c2c67 Merge pull request #23853 from edsantiago/safename-110
CI: make 110-history parallel-safe
2024-09-04 09:44:38 +00:00
7b019e9905 CI: make 055-rm parallel-safe
Use safename, and add ci:parallel tags to all tests. (One
test was running "podman wait -l", which cannot work in
parallel. I choose to change it to "wait $cname", and
lose the -l testing)

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-03 14:51:43 -06:00
e5624510ce CI: make 130-kill parallel-safe
Where possible, use safename and add ci:parallel tags.

One test runs "podman kill -a", which would be unwise to run
in parallel with other tests.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-03 14:46:55 -06:00
f38953c156 CI: make 125-import parallel-safe
Add a bunch of safenames, and ci:parallel tags, and one
workaround for a buildah parallelization bug

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-03 14:44:27 -06:00
0e1ac9cee1 CI: make 110-history parallel-safe
Add ci:parallel tags for Bats, and tweak one test to be safe

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-03 14:41:46 -06:00
bca7c20530 CI: system tests: parallelize low-hanging fruit
Add 'ci:parallel' tags to a few easy places. And, two
small easily-reviewed safename or random-port additions.

These have been working fine in #23275. I want to stop
carrying them there so I can work on simplifying my PR.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-03 14:22:01 -06:00
abea5ad4ac CI: parallel-safe network system test
- replace random_string with safename in container/network names
- add ci:parallel tags where possible.
  - where not possible, add explanations
- fix a userns leak

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-29 13:08:57 -06:00
678323efd8 CI: flake workaround: ignore socat waitpid warnings
Workaround (NOT A FIX) for pasta issue #23482, wherein
podman logs includes a waitpid: ESRCH warning. Consensus
seems to be that this is a bug in socat.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-27 11:25:08 -06:00
11547942b1 CI: parallel-safe userns test
- use safename
- add ci:parallel tags where possible
  - where not possible, document why

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-22 08:24:14 -06:00
68efa7e3a1 CI: parallel-safe run system test
- fix a few missing safenames
- eliminate 'container rm -a'
- when running ps, do substring match, not exact
- where possible, add ci:parallel tags
  - when not possible, explain

Also, fix a completely broken inspect test

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-22 05:58:52 -06:00
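The "substring match, not exact" point in 68efa7e3a1 can be sketched as follows. The ps output and container names are invented, and has_name is a hypothetical helper, not one from the test suite.

```shell
# Made-up names column from "podman ps" during a parallel run:
# containers from other tests are mixed in with ours.
ps_output='t12-abc123-other
t7-k3j9x2-mine
t31-zz81qq-another'

mine='t7-k3j9x2-mine'

# Hypothetical helper: succeed if the name appears anywhere in the output.
has_name() {
    printf '%s\n' "$1" | grep -qF -- "$2"
}

# Fragile: only true when no other test has containers running.
if [ "$ps_output" = "$mine" ]; then echo "exact match"; fi

# Parallel-safe: true as long as our container is listed somewhere.
if has_name "$ps_output" "$mine"; then echo "found $mine"; fi
```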
9c3921ca58 CI: parallel-safe namespaces system test
An easy one :-)

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-21 05:36:04 -06:00
458ba5a8af Fix podman stop and podman run --rmi
This started off as an attempt to make `podman stop` on a
container started with `--rm` actually remove the container,
instead of just cleaning it up and waiting for the cleanup
process to finish the removal.

In the process, I realized that `podman run --rmi` was rather
broken. It was only done as part of the Podman CLI, not the
cleanup process (meaning it only worked with attached containers)
and the way it was wired meant that I was fairly confident that
it wouldn't work if I did a `podman stop` on an attached
container run with `--rmi`. I rewired it to use the same
mechanism that `podman run --rm` uses, so it should be a lot more
durable now, and I also wired it into `podman inspect` so you can
tell that a container will remove its image.

Tests have been added for the changes to `podman run --rmi`. No
tests for `stop` on a `run --rm` container as that would be racy.

Fixes #22852
Fixes RHEL-39513

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-08-20 09:51:18 -04:00
80639df27a podman wait: allow waiting for removal of containers
By default wait only waits for the exit of a container; there is really
no way to make it wait for the removal too when the container was
created with --rm. I thought I had found a clever way in 8a943311db, but
it is not race-free. While it works most of the time, any other
parallel process might call syncContainer() before the cleanup process
takes the lock that it holds until removal. As such, the wait hack to only
update the state and not sync the exit file did not work, so we can drop that.

However, the test wants to wait for the removal to happen by the cleanup
process. We can already say --condition=removing to do this, but that
throws an error if the ctr was removed instead of counting it as
success, so fix that as well.

Fixes #23640

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-16 15:44:02 +02:00
8c132cc388 Merge pull request #23595 from edsantiago/parallel-safe-random-free-port
CI: system tests: make random_free_port() parallel-safe
2024-08-16 11:15:09 +00:00
f69ede1138 Merge pull request #23636 from edsantiago/safename-252
CI: quadlet tests: make parallel-safe
2024-08-16 08:30:06 +00:00
480d43748a CI: quadlet tests: make parallel-safe
The usual, safename instead of hardcoded names or random_string.
And remove some rmi statements: we no longer clean up pause_image.

Been working great in #23275 all week.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-15 10:56:51 -06:00
420bd16a21 CI: system tests: make random_free_port() parallel-safe
...by using a crude port lock-and-reserve mechanism. This is
a small cherrypick from code that has been working in #23275
over dozens of CI runs. Am separating out into a small PR
because it's stable, harmless to serial runs, and will
simplify the eventual review of #23275.

Closes: #23488

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-15 10:04:51 -06:00
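A "lock-and-reserve" scheme like the one 420bd16a21 mentions could look roughly like this. It is a sketch only: reserve_port, PORT_LOCK_DIR and the lock layout are assumptions, not the code in random_free_port(), and a real helper would also have to check that the port is not already bound.

```shell
# Hypothetical sketch; names and lock layout are assumptions.
PORT_LOCK_DIR=${PORT_LOCK_DIR:-$(mktemp -d)}

reserve_port() {
    # $1, $2: inclusive port range to try
    port=$1
    while [ "$port" -le "$2" ]; do
        # mkdir is atomic: only one concurrent caller can create the directory,
        # so only one test can claim a given port.
        if mkdir "$PORT_LOCK_DIR/$port" 2>/dev/null; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1
}

p=$(reserve_port 5000 5999) && echo "reserved $p"
# Release the reservation when the test is done.
rmdir "$PORT_LOCK_DIR/$p"
```

A directory is used as the lock because its creation either succeeds or fails as a single step, with no window between checking and claiming.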
1a1d2646df CI: format test: make parallel-safe
Use safename instead of hardcoded object names. Requires moving
a test table down, into the function itself instead of global,
because the table needs to know object names.

Also: sneak in a workaround for dealing with quay flakes (in
image search). The local registry is allowing almost all tests
to pass even when quay is down, but this one test still needs
to hit quay.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-15 08:34:26 -06:00
b6beed9f76 test/system: fix network cleanup restart test
Now that on-failure exits right away, the test is racy: the
RestartCount is not at the value we expect because the container is still
restarting in the background. As such, add a timer-based approach.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
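A timer-based check of the kind b6beed9f76 refers to generally means polling until a condition holds or a deadline passes. The sketch below shows that pattern only; wait_until is a hypothetical helper and is not taken from the test suite.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the given number of seconds has been used up.
wait_until() {
    # $1: timeout in seconds; remaining args: command to poll
    timeout=$1
    shift
    while [ "$timeout" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Example: poll for a condition that is already true.
if wait_until 5 test -d /; then echo "condition met"; fi
```

In a test, the polled command would compare the observed value (for example the restart count) with the expected one, rather than asserting it once immediately.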
8a943311db libpod: simplify WaitForExit()
The current code did several complicated state checks that simply do not
work properly on a fast-restarting container. It uses a special case for
--restart=always but forgot to take care of --restart=on-failure, which
always hangs for 20s until it runs into the timeout.

The old logic also used to call CheckConmonRunning() but synced the
state beforehand, which means it may check a new conmon every time and
thus miss exits.

To fix this, the new code is much simpler: check the conmon PID and, if it
is no longer running, check the exit file and get the exit code.

This is related to #23473 but I am not sure if this fixes it because we
cannot reproduce.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
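The flow described in 8a943311db (watch the monitor PID, then read the exit file) can be sketched in shell. This is an illustration of the idea only: libpod does this in Go, and wait_for_exit, its arguments and the exit-file format here are assumptions.

```shell
# Hypothetical sketch; the function name and file format are assumptions.
wait_for_exit() {
    # $1: PID of the monitor process, $2: path to the exit file
    # kill -0 sends no signal; it only reports whether the PID still exists.
    while kill -0 "$1" 2>/dev/null; do
        sleep 1
    done
    # The monitor is gone: the exit file, if present, holds the exit code.
    if [ -r "$2" ]; then
        cat "$2"
    else
        return 1
    fi
}

# Usage (hypothetical variables):
#   code=$(wait_for_exit "$monitor_pid" "$exit_file")
```

Checking one fixed PID avoids the problem the commit message describes, where re-syncing state could make each check look at a newly started monitor.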
17baab0bf5 Merge pull request #23561 from Luap99/test-pasta-port
test/system: pasta_test_do add explicit port check
2024-08-13 18:04:58 +00:00
a4c6bef65f Merge pull request #23592 from edsantiago/safename-080
CI: 080-pause.bats: make parallel-safe
2024-08-13 10:54:26 +00:00
1bf711e526 Merge pull request #23591 from edsantiago/safename-050
CI: 050-stop.bats: make parallel-safe
2024-08-13 10:51:42 +00:00
0d7e14fb83 healthcheck system check: reduce raciness
When will I learn not to dismiss something as "easy"?

Anyhow, this doesn't actually change anything parallel-wise
but it does reduce a race condition seen on heavily-loaded
slow systems, wherein a container goes into unhealthy before
we want it to. This version isn't perfect; I don't think
there's an ideal fix for this.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:24:37 -06:00
30ee9c0114 CI: healthcheck system test: make parallel-safe
Easy one, just replace "healthcheck_c"

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:23:54 -06:00
36f9a04499 CI: 080-pause.bats: make parallel-safe
Only one test can be parallelized. Do so, and add a comment
to the other one explaining why it can't be.

Also, add some missing error-message checks.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:05:27 -06:00
6656a18c3f CI: 050-stop.bats: make parallel-safe
Very few changes needed, all of them simple.

It is impossible to parallelize this entire file, because "stop -a".
Add tags to tests that can be parallelized, and comments to those
that can't.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:00:09 -06:00
6fce734f42 remote: fix invalid --cidfile + --ignore
When the cidfile does not exist and --ignore is set, the CLI parser skips
the file without error and we call into the backend code without any
names at all. This should logically be a NOP, but on remote it caused all
containers to be returned, which made podman stop stop everything in
this case.

Fixes #23554

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 17:12:12 +02:00
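The bug class fixed in 6fce734f42 is an empty selection being treated as "everything". A minimal sketch of the guard follows; select_containers and ALL_CONTAINERS are invented for illustration and do not reflect podman's remote backend code.

```shell
# Made-up inventory standing in for the containers known to the backend.
ALL_CONTAINERS="c1 c2 c3"

# Hypothetical selector, not podman's code: prints the containers to act on.
select_containers() {
    # $1: "all" or "named"; remaining args: requested names
    mode=$1
    shift
    if [ "$mode" = all ]; then
        # Unquoted on purpose: split the inventory into one name per line.
        printf '%s\n' $ALL_CONTAINERS
        return 0
    fi
    # An empty name list is a no-op; it must never fall through to "all".
    [ "$#" -gt 0 ] || return 0
    printf '%s\n' "$@"
}

select_containers named        # prints nothing
select_containers named c2     # prints c2
```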