Commit Graph

1717 Commits

Author SHA1 Message Date
Ed Santiago
abea5ad4ac CI: parallel-safe network system test
- replace random_string with safename in container/network names
- add ci:parallel tags where possible.
  - where not possible, add explanations
- fix a userns leak

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-29 13:08:57 -06:00
Ed Santiago
678323efd8 CI: flake workaround: ignore socat waitpid warnings
Workaround (NOT A FIX) for pasta issue #23482, wherein
podman logs includes a waitpid: ESRCH warning. Consensus
seems to be that this is a bug in socat.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-27 11:25:08 -06:00
Ed Santiago
11547942b1 CI: parallel-safe userns test
- use safename
- add ci:parallel tags where possible
  - where not possible, document why

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-22 08:24:14 -06:00
Ed Santiago
68efa7e3a1 CI: parallel-safe run system test
- fix a few missing safenames
- eliminate 'container rm -a'
- when running ps, do substring match, not exact
- where possible, add ci:parallel tags
  - when not possible, explain

Also, fix a completely broken inspect test

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-22 05:58:52 -06:00
Ed Santiago
9c3921ca58 CI: parallel-safe namespaces system test
An easy one :-)

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-21 05:36:04 -06:00
Matt Heon
458ba5a8af Fix podman stop and podman run --rmi
This started off as an attempt to make `podman stop` on a
container started with `--rm` actually remove the container,
instead of just cleaning it up and waiting for the cleanup
process to finish the removal.

In the process, I realized that `podman run --rmi` was rather
broken. It was only done as part of the Podman CLI, not the
cleanup process (meaning it only worked with attached containers)
and the way it was wired meant that I was fairly confident that
it wouldn't work if I did a `podman stop` on an attached
container run with `--rmi`. I rewired it to use the same
mechanism that `podman run --rm` uses, so it should be a lot more
durable now, and I also wired it into `podman inspect` so you can
tell that a container will remove its image.

Tests have been added for the changes to `podman run --rmi`. No
tests for `stop` on a `run --rm` container as that would be racy.

Fixes #22852
Fixes RHEL-39513

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-08-20 09:51:18 -04:00
Paul Holzinger
80639df27a podman wait: allow waiting for removal of containers
By default wait only waits for the exit of a container, there is really
no way to make it wait for the removal too when the container was
created with --rm. I thought I found a clever way in 8a943311db but this
is not working race free. While it works most of the time any other
parallel process might call syncContainer() before the cleanup process
holds the lock until it removes it. As such the wait hack to only update
the state and not sync the exit file did not work so we can drop that.

However the test wants to wait for the removal to happen by the cleanup
process and we can already say --condition=removing to do this but this
will throw an error if the ctr was removed instead of counting this as
success so fix that as well.

Fixes #23640

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-16 15:44:02 +02:00
openshift-merge-bot[bot]
8c132cc388 Merge pull request #23595 from edsantiago/parallel-safe-random-free-port
CI: system tests: make random_free_port() parallel-safe
2024-08-16 11:15:09 +00:00
openshift-merge-bot[bot]
f69ede1138 Merge pull request #23636 from edsantiago/safename-252
CI: quadlet tests: make parallel-safe
2024-08-16 08:30:06 +00:00
Ed Santiago
480d43748a CI: quadlet tests: make parallel-safe
The usual, safename instead of hardcoded names or random_string.
And remove some rmi statements: we no longer clean up pause_image.

Been working great in #23275 all week.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-15 10:56:51 -06:00
Ed Santiago
420bd16a21 CI: system tests: make random_free_port() parallel-safe
...by using a crude port lock-and-reserve mechanism. This is
a small cherrypick from code that has been working in #23275
over dozens of CI runs. Am separating out into a small PR
because it's stable, harmless to serial runs, and will
simplify the eventual review of #23275.

Closes: #23488

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-15 10:04:51 -06:00
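A minimal sketch of what a "crude port lock-and-reserve mechanism" could look like, written in Python rather than the BATS helpers the commit actually touches. The function names, the lock directory layout, and the port range are all hypothetical; the only idea taken from the commit message is that a port must be both free and claimed before a test uses it.

```python
import os
import socket


def _port_is_free(port):
    """Return True if nothing is currently bound to 127.0.0.1:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", port))
        except OSError:
            return False
    return True


def reserve_free_port(lock_dir, start=5000, end=5999):
    """Pick a port that is free AND not claimed by another parallel worker.

    Hypothetical helper: the range and lock-file naming are assumptions.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for port in range(start, end + 1):
        lock_path = os.path.join(lock_dir, str(port))
        try:
            # O_CREAT | O_EXCL is atomic: only one process can create the file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # another worker already reserved this port
        os.close(fd)
        if _port_is_free(port):
            return port
        os.unlink(lock_path)  # in use by something else; drop our claim
    raise RuntimeError("no free port in range")


def release_port(lock_dir, port):
    """Give the reservation back once the test is done with the port."""
    os.unlink(os.path.join(lock_dir, str(port)))
```

In a serial run the lock files never collide, which matches the commit's note that the change is harmless to serial runs.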
Ed Santiago
1a1d2646df CI: format test: make parallel-safe
Use safename instead of hardcoded object names. Requires moving
a test table down, into the function itself instead of global,
because the table needs to know object names.

Also: sneak in a workaround for dealing with quay flakes (in
image search). The local registry is allowing almost all tests
to pass even when quay is down, but this one test still needs
to hit quay.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-15 08:34:26 -06:00
Paul Holzinger
b6beed9f76 test/system: fix network cleanup restart test
Now that on-failure exits right away the test is racy as the
RestartCount is not at the value we expect as the container is still
restarting in the background. As such add a timer based approach.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
Paul Holzinger
8a943311db libpod: simplify WaitForExit()
The current code did several complicated state checks that simply do not
work properly on a fast restarting container. It uses a special case for
--restart=always but forgot to take care of --restart=on-failure, which
always hangs for 20s until it runs into the timeout.

The old logic also used to call CheckConmonRunning() but synced the
state before which means it may check a new conmon every time and thus
misses exits.

To fix this, the new code is much simpler: check the conmon pid, and if it
is no longer running, check the exit file and get the exit code.

This is related to #23473 but I am not sure if this fixes it because we
cannot reproduce.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
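The simplified logic described in 8a943311db (watch the conmon pid, then read the exit file) can be sketched roughly as follows. This is an illustration in Python, not the libpod Go code; `pid_running`, `wait_for_exit`, and the plain-integer exit-file format are assumptions made for the example.

```python
import os
import time


def pid_running(pid):
    """Probe a pid with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def wait_for_exit(conmon_pid, exit_file, poll=0.05):
    """Block until the monitor process is gone, then return the exit code.

    Assumes (hypothetically) that the exit file holds a single integer.
    """
    while pid_running(conmon_pid):
        time.sleep(poll)
    with open(exit_file) as f:
        return int(f.read().strip())
```

Because the pid is fixed for the duration of the wait, a fast restart that spawns a new monitor process cannot cause the loop to miss the exit of the old one.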
openshift-merge-bot[bot]
17baab0bf5 Merge pull request #23561 from Luap99/test-pasta-port
test/system: pasta_test_do add explicit port check
2024-08-13 18:04:58 +00:00
openshift-merge-bot[bot]
a4c6bef65f Merge pull request #23592 from edsantiago/safename-080
CI: 080-pause.bats: make parallel-safe
2024-08-13 10:54:26 +00:00
openshift-merge-bot[bot]
1bf711e526 Merge pull request #23591 from edsantiago/safename-050
CI: 050-stop.bats: make parallel-safe
2024-08-13 10:51:42 +00:00
Ed Santiago
0d7e14fb83 healthcheck system check: reduce raciness
When will I learn not to dismiss something as "easy"?

Anyhow, this doesn't actually change anything parallel-wise
but it does reduce a race condition seen on heavily-loaded
slow systems, wherein a container goes into unhealthy before
we want it to. This version isn't perfect; I don't think
there's an ideal fix for this.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:24:37 -06:00
Ed Santiago
30ee9c0114 CI: healthcheck system test: make parallel-safe
Easy one, just replace "healthcheck_c"

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:23:54 -06:00
Ed Santiago
36f9a04499 CI: 080-pause.bats: make parallel-safe
Only one test can be parallelized. Do so, and add a comment
to the other one explaining why it can't be.

Also, add some missing error-message checks.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:05:27 -06:00
Ed Santiago
6656a18c3f CI: 050-stop.bats: make parallel-safe
Very few changes needed, all of them simple.

It is impossible to parallelize this entire file, because "stop -a".
Add tags to tests that can be parallelized, and comments to those
that can't.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:00:09 -06:00
Paul Holzinger
6fce734f42 remote: fix invalid --cidfile + --ignore
When the cidfile does not exist and ignore is set the cli parser skips
the file without error and we call into the backend code without any
names at all. This should logically be a NOP but on remote it caused all
containers to be returned which caused podman stop to stop everything in
this case.

Fixes #23554

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 17:12:12 +02:00
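The bug class fixed in 6fce734f42 is an empty name list being treated as "match everything". A small Python sketch of the intended behaviour, with a hypothetical `select_containers` function standing in for the real backend lookup:

```python
def select_containers(all_containers, names, select_all=False):
    """Resolve which containers an operation such as stop should act on.

    Hypothetical helper. An empty `names` list must select nothing unless
    the caller explicitly asked for all containers.
    """
    if select_all:
        return list(all_containers)
    if not names:
        return []  # NOP: nothing requested, nothing selected
    wanted = set(names)
    return [c for c in all_containers if c in wanted]
```

With this shape, a skipped cidfile leaves `names` empty and the operation does nothing instead of touching every container.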
openshift-merge-bot[bot]
708d6c5e2b Merge pull request #23449 from ygalblum/quadlet-override-service-name
Quadlet override service name
2024-08-12 13:56:48 +00:00
Paul Holzinger
20f3e8909e test/system: pasta_test_do add explicit port check
Do not rely on an arbitrary delay in order to ensure the port was bound
in the container. Instead this approach checks if the port is bound in
the netns and only then starts the client. This speeds up the entire
test file by 50% but more importantly in parallel testing it solves
hangs as the timeout there was unreliable.

Fixes #23471

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 13:46:56 +02:00
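The change in 20f3e8909e replaces a fixed delay with an explicit readiness check. The commit inspects the netns for a bound port; the Python sketch below swaps in a simpler technique, a TCP connect probe with a deadline, purely to illustrate "poll until ready, then start the client". The name `wait_for_port` and its defaults are assumptions.

```python
import socket
import time


def wait_for_port(host, port, timeout=5.0, interval=0.05):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Note: this probes by connecting, which is not the same as checking
    that the port is bound inside a network namespace.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

The caller proceeds as soon as the server is ready, so the common case is faster than any fixed sleep and the slow case fails with a clear timeout instead of a hang.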
Ygal Blum
795851edd3 Quadlet - Allow the user to override the default service name
Add support for the ServiceName key for all unit types
Extend the PodInfo struct into UnitInfo to consolidate all prepopulated data into a single map
Use the NodesInfo map instead of the resourceName
Update the UnitInfo in the convert function instead of returning it
No need to replace extension anymore just remove it
All e2e tests with dependencies on other Quadlet files moved to a separate section
Add the capability of overriding the service name in the test
Add e2e tests for the new functionality
Adjust integration tests
Update the MAN page

Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
2024-08-07 17:50:49 +03:00
Daniel J Walsh
a06a7d7ba8 Should not force conversion of manifest type to DockerV2ListMediaType
Fixes: https://github.com/containers/podman/issues/23163

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-08-07 06:07:46 -04:00
openshift-merge-bot[bot]
61f7db5e7a Merge pull request #23527 from edsantiago/safename-012
CI: manifest system tests: make parallel-safe
2024-08-07 08:25:10 +00:00
openshift-merge-bot[bot]
4109ffa649 Merge pull request #23529 from edsantiago/safename-060
CI: mount system test: make parallel-safe
2024-08-07 08:19:31 +00:00
Ed Santiago
f99c7ead92 CI: mount system test: parallelize
Use safename for containers, volumes, images.

Build a temporary scratch image for podman image mount, so
we can safely mount/umount it (instead of $IMAGE) without
risk of other parallel tests umounting it.

Fixed some oopsies ("$vol1" is empty string, so, NOP test)

And... an experiment. I'm leaving in my 'ci:parallel' tags
and notes, so I don't have to carry them in #23275. This
is harmless, basically just noisy comments. The drawback
is, if for some reason #23275 does not pan out, I'll have
to go back and remove those tags. Right now I'm feeling
pretty comfortable about this parallelization approach tho.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-06 13:28:47 -06:00
Ed Santiago
f9b67cea57 CI: manifest system tests: make parallel-safe
Use safename instead of hardcoded "test"

Start registry once, in setup_file(), instead of requiring
individual tests to do so.

Add explicit --authfile arg to a bunch of places that now need it

Minor cleanup and improvements in test descriptions. I may have
gotten a little carried away here, but if this test ever fails
these additions will make someone's life much easier.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-06 13:07:10 -06:00
Matt Heon
eb7ce80cf9 Create volume path before state initialization
Strictly speaking we don't need the path yet, but it existing
prevents a lot of strangeness in our path-checking logic to
validate the current Podman configuration, as it was the only
path that might not exist this early in init.

Fixes #23515

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-08-06 13:42:09 -04:00
Ed Santiago
bfb42b3b15 CI: completion system test: use safename
Ongoing efforts to make system tests parallel-safe

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-05 05:09:19 -06:00
Giuseppe Scrivano
3ae1568933 libpod: fix volume copyup with idmap
if idmap is specified for a volume, reverse the mappings when copying
up from the container, so that the original permissions are maintained.

Closes: https://github.com/containers/podman/issues/23467

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-08-01 22:49:27 +02:00
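To illustrate what "reverse the mappings" means in 3ae1568933, here is a small Python model of ID-range mappings. The `IDMap` type and both functions are hypothetical and only show the arithmetic; they are not the libpod implementation.

```python
from typing import List, NamedTuple


class IDMap(NamedTuple):
    container_id: int  # first ID as seen inside the container
    host_id: int       # first ID as seen on the host
    size: int          # number of consecutive IDs covered


def reverse(maps: List[IDMap]) -> List[IDMap]:
    """Swap the two sides of every range so the mapping runs backwards."""
    return [IDMap(m.host_id, m.container_id, m.size) for m in maps]


def translate(maps: List[IDMap], ident: int) -> int:
    """Map an ID through the first range that contains it."""
    for m in maps:
        if m.container_id <= ident < m.container_id + m.size:
            return m.host_id + (ident - m.container_id)
    raise ValueError(f"id {ident} is not covered by any mapping")
```

Applying a mapping and then its reverse returns the original ID, which is the property the copy-up needs so that file ownership looks unchanged through the idmapped volume.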
Ed Santiago
83e90a2f5b System tests: leak_test: readable output
BATS teardown logs are unreadable, making it almost impossible
to see tiny "Leaked this-or-that" messages.

Solution: new _run_podman_quiet() helper, replaces run_podman
in a small number of cases within teardown. Clunky, and
duplicative, sorry.

New helper for leak_check, basically spits out warnings (and
bumps error count) if it sees any output whatsoever from
individual "podman XXX ls" commands.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-01 05:39:50 -06:00
openshift-merge-bot[bot]
7c4b1f7aa7 Merge pull request #23431 from edsantiago/clean-up-stray-external
CI: kube test: fix broken external-storage test
2024-08-01 11:30:00 +00:00
openshift-merge-bot[bot]
803ef5c16f Merge pull request #23384 from edsantiago/root-namespace
CI: enable root user namespaces
2024-08-01 10:32:16 +00:00
Paul Holzinger
77081df8cd libpod: bind ports before network setup
We bind ports to ensure there are no conflicts and we leak them into
conmon to keep them open. However we bound the ports after the network
was set up so it was possible for a second network setup to overwrite
the firewall configs of a previous container as it failed only later
when binding the port. As such we must ensure we bind before the network
is set up.

This is not so simple because we still have to take care of
PostConfigureNetNS bool in which case the network set up happens after
we launch conmon. Thus we end up with two different conditions.

Also it is possible that we "leak" the ports that are set on the
container until the garbage collector will close them. This is not
perfect but the alternative is adding special error handling on each
function exit after prepare until we start conmon which is a lot of work
to do correctly.

Fixes https://issues.redhat.com/browse/RHEL-50746

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-30 14:39:08 +02:00
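The ordering argument in 77081df8cd can be shown with a short Python sketch: bind every published port first, and only call the network setup once all binds have succeeded. `start_container` and the `setup_network` callback are hypothetical stand-ins, and unlike the commit, this sketch closes the sockets explicitly on failure rather than leaving them to the garbage collector.

```python
import socket


def start_container(ports, setup_network):
    """Bind all ports, then run network setup; return the held sockets.

    If any bind fails, setup_network is never called, so an existing
    container's firewall rules cannot be overwritten by a doomed start.
    """
    held = []
    try:
        for port in ports:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            held.append(s)
            s.bind(("127.0.0.1", port))
    except OSError:
        for s in held:
            s.close()
        raise
    setup_network()
    return held
```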
Ed Santiago
396961069c CI: kube test: fix broken external-storage test
I broke the kube external storage test in the course of my
safename PR: _write_test_yaml() with no command generated
a pod that did not trigger the conditions required for
this test.

Solution: run a container (top). Add new checks to prevent
this gap from happening again.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-29 12:23:35 -06:00
Ed Santiago
7bb3b83c17 CI: enable root user namespaces
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-27 23:23:29 +02:00
Yiqiao Pu
a18bd3e9c0 Add test steps for automount with multi images
These test steps check the automount feature with multiple images for
the following items:
  1. multiple images can be automounted with a yaml file.
  2. if the same path exists in more than one image, the last one
should win.
  3. the volume is mounted readonly in the container.
  4. the volumes are only mounted in the specific container, not
in the whole pod.

Signed-off-by: Yiqiao Pu <ypu@redhat.com>
2024-07-26 15:56:33 +08:00
Ed Santiago
25fffdb74f CI: cp tests: use safename
Continuing efforts to make system tests parallel-safe

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-25 11:55:38 -06:00
Ed Santiago
fd0ff9060f CI: 700-play: fix a leaked non-safename
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-25 05:49:42 -06:00
openshift-merge-bot[bot]
1da89dd180 Merge pull request #23249 from giuseppe/play-kube-userns-fixes
kube generate/play restores the user namespace configuration
2024-07-24 17:34:59 +00:00
Giuseppe Scrivano
d9c2806461 test: check that kube generate/play restores the userns
validate that a "podman generate" and "podman play" cycle restores the
specified user namespace.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-24 17:36:38 +02:00
Paul Holzinger
2e20681f05 test/system: fix borken pasta interface name checks
The tests didn't check anything actually because default_ifname requires
an ip version argument to work. Thus pasta_iface was empty, add new
checks to prevent this kind of error again.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-24 14:56:30 +02:00
Paul Holzinger
da3edce4e6 test/system: fix bridge host.containers.internal test
The test assumes that if more than 1 ip on the host we should be able to
set host.containers.internal. This however is not how the logic works in
the code. What it actually does is to check all ips in the
rootless-netns and then it knows that it cannot use any of these ips.
This includes any podman bridge ips.

You can reproduce the error when you have only one ipv4 on the host then
run a container as root in the background and run the test:
hack/bats --rootless 505:host.containers.internal

So the failure here was that there was already a podman container
running as root on the default bridge thus the test saw 2 ips but then
the rootless run also uses the same subnet for its bridge and the code
knew that ip would not work either. I could have made another special
condition in test but the better way to work around it is to create a
new network. A new network will make sure there are no conflicting
subnets assigned so the test will pass.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-24 14:52:53 +02:00
openshift-merge-bot[bot]
c804f10686 Merge pull request #23378 from edsantiago/systest-fixes
CI: system tests: instrument to allow failure analysis
2024-07-24 08:29:49 +00:00
openshift-merge-bot[bot]
7b59ad8681 Merge pull request #23380 from edsantiago/safename-log-test
CI: system log test: use safe names
2024-07-24 05:53:01 +00:00
Ed Santiago
64f2d85e4f CI: system log test: use safe names
Continuing efforts on making system tests parallel-safe by
using unique names for containers and pods.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-23 14:47:02 -06:00
Ed Santiago
b61667470c CI: system tests: instrument to allow failure analysis
Two tests failing in gating but never CI; add some debug
instrumentation to make it possible to find out what
is going on

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-23 12:58:58 -06:00