Commit Graph

1717 Commits

Author SHA1 Message Date
Daniel J Walsh
7768cf235e Run codespell on source
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-07-23 07:28:23 -04:00
Paul Holzinger
5e8884ab0d libpod: correctly capture healthcheck output
Using the scanner is just unnecessary complicated an buggy as it will
not read the final line with a newline. There is also the problem that
it happens in a separate goroutine so it could loose output if we read
the array before the scanner was done.

The API accepts a Writer so we can just directly use a bytes.Buffer
which captures all output in memory without the need of another
goroutine.

This also means that now we always include the final newline in the
output. I checked with docker and they do the same so this is good.

Fixes #23332

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-19 15:16:55 +02:00
openshift-merge-bot[bot]
8a53e8eb67 Merge pull request #23323 from Luap99/machine-decompress-empty
pkg/machine/compression: skip decompress bar for empty file
2024-07-18 17:51:11 +00:00
openshift-merge-bot[bot]
73986f67a3 Merge pull request #23313 from edsantiago/safename-kube-tests
CI: 700-play.bats: huge cleanup, with goal of making parallel-safe
2024-07-18 17:45:40 +00:00
Paul Holzinger
f630eebcfa pkg/machine/compression: skip decompress bar for empty file
When the file is empty it is possible our code panics as bar.ProxyReader
returns nil when the bar is finished which is the case for 0 size as it
doesn't have to read anything from there. However as this happens on
different goroutines it is race and most of the time still works.

To fix this simply skip the progress bar setup for empty files.

While at it fix the deprecated argument in the tests.

Fixes #23281

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-18 13:23:00 +02:00
Ed Santiago
7100ead475 nc -p considered harmful
nmap-ncat has been downgraded on Fedora, to 7.92.
nc -l -p PORT requires 7.95. Switch to nc -l ADDR PORT.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-18 05:15:57 -06:00
Ed Santiago
2f7fd64e98 700-play.bats: use unique pod/container/image/volume names
The end goal is making this test file parallel-safe, by:

  1) Having all tests use unique names for all objects; and
  2) Not doing "rm -a" or "expect ps to be empty".

This commit is not enough to make tests parallel-safe. The
rest of the changes are not relevant for now. This set of
changes is _necessary_ for parallelizing, and is _meaningful_
(good practice) for current linear-testing podman without
introducing any unnecessary cruft.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-17 18:24:07 -06:00
Ed Santiago
380ed3a40d safename: consistent within same test, and, dashes
Make safename() invocations consistent within the same
test. This puts the onus on the caller to add a unique
element when calling multiple times, e.g. "ctr1-$(safename)".
This is not too much of a burden. Major benefit is making
it easy for a reader to associate containers, pods, volumes,
images within a given test.

And, use dashes, not underscores. "podman generate kube"
removes underscores, making it very difficult to do
things like "podman inspect $podname" (because we need
to generate "$podname_with_underscores_removed")

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-17 18:24:07 -06:00
Ed Santiago
6d01ce417d 700-kube.bats: refactor $PODMAN_TMPDIR/test.yaml
Many instances. Simplify by having _write_test_yaml() define
the variable TESTYAML and make it available to callers.
Global replace, with care taken to undo any instances
where _write_test_yaml() is not invoked first.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-17 18:24:07 -06:00
Ed Santiago
987d15a378 700-play.bats: eliminate $testYaml
Get rid of the last two instances of the clunky $testYaml
writing, by adding a 'volume=' arg to _write_test_yaml()

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-17 18:24:07 -06:00
Ed Santiago
48aea083c0 700-play.bats: refactor clumsy yamlfile creation
Remnant from the very early days of this test file. There's
a boilerplate $testYaml string used in many tests; each
use requires three clunky lines of prep. Most of those
were not needed; we can (and now do) use _write_test_yaml()
instead.

There are still two instances that could not be fixed in
this commit. I will do those next. This commit is kept
relatively simple for ease of review.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-17 18:24:07 -06:00
Ed Santiago
517c6e6f10 700-play.bats: move _write_test_yaml up near top
This is almost a NOP; it's needed for making subsequent commits
reviewable.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-17 18:24:07 -06:00
benniekiss
3c52ef43f5 Expand drop-in search paths
* top-level (pod.d)
* truncated (unit-.container.d)

Signed-off-by: Bennie Milburn-Town <63211101+benniekiss@users.noreply.github.com>
2024-07-17 17:43:02 -04:00
Daniel J Walsh
1ec3edd3f6 Do not crash on invalid filters
Vendor in latest containers/common
Fixes #23120

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-07-17 10:44:55 -04:00
Ed Santiago
b28027148b System tests: safe container/image/volume/etc names
Many system tests use hardcoded names for containers, images,
and everything. This has worked because system tests run
serially. It will not work if we ever run in parallel.

Create a new safename() helper, and use it as follows:

   myctr=c_$(safename)
   myvol1=v1_$(safename)
   ...

Find current instances of hardcoded names, and replace
with safe ones.

Whether or not we ever end up parallelizing system tests,
this is simply good practice.

There are far too many instances to fix in one (reviewable) PR.
This is commit 1 of N.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-15 11:38:00 -06:00
Giuseppe Scrivano
6832a35f65 libpod: cleanup store at shutdown
shutdown the containers store so that the home directory mount is not
leaked when "podman system service" exits.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-15 11:41:28 +02:00
openshift-merge-bot[bot]
360c4f372d Merge pull request #23234 from Luap99/test-nftables
test netavark nftables driver
2024-07-11 22:19:32 +00:00
openshift-merge-bot[bot]
58c8803a1e Merge pull request #22726 from edsantiago/pull-from-local-registry
CI: Use local cache registry
2024-07-11 12:42:04 +00:00
Paul Holzinger
5856adb9f8 test/system: fix network reload test with nftables
netavark can use iptables or nftables as firewall driver, thus if we try
to flush rules make sure we try both to keep the test working when we
switch the default to nftables.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-11 14:09:59 +02:00
Paul Holzinger
9945981afb test: remove publish tests from e2e
This test checks a simple publish which is already covered in many other
places, it also used iptables wich is a invalid assumption going forward
as we start to enable nftables as firewall driver.

The only thing these tests added where checking that we cannot resuse
the same port. Given there was more than one kernel regression[1,2]
about correctly failing with EADDRINUSE I also added the
distro-integration tag to make sure we catch this early in fedora
testing.

[1] https://lore.kernel.org/regressions/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/
[2] https://lore.kernel.org/regressions/CAFsF8vL4CGFzWMb38_XviiEgxoKX0GYup=JiUFXUOmagdk9CRg@mail.gmail.com/

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-11 14:08:28 +02:00
openshift-merge-bot[bot]
1125d4d143 Merge pull request #23228 from Luap99/fix-internal-test
test/system: fix pasta host.containers.internal test
2024-07-11 11:22:20 +00:00
Ed Santiago
dd1bcabae9 CI: use local registry, part 2 of 3: fix tests
This commit gets tests working under the new local-registry system:

  * amend a few image names, mostly just sticking to a consistent
    list of those images in our registry cache. Mostly minor
    tag updates.

  * trickier: pull_test: change some error messages, and remove
    a test that's now a NOP. Basically, with a local (unprotected)
    registry we always get "404 manifest unknown"; with a real
    registry we'll get "403 I can't tell you".

  * trickiest: seccomp_test: build our own images at run time,
    with our desired labels. Until now we've been pulling
    prebuilt images, but those will not copy to the local
    cache registry. Something about v1? Anyhow, I gave up
    trying to cache them, and the workaround is straightforward.

Also took the liberty of strengthening a few error-message checks

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-11 04:39:45 -06:00
Ed Santiago
d4c0e7ecbd CI: test composefs on rawhide
Run root e2e & system tests using composefs on rawhide.

Write magic settings to storage.conf. That part is easy.

e2e tests, however, ignore storage.conf. They require everything
to be specified on the command line. And "everything", in the
case of composefs, includes a long complicated --pull-options
string which in turn requires containers-storage PR 1966
which, as of this writing, is finally vendored into podman.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-10 14:51:58 -06:00
Paul Holzinger
34ba26ec52 test/system: fix pasta host.containers.internal test
When a system has one ipv4 and one ipv6 address hostname -I will show
both causing a failure in the case where this is only one address.
To fix this stop using hostname -I and use ip -4 to only list v4
addresses and the use jq to filter the output accordingly.

Fixes #23227

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-09 11:08:52 +02:00
Paul Holzinger
3350cd3eed pkg/rootless: simplify reexec for container code
The code currently tried to avoid joining the userns from conmon
directly and rather joined to only read the pid file and then send this
back to use so we could join the userns. From the comment this was done
because we could not read the pid file. However this is no longer true
as of commit 49eb5af301 and file is no always owned by the real user.

This means we can just remove this special logic and join the namespace
directly there. A test has been added to check the rejoin logic with a
custom uidmapping.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-08 13:28:31 +02:00
openshift-merge-bot[bot]
666ed8f0dc Merge pull request #23189 from edsantiago/system-test-tweaks
System test fixes
2024-07-04 13:04:36 +00:00
Ed Santiago
a181b7bc61 System test fixes
- fix test name to reflect that it's not pasta-only
   (followup from #21563)

 - in one podman-update test run in OpenQA, defer assertion
   failures so we can gather better data on regressions.
   This would've been helpful in diagnosing bz2281805.

 - add an error-message check to one test that needed it
   (found by accident)

 - add distro-integration test tag to a handful of new tests,
   so they run in OpenQA. Found via 'git diff 33891e8 test/system'
   and scanning for '^\+@test '. I only added tests that IMO
   have some risk of interacting poorly with kernel or systemd
   updates, e.g. quadlet, modules, tmpfs+noswap.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-07-04 06:13:02 -06:00
Paul Holzinger
fad1f757cc test/system: fix podman --image-volume to allow tmpfs storage
The test check the the default volume is not on tmpfs, however what it
should really check that the volume is on our container storage fs. It
is possible that users run the storage on top of tmpfs so this test
always failed there.

The better check is to compare the fs from the graphroot and the volume.
Unfortunately, for unknown reasons stat -f -c %T returns UNKNOWN and not
the actual fs. I have no idea why, to work around that we now parse
/proc/mounts manually for the fs. Not nice but at least it works
correctly.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-01 12:51:33 +02:00
openshift-merge-bot[bot]
f62c3ec561 Merge pull request #23083 from Luap99/restore-hosts
restore: fix missing network setup
2024-06-25 11:29:20 +00:00
Yiqiao Pu
31888f23aa test/system: Add test steps for journald log check in quadlet
Add some test steps into quadlet - ContainerName. These steps are
used to ensure the default configuration for quadlets generated
service files is sending stdout/stderr/syslog to the journald.

Signed-off-by: Yiqiao Pu <ypu@redhat.com>
2024-06-25 01:25:04 +08:00
Paul Holzinger
def182d396 restore: fix missing network setup
The restore code path never called completeNetworkSetup() and this means
that hosts/resolv.conf files were not populated. This fix is simply to
call this function. There is a big catch here. Technically this is
suposed to be called after the container is created but before it is
started. There is no such thing for restore, the container runs right
away. This means that if we do the call afterwards there is a short
interval where the file is still empty. Thus I decided to call it
before which makes it not working with PostConfigureNetNS (userns) but
as this does not work anyway today so  I don't see it as problem.

Fixes #22901

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-24 18:52:02 +02:00
openshift-merge-bot[bot]
bf2de4177b Merge pull request #23064 from giuseppe/podman-pass-timeout-stop-to-systemd
container: pass StopTimeout to the systemd slice
2024-06-23 14:57:55 +00:00
openshift-merge-bot[bot]
42a01c0f0c Merge pull request #22967 from rhatdan/build
Remove references to --pull=true and --pull=false
2024-06-21 19:27:36 +00:00
Chris Evich
d53fee511f CI Cleanup: Remove cgroups v1 support
With (esp. Debian) CI VM images built by
https://github.com/containers/automation_images/ pull/338 CI no-longer
tests with runc nor cgroups v1.  Add logic to fail under these
conditions.  Prune back high-level YAML/script envars and logic formerly
required to support these things.

Signed-off-by: Chris Evich <cevich@redhat.com>
2024-06-21 10:08:39 -04:00
Paul Holzinger
4b3890ccac remote: fix incorrect CONTAINER_CONNECTION parsing
When a user specifies a invalid connection in CONTAINER_CONNECTION then
podman should return a proper error saying so. Currently it ignored the
error and in rootFlags() just exited early with defining any flags. This
caused a panic then when trying to use the flags later.

In order to address this first store the connection error in the
PodmanConfig struct and not abort right away during flag setup. This is
important as the user might have specified a flag with a valid remote
connection. As such we check all flags and only when none were given we
return the connection error.

Also while at it I noticed that the default connection reported via
podman --help was wrong as it only used the old containers.conf field
for it and did not consider the podman-connections.json default.

New regression tests have been added to make sure it behaves correctly.

This fixes the problem reported in the PR #22997.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-21 14:01:22 +02:00
Giuseppe Scrivano
7d22f04f56 container: pass KillSignal and StopTimeout to the systemd scope
so that they are honored when systemd terminates the scope.

Closes: https://issues.redhat.com/browse/RHEL-16375

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 13:46:08 +02:00
Ed Santiago
3f785e8735 systests: kube: bump up a timeout
PR #22821 (CI speedup) was overly aggressive in one kube test.
It's flaking. Bump up timeout from 3s to 4.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-06-20 14:37:25 -06:00
Marius Hoch
6dd9abf9ec sqlite_state: Fix RewriteVolumeConfig
The VolumeConfig table does not have an ID column, thus
use the Name column to update it.

Fixes #23052

Signed-off-by: Marius Hoch <mail@mariushoch.de>
2024-06-20 11:39:44 +02:00
Paul Holzinger
4e0cd49148 test/system: check for leaks in teardown suite
At the end of all tests always check for leaks. That should make us more
robust against adding tests at the end that would leak stuff otherwise.

TODO: something seems wrong with bats when returning an error in
teardown_suite(), it prints a warning:
bats warning: Executed <NUM+1> instead of expected <NUM> tests
And also the output is formatted weirdly in this case where the podman
args are split over multiple lines.
But the test fails as expected so I don't think it is a problem.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-18 11:06:51 +02:00
Paul Holzinger
81c90f51c2 test/system: speed up basic_{setup,teardown}()
While these are not really slow they still take about 100-250ms if I
time this locally. Given they are run for every test this adds up
quickly. Looking at CI logs I can see the timings for skipped
tests are all in 600ms range. So I think it is safe to assume that these
functions need to get faster.

We have over 670 test cases currently so we talk about over 400s spend
in these functions in CI. This allows for big gains.

Now overall this is a tricky trade of, while all tests should cleanup
after themselves there is no guarantee for that as such errors can be
leaked into other tests making debugging much harder. To work at least a
bit against this teardown checks if the test was successful and only
skips the podman commands bases on that. Without it a single flake could
cause all following tets to fail.

As such this commit does the proper setup once one suite start then only
after a test failed.

In order for this to work at all we have to fix all leaks first, see
previous commits. And then for the future keep a very strong eye on
this during reviews.

Also add a PODMAN_BATS_LEAK_CHECK option

By default test must cleanup themselves and to speed up CI we no longer
do any cleanup in teardown by default. However there is still many cases
where we might have to debug a leak so add a new PODMAN_BATS_LEAK_CHECK
env option that can be set and should cause teardown to fail if the test
did not cleanup properly.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-18 11:06:50 +02:00
Paul Holzinger
a2352fa3ea test/system: fix up many tests that do not cleanup
All tests should cleanup themselves and not leak stuff.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-18 11:06:49 +02:00
Paul Holzinger
e9c6cd1559 test/system: fix podman --authfile=nonexistent-path
Remove leaking containers and remove unessesary push/pull args. For push
it tries to push an image as argument which makes no sense and for pull
we try to pull argument as image which is also wrong.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-18 11:06:47 +02:00
openshift-merge-bot[bot]
00bcd9aa81 Merge pull request #22733 from nalind/system-check
Add `podman system check`
2024-06-13 10:35:56 +00:00
Daniel J Walsh
64091777fe Remove references to --pull=true and --pull=false
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-06-12 18:16:29 +02:00
openshift-merge-bot[bot]
f79ede86c6 Merge pull request #22914 from Luap99/start-stopped
libpod: do not reuse networking on start
2024-06-11 19:18:55 +00:00
Daniel J Walsh
ad8fc6a74b --squash --layers=false should be allowed
This is the same as what --squash-all is doing, and we already support
--squash with --layers=true since this is the default.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-06-10 13:24:05 -04:00
Paul Holzinger
a9de888a15 libpod: do not resuse networking on start
If a container was stopped and we try to start it before we called
cleanup it tried to reuse the network which caused a panic as the pasta
code cannot deal with that. It is also never correct as the netns must
be created by the runtime in case of custom user namespaces used. As
such the proper thing is to clean the netns up first.

Also change a e2e test to report better errors. It is not directly
related to this chnage but it failed on v1 of this patch so we noticed
the ugly error message it produced. Thanks to Ed for the fix.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-07 17:50:28 +02:00
openshift-merge-bot[bot]
42ffa4db43 Merge pull request #22886 from Luap99/fast-system-test-3
test/system: make some tests faster part 3
2024-06-05 13:19:00 +00:00
Paul Holzinger
e8ea1e7632 libpod: do not leak systemd hc startup unit timer
This fixes a regression added in commit 4fd84190b8, because the name was
overwritten by the createTimer() timer call the removeTransientFiles()
call removed the new timer and not the startup healthcheck. And then
when the container was stopped we leaked it as the wrong unit name was
in the state.

A new test has been added to ensure the logic works and we never leak
the system timers.

Fixes #22884

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 18:03:46 +02:00
Paul Holzinger
350dfabf66 test/system: speed up podman ps --external
The buildah buil kill trick is bad as we have to sleep and wait to aboid
flakes which takes time. Instead it is possible to redo this build part
manually with buildah commands. It is not trival and harder to
understand but it safes 2-3s so I think it is worth it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:01 +02:00