Waiting now actually makes sure to exit on the first container exit.
Also note that it does not wait for --rm to have the container removed
at this point.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We cannot take the volume lock first and then the container locks. Other
code paths always have to lock the container first and then lock the
volumes, i.e. to mount/umount them. As such, locking the volume first
can always result in ABBA deadlocks.
To fix this, move the lock down after the container removal. The removal
code is racy regardless of the lock, as the volume lock on create is no
longer taken since commit 3cc9db8626 due to another deadlock there.
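The ordering rule described above can be illustrated with a minimal Go
sketch. The `container` and `volume` types and both functions are
hypothetical stand-ins, not libpod's actual code; the sketch only shows
one path that takes the container lock before the volume lock, and a
removal path that releases the container lock before taking the volume
lock, so the two paths can never wait on each other in ABBA fashion.

```go
package main

import (
	"fmt"
	"sync"
)

// container and volume are hypothetical types for illustration only.
type container struct {
	mu      sync.Mutex
	removed bool
}

type volume struct {
	mu      sync.Mutex
	mounts  int
	removed bool
}

// mountVolume follows the common order: container lock, then volume lock.
func mountVolume(c *container, v *volume) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v.mu.Lock()
	defer v.mu.Unlock()
	v.mounts++
}

// removeContainerThenVolume never holds the volume lock while waiting for
// the container lock: the container is handled first and its lock is
// released before the volume lock is taken.
func removeContainerThenVolume(c *container, v *volume) {
	c.mu.Lock()
	c.removed = true
	c.mu.Unlock()

	v.mu.Lock()
	defer v.mu.Unlock()
	v.removed = true
}

func main() {
	c, v := &container{}, &volume{}
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); mountVolume(c, v) }()
		go func() { defer wg.Done(); removeContainerThenVolume(c, v) }()
	}
	wg.Wait()
	fmt.Println("mounts:", v.mounts, "container removed:", c.removed, "volume removed:", v.removed)
}
```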
Fixes #23613
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Now that on-failure exits right away, the test is racy: the
RestartCount is not at the value we expect because the container is
still restarting in the background. As such, add a timer-based approach.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Init containers are meant to exit early before other containers are
started. Thus stopping the infra container in such a case is wrong.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The current code did several complicated state checks that simply do not
work properly on a fast restarting container. It used a special case for
--restart=always but forgot to take care of --restart=on-failure, which
always hung for 20s until it ran into the timeout.
The old logic also used to call CheckConmonRunning() but synced the
state before, which means it may check a new conmon every time and thus
miss exits.
To fix this, the new code is much simpler: check the conmon pid, and if
it is no longer running, check the exit file and get the exit code.
This is related to #23473 but I am not sure if this fixes it because we
cannot reproduce it.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
CI will fail if quay is down, but a build-time check does not
help us in any way. It just introduces another pain point
where we have to hit the Rerun button.
Signed-off-by: Ed Santiago <santiago@redhat.com>
By adding the UserKnownHostsFile=/dev/null and CheckHostIP=no options
to the defaults we prevent the user from adding the host key multiple
times and avoid flakes caused by "remote host identification has
changed" errors.
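As a rough illustration, the two options could be prepended to an ssh
command line like this. The `sshArgs` helper and the destination in
`main` are hypothetical and do not reflect how podman builds its ssh
invocation; only the two `-o` option strings come from the text above.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultSSHOptions holds the two options named in the commit message.
var defaultSSHOptions = []string{
	"UserKnownHostsFile=/dev/null",
	"CheckHostIP=no",
}

// sshArgs builds an argument list: default options first, then any
// caller-supplied arguments, then the destination. Hypothetical helper.
func sshArgs(dest string, extra ...string) []string {
	var args []string
	for _, opt := range defaultSSHOptions {
		args = append(args, "-o", opt)
	}
	args = append(args, extra...)
	return append(args, dest)
}

func main() {
	// The user, host, and port are made-up example values.
	args := sshArgs("core@localhost", "-p", "2222")
	fmt.Println("ssh " + strings.Join(args, " "))
}
```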
Resolves: https://github.com/containers/podman/issues/23505
Signed-off-by: Nicola Sella <nsella@redhat.com>
When will I learn not to dismiss something as "easy"?
Anyhow, this doesn't actually change anything parallel-wise
but it does reduce a race condition seen on heavily-loaded
slow systems, wherein a container goes into unhealthy before
we want it to. This version isn't perfect; I don't think
there's an ideal fix for this.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Read stderr from ssh-keygen before calling Wait(), since cmd.Wait()
closes the pipe returned by cmd.StderrPipe() once the command exits,
causing a read-on-closed-pipe error.
Signed-off-by: Ashley Cui <acui@redhat.com>
Only one test can be parallelized. Do so, and add a comment
to the other one explaining why it can't be.
Also, add some missing error-message checks.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Very few changes needed, all of them simple.
It is impossible to parallelize this entire file, because "stop -a".
Add tags to tests that can be parallelized, and comments to those
that can't.
Signed-off-by: Ed Santiago <santiago@redhat.com>
When the client gets a 404 back we know the container does not exist.
If ignore is set as well, we should just ignore the error client side.
Seen in #23554
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When the cidfile does not exist and ignore is set the cli parser skips
the file without error and we call into the backend code without any
names at all. This should logically be a NOP but on remote it caused all
containers to be returned which caused podman stop to stop everything in
this case.
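The intended behaviour can be sketched in Go as follows. The
`selectContainers` function and its parameters are hypothetical, not the
remote backend's code; the sketch only shows that an empty name list
without an explicit "all" selects nothing.

```go
package main

import "fmt"

// selectContainers returns the containers to act on. With all set, every
// container is selected. With no names and all unset, nothing is
// selected, so the operation becomes a NOP. Hypothetical helper.
func selectContainers(existing, names []string, all bool) []string {
	if all {
		return append([]string(nil), existing...)
	}
	wanted := make(map[string]bool, len(names))
	for _, n := range names {
		wanted[n] = true
	}
	var selected []string
	for _, c := range existing {
		if wanted[c] {
			selected = append(selected, c)
		}
	}
	return selected
}

func main() {
	existing := []string{"web", "db", "cache"}
	fmt.Println("no names:", len(selectContainers(existing, nil, false)))
	fmt.Println("one name:", selectContainers(existing, []string{"db"}, false))
	fmt.Println("all:     ", selectContainers(existing, nil, true))
}
```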
Fixes #23554
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The previous comment included way too many details. It also referenced
a docker-hub container image which is not accessible under all
circumstances. Switch to the GitHub container registry and include
mention of the pre-commit hook that's available.
Signed-off-by: Chris Evich <cevich@redhat.com>
The main change is a global "packageRules" config that encompasses all
rules instead of configuring them as options to a manager.
Signed-off-by: Chris Evich <cevich@redhat.com>
In podman-systemd we are intersecting the worlds of containers
and systemd, and I had to stop and think to understand what
`Exec=` does.
I tried to clarify things more here.
I found it especially confusing because the example at the
very top of the file does:
```
Image=quay.io/fedora/fedora
Exec=sleep 10
```
But that only makes sense because the fedora base image
(being generic) doesn't define an `ENTRYPOINT`, just a `CMD`.
But IMO by far the most common usage for podman-systemd
is "app images" which conventionally should use `ENTRYPOINT`
in general. Maybe we should change the default example,
but I'm leaving that for a later followup.
(It perhaps would have been less confusing if this field
had been called `Args=` to make clear it's quite different
in practice from systemd `ExecStart=`)
Signed-off-by: Colin Walters <walters@verbum.org>
If we manage to init/start a container successfully we should unset any
previously stored state errors. Otherwise a user might be confused why
there is an error in the state about some old error even though the
container works/runs.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>