As outlined in #16076, a subsequent BARRIER *may* follow the READY
message sent by a container. To correctly imitate the behavior of
systemd's NOTIFY_SOCKET, the notify proxies span up by `kube play` must
hence process messages for the entirety of the workload.
We know that the workload is done and that all containers and pods have
exited when the service container exits. Hence, all proxies are closed
at that time.
The above changes imply that Podman runs for the entirety of the
workload and will henceforth act as the MAINPID when running inside of
systemd. Prior to this change, the service container acted as the
MAINPID which is now not possible anymore; Podman would be killed
immediately on exit of the service container and could not clean up.
The kube template now correctly transitions to in-active instead of
failed in systemd.
Fixes: #16076Fixes: #16515
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Move the handling of userns keys from ConvertContainer to a separate method
Adjust the method according to the different supported values
Use the new method in both ConvertContainer and ConvertKube
Pass isUser to ConvertKube as well
Add tests
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
Since https://github.com/containers/podman/pull/16394 was merged
we now always delete the cid file if --replace=true was specified,
so we can avoid this extra command being launched.
[NO NEW TESTS NEEDED] Already tested in above PR.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
Quadlet was doing some custom handling of uid/gid remapping, originating
from pre --userns=auto support, including its own user for getting subuids
which kinda conflicts with the "container" user used for that.
This drops all the old support for id remapping in favour of a new set
of keys that more directly map to the podman run options.
We have essentially 3 modes now:
```
RemapUsers=manual
RemapUid=0:10000:10
RemapUid=10:20000:10
RemapGid=0:10000:10
RemapGid=10:20000:10
```
This maps to --uidmap and --gidmap options.
```
RemapUsers=auto
```
This maps to --userns=auto. But you can additionally specify RemapUid,
RemapGid and RemapUidSize which gets applied as options to the
--userns podman option.
```
RemapUsers=keep-id
```
This maps to --userns=keep-id and only works for user units.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
Attempts to fix#16419
podman generate systemd --restart-sec pod
^now generates RestartSec= both in pod service file and in container service file.
podman generate systemd --restart-sec container
^now generates RestartSec= in container service file.
Signed-off-by: Veronika Fuxova <vfuxova@redhat.com>
This is much better for the systemd case becase we pass the journal
socket fds directly to the container. This means less copying of the
logs, but it also means the journal will correctly get the peer
process id when it tries to extract things like the name of what
is logging something.
With this we correctly name the logging process rather than claim
everything comes from conmon.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
This makees much more sense for typical service loads, and can
easily be reverted by `ReadOnly=no`.
Also updates and adds various tests for this.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
The notify proxy has a watcher to check whether the container has left
the running state. In that case, Podman should stop waiting for the
ready message to prevent a dead lock. Fix this watcher but adding a
loop.
Fixes the dead lock in #16076 surfacing in a timeout. The underlying
issue persists though. Also use a timer in the select statement to
prevent the goroutine from running unnecessarily long
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Truncate the container and pod ID files instead of throwing an error.
The main motivation is to prevent redundant work when starting systemd
units. Throwing an error when the file already exists is not preventing
races or file corruptions, so let's leave that to the user which in
almost all cases are generated (and tested) systemd units.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This way we don't have to use the `ExecCondition=podman volume exist`,
which saves one process start.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reduce the number of top-level packages in ./pkg by moving quadlet
packages under ./pkg/systemd.
[NO NEW TESTS NEEDED] - no functional change.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Starting listening for the READY messages on the sdnotify proxies before
starting the Pod. Otherwise, we may be missing messages.
[NO NEW TESTS NEEDED] as it's hard to test this very narrow race.
Related to but may not be fixing #16076.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Package `io/ioutil` was deprecated in golang 1.16, preventing podman from
building under Fedora 37. Fortunately, functionality identical
replacements are provided by the packages `io` and `os`. Replace all
usage of all `io/ioutil` symbols with appropriate substitutions
according to the golang docs.
Signed-off-by: Chris Evich <cevich@redhat.com>
The read deadline may yield the READY message to be lost in space.
Instead, use a more Go-idiomatic alternative by using two goroutines;
one reading from the connection, the other watching the container.
[NO NEW TESTS NEEDED] since existing tests are exercising this
functionality already.
Fixes: #15800
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This reverts commit c20abf12c714f359c7bbb291c444530f70cb1185. In the
absence of `ExecStop` step, systemd will send the stop/kill signals to
the main PID while I asummed that systemd would jump directly to an
ExecStopPost step instead.
Hence revert the commit to let Podman take care of stopping rather than
systemd.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Drop the ExecStop step to simplify the generated units a bit.
The extra ExecStopPost step was added by commit e5c343294424. If the
main PID (i.e., conmon) is killed, systemd will not execute ExecStop
(since the main PID is already down) but only execute the *Post steps.
Credits to the late Ulrich Obergfell for tracking this issue down; he is
missed.
The ExecStop step can safely be dropped since the Post step will take of
stopping (and removing) in any case.
Context: #15686
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Two PRs have been merged causing a failure in one unit test.
Fix the unit test to turn CI green again.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
When creating a new pod without the `--name` flag, e.g.:
`podman pod create foobar`
it will get the name `foobar` implicitly and this will be recorded as the in the
`podCreateArgs`. Unfortunately, the implicit name only works if it appears as
the **last** argument of the startup command.
With 6e2e3a78ed1d05ee5f23f65b814e8135021961dd we started appending the pod
security policy to the startCommand, resulting in the following `ExecStartPre=`
line:
```
/usr/bin/podman pod create --infra-conmon-pidfile %t/pod-foobar.pid --pod-id-file %t/pod-foobar.pod-id foobar --exit-policy=stop
```
This fails to launch, as the `pod create` command expects only a single
non-flag parameter, but it assumes that `exit-policy=stop` is a second and
terminates immediately instead.
This fixes https://github.com/containers/podman/issues/15592
Signed-off-by: Dan Čermák <dcermak@suse.com>
Change the dependencies from a pod unit to its associated container
units from `Requires` to `Wants` to prevent the entire pod from
transitioning to a failed state. Restart policies for individual
containers can be configured separately.
Also make sure that the pod's RunRoot is always set.
Fixes: #14546
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Emit a warning to the user when generating a unit with --new on a
container that was created with a custom --restart policy. As shown
in #15284, a custom --restart policy in that case can lead to issues
on system shutdown where systemd attempts to nuke the unit but Podman
keeps on restarting the container.
Fixes: #15284
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>