93 Commits

Author SHA1 Message Date
46d874aa52 Refactor graph traversal & use for pod stop
First, refactor our existing graph traversal code to improve code
sharing. There still isn't much sharing between inward traversal
(stop, remove) and outward traversal (start) but stop and remove
are sharing most of their code, which seems a positive.

Second, add a new graph-traversal function to stop containers.
We already had start and remove; stop uses the newly-refactored
inward-traversal code which it shares with removal.

Third, rework the shared stop/removal inward-traversal code to
add locking. This allows parallel execution of stop and removal,
which should improve the performance of `podman pod rm` and
retain the performance of `podman pod stop` at about what it is
right now.

Fourth and finally, use the new graph-based stop when possible
to solve unordered stop problems with pods - specifically, the
infra container stopping before application containers, leaving
those containers without a working network.

Fixes https://issues.redhat.com/browse/RHEL-76827

Signed-off-by: Matt Heon <mheon@redhat.com>
2025-02-06 18:28:12 -05:00
06fa617f61 Lock pod while starting and stopping containers
The intention behind this is to stop races between
`pod stop|start` and `container stop|start` being run at the same
time. This could result in containers with no working network
(they join the still-running infra container's netns, which is
then torn down as the infra container is stopped, leaving the
container in an otherwise unused, nonfunctional, orphan netns.

Locking the pod (if present) in the public container start and
stop APIs should be sufficient to stop this.

Signed-off-by: Matt Heon <mheon@redhat.com>
2025-02-03 11:19:20 -05:00
6565bde6e8 Add --no-hostname option
Fixes: https://github.com/containers/podman/issues/25002

Also add the ability to inspect containers for
UseImageHosts and UseImageHostname.

Finally fixed some bugs in handling of --no-hosts for Pods,
which I descovered.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-15 06:51:32 -05:00
4f7395f93a Add --hosts-file flag to container and pod commands
* Add --hosts-file flag to container create, container run and pod create
* Add HostsFile field to pod inspect and container inspect results
* Test BaseHostsFile config in containers.conf

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
2024-11-24 22:00:34 -05:00
a89fef6e2a cleanup: add new --stopped-only option
The podman container cleanup process runs asynchronous and by the time
it gets the lock it is possible another podman process already did the
cleanup and then did a new init() to start it again. If the cleanup
process gets the lock there it will cause very weird things.

This can be observed in the remote start API as CI flakes.

Fixes #23754

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-27 15:01:23 +02:00
4aaa5cb6f0 stopIfOnlyInfraRemains: log all errors
Log all stopping errors for each container so we actually see the real
cause.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 12:11:26 +02:00
4c3531a1a4 fix network cleanup flake in play kube
When using service containers and play kube we create a complicated set
of dependencies.

First in a pod all conmon/container cgroups are part of one slice, that
slice will be removed when the entire pod is stopped resulting in
systemd killing all processes that were part in it.

Now the issue here is around the working of stopPodIfNeeded() and
stopIfOnlyInfraRemains(), once a container is cleaned up it will check
if the pod should be stopped depending on the pod ExitPolicy. If this is
the case it wil stop all containers in that pod. However in our flaky
test we calle podman pod kill which logically killed all containers
already. Thus the logic now thinks on cleanup it must stop the pod and
calls into pod.stopWithTimeout(). Then there we try to stop but because
all containers are already stopped it just throws errors and never gets
to the point were it would call Cleanup(). So the code does not do
cleanup and eventually calls removePodCgroup() which will cause all
conmon and other podman cleanup processes of this pod to be killed.

Thus the podman container cleanup process was likely killed while
actually trying to the the proper cleanup which leaves us in a bad
state.

Following commands such as podman pod rm will try to the cleanup again
as they see it was not completed but then fail as they are unable to
recover from the partial cleanup state.

Long term network cleanup needs to be more robust and ideally should be
idempotent to handle cases were cleanup was killed in the middle.

Fixes #21569

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-31 16:59:43 +02:00
72f1617fac Bump Go module to v5
Moving from Go module v4 to v5 prepares us for public releases.

Move done using gomove [1] as with the v3 and v4 moves.

[1] https://github.com/KSubedi/gomove

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-08 09:35:39 -05:00
2a2d0b0e18 chore: delete obsolete // +build lines
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-04 11:53:38 +02:00
bad25da92e libpod: add !remote tag
This should never be pulled into the remote client.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-24 12:11:34 +02:00
627ac1c96b libpod: destroy pod cgroup on pod stop
When the pod is stopped, we need to destroy the pod cgroup, otherwise
it is leaked.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-09-08 14:58:48 +02:00
3cae574ab2 Merge pull request #18507 from mheon/fix_rm_depends
Fix `podman rm -fa` with dependencies
2023-06-12 13:27:34 -04:00
3b39eb1333 Include lock number in pod/container/volume inspect
Being able to easily identify what lock has been allocated to a
given Libpod object is only somewhat useful for debugging lock
issues, but it's trivial to expose and I don't see any harm in
doing so.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-06-05 12:28:50 -04:00
2c9f18182a The removeContainer function now accepts a struct
We had something like 6 different boolean options (removing a
container turns out to be rather complicated, because there are a
million-odd things that want to do it), and the function
signature was getting unreasonably large. Change to a struct to
clean things up.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 16:27:27 -04:00
bc1a31ce6d Make RemoveContainer return containers and pods removed
This allows for accurate reporting of dependency removal, but the
work is still incomplete: pods can be removed, but do not report
the containers they removed as part of said removal. Will add
this in a subsequent commit.

Major note: I made ignoring no-such-container errors automatic
once it has been determined that a container did exist in the
first place. I can't think of any case where this would not be a
TOCTOU - IE, no reason not to ignore them. The `--ignore` option
to `podman rm` should still retain meaning as it will ignore
errors from containers that didn't exist in the first place.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 16:24:56 -04:00
e8d7456278 Add an API for removing a container and dependencies
This is the initial stage of implementation. The current API
functions but does not report the additional containers and pods
removed. This is necessary to properly display results to the
user after `podman rm --all`.

The existing remove-dependencies code has been removed in favor
of this more native solution.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 15:32:50 -04:00
edbeee5238 Add --restart flag to pod create
Add --restart flag to pod create to allow users to set the
restart policy for the pod, which applies to all the containers
in the pod. This reuses the restart policy already there for
containers and has the same restart policy options.
Add "never" to the restart policy options to match k8s syntax.
It is a synonym for "no" and does the exact same thing where the
containers are not restarted once exited.
Only the containers that have exited will be restarted based on the
restart policy, running containers will not be restarted when an exited
container is restarted in the same pod (same as is done in k8s).

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2023-05-02 10:29:58 -04:00
536d3b87f0 Merge pull request #16818 from SoMuchForSubtlety/api-port-bindings
api: remove unmapped ports from PortBindings
2022-12-15 20:19:53 -05:00
97f63da67d remove unmapped ports from inspect port bindings
Signed-off-by: Jakob Ahrer <jakob@ahrer.dev>
2022-12-15 23:18:50 +01:00
4a5581ce0d stop reporting errors removing containers that don't exist
Init containers are removed once they exit, but podman
reports and error that the container does not exist, when
it was previously removed.  Stop reporting missing containers
when removing.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-12-14 14:09:56 -05:00
f355900d34 Fix deadlock between 'podman ps' and 'container inspect' commands
Fixes: #16326

[NO NEW TESTS NEEDED]

Signed-off-by: Mikhail Khachayants <tyler92@inbox.ru>
2022-10-28 10:12:34 +03:00
e19e0de5fa Introduce graph-based pod container removal
Originally, during pod removal, we locked every container in the
pod at once, did a number of validity checks to ensure everything
was safe, and then removed all the containers in the pod.

A deadlock was recently discovered with this approach. In brief,
we cannot lock the entire pod (or much more than a single
container at a time) without causing a deadlock. As such, we
converted to an approach where we just looped over each container
in the pod, removing them individually. Unfortunately, this
removed a lot of the validity checking of the earlier approach,
allowing for a lot of unintended bad things. Infra containers
could be removed while containers in the pod still depended on
them, for example.

There's no easy way to do validity checks while in a simple loop,
so I implemented a version of our graph-traversal logic that
currently handles pod start. This version acts in the reverse
order of startup: startup starts from containers which depend on
nothing and moves outwards, while removal acts on containers which
have nothing depend on them and moves inwards. By doing graph
traversal, we can guarantee that nothing is removed while
something that depends on it still exists - so the infra
container should be the last thing in a pod that is removed, for
example.

In the (unlikely) case that a graph of the pod's containers
cannot be built (most likely impossible without database editing)
the old method of pod removal has been retained to ensure that
even misbehaving pods can be forcibly evicted from the state.

I'm fairly confident that this resolves the problem, but there
are a lot of assumptions around dependency structure built into
the original pod removal code and I am not 100% sure I have
captured all of them.

Fixes #15526

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-14 13:44:48 -04:00
2c63b8439b Fix stutters
Podman adds an Error: to every error message.  So starting an error
message with "error" ends up being reported to the user as

Error: error ...

This patch removes the stutter.

Also ioutil.ReadFile errors report the Path, so wrapping the err message
with the path causes a stutter.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-09-10 07:52:00 -04:00
c00ea686fe resource limits for pods
added the following flags and handling for podman pod create

--memory-swap
--cpuset-mems
--device-read-bps
--device-write-bps
--blkio-weight
--blkio-weight-device
--cpu-shares

given the new backend for systemd in c/common, all of these can now be exposed to pod create.
most of the heavy lifting (nearly all) is done within c/common. However, some rewiring needed to be done here
as well!

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2022-07-21 14:50:01 -04:00
ca5bebb082 Merge pull request #14501 from cdoern/podUTS
podman pod create --uts support
2022-07-06 14:51:22 +00:00
251d91699d libpod: switch to golang native error wrapping
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.

[NO NEW TESTS NEEDED]

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-07-05 16:06:32 +02:00
8f2d9e7a7c podman pod create --uts support
add support for the --uts flag in pod create, allowing users to avoid
issues with default values in containers.conf.

uts follows the same format as other namespace flags:
--uts=private (default), --uts=host, --uts=ns:PATH

resolves #13714

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2022-07-05 09:28:07 -04:00
b92149e2a8 podman pod create --memory
using the new resource backend, implement podman pod create --memory which enables
users to modify memory.max inside of the parent cgroup (the pod), implicitly impacting all
children unless overriden

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2022-07-01 13:44:32 -04:00
b13fc1bf98 patch for pod host networking & other host namespace handling
this patch included additonal host namespace checks when creating a ctr as well
as fixing of the tests to check /proc/self/ns/net

see #14461

Signed-off-by: cdoern <cdoern@redhat.com>
2022-06-09 10:30:48 -04:00
badf76e172 Remove more FIXMEs
Mostly, just removing the comments. These either have been done,
or are no longer a good idea.

No code changes. [NO NEW TESTS NEEDED] as such.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-05-25 14:10:02 -04:00
840c120c21 play kube: service container
Add the notion of a "service container" to play kube.  A service
container is started before the pods in play kube and is (reverse)
linked to them.  The service container is stopped/removed *after*
all pods it is associated with are stopped/removed.

In other words, a service container tracks the entire life cycle
of a service started via `podman play kube`.  This is required to
enable `play kube` in a systemd unit file.

The service container is only used when the `--service-container`
flag is set on the CLI.  This flag has been marked as hidden as it
is not meant to be used outside the context of `play kube`.  It is
further not supported on the remote client.

The wiring with systemd will be done in a later commit.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-05-12 10:51:13 +02:00
4eff0c8cf2 pod: add exit policies
Add the notion of an "exit policy" to a pod.  This policy controls the
behaviour when the last container of pod exits.  Initially, there are
two policies:

 - "continue" : the pod continues running. This is the default policy
                when creating a pod.

 - "stop" : stop the pod when the last container exits. This is the
            default behaviour for `play kube`.

In order to implement the deferred stop of a pod, add a worker queue to
the libpod runtime.  The queue will pick up work items and in this case
helps resolve dead locks that would otherwise occur if we attempted to
stop a pod during container cleanup.

Note that the default restart policy of `play kube` is "Always".  Hence,
in order to really solve #13464, the YAML files must set a custom
restart policy; the tests use "OnFailure".

Fixes: #13464
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-05-02 13:29:59 +02:00
7f28fd9386 Report properly whether pod shares host network
Fixes: https://github.com/containers/podman/issues/14028

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-04-28 10:27:21 -04:00
7a53428049 fix pod volume passing and alter infra inheritance
the infra Inherit function was not properly passing pod volume information to new containers
alter the inherit function and struct to use the new `ConfigToSpec` function used in clone
pick and choose the proper entities from a temp spec and validate them on the spegen side rather
than passing directly to a config

resolves #13548

Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
Signed-off-by: cdoern <cdoern@redhat.com>
Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
2022-03-29 11:10:46 -04:00
bd09b7aa79 bump go module to version 4
Automated for .go files via gomove [1]:
`gomove github.com/containers/podman/v3 github.com/containers/podman/v4`

Remaining files via vgrep [2]:
`vgrep github.com/containers/podman/v3`

[1] https://github.com/KSubedi/gomove
[2] https://github.com/vrothberg/vgrep

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2022-01-18 12:47:07 +01:00
289270375a Pod Security Option support
Added support for pod security options. These are applied to infra and passed down to the
containers as added (unless overridden).

Modified the inheritance process from infra, creating a new function Inherit() which reads the config, and marshals the compatible options into an intermediate struct `InfraInherit`
This is then unmarshaled into a container config and all of this is added to the CtrCreateOptions. Removes the need (mostly) for special additons which complicate the Container_create
code and pod creation.

resolves #12173

Signed-off-by: cdoern <cdoern@redhat.com>
2021-12-27 13:39:36 -05:00
9ce6b64133 network db: add new strucutre to container create
Make sure we create new containers in the db with the correct structure.
Also remove some unneeded code for alias handling. We no longer need this
functions.

The specgen format has not been changed for now.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-12-14 15:23:39 +01:00
2130d18539 Update vendor or containers/common moving pkg/cgroups there
[NO NEW TESTS NEEDED] This is just moving pkg/cgroups out so
existing tests should be fine.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-12-07 06:17:11 -05:00
21c9dc3c40 Add --time out for podman * rm -f commands
Add --time flag to podman container rm
Add --time flag to podman pod rm
Add --time flag to podman volume rm
Add --time flag to podman network rm

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-10-04 07:07:56 -04:00
6da97c8631 Pod Volumes From Support
added support for a volumes from container. this flag just required movement of the volumes-from flag declaration
out of the !IsInfra block, and minor modificaions to container_create.go

Signed-off-by: cdoern <cdoern@redhat.com>
2021-10-01 14:09:11 -04:00
81aabc8054 Merge pull request #11686 from cdoern/podDeviceOptions
Pod Device-Read-BPS support
2021-10-01 10:53:14 -04:00
2d86051893 Pod Device-Read-BPS support
added the option for the user to specify a rate, in bytes, at which they would like to be able
to read from the device being added to the pod. This is the first in a line of pod device options.

WARNING: changed pod name json tag to pod_name to avoid confusion when marshaling with the containerspec's name

Signed-off-by: cdoern <cdoern@redhat.com>
2021-09-28 21:20:01 -04:00
5d6ea90e75 libpod: do not call (*container).Config()
Access the container's config field directly inside of libpod instead of
calling `Config()` which in turn creates expensive JSON deep copies.

Accessing the field directly drops memory consumption of a simple
`podman run --rm busybox true` from 1245kB to 410kB.

[NO TESTS NEEDED]

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-09-28 17:18:02 +02:00
1c4e6d8624 standardize logrus messages to upper case
Remove ERROR: Error stutter from logrus messages also.

[ NO TESTS NEEDED] This is just code cleanup.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-09-22 15:29:34 -04:00
8fac34b8ff Pod Device Support
added support for pod devices. The device gets added to the infra container and
recreated in all containers that join the pod.

This required a new container config item to keep track of the original device passed in by the user before
the path was parsed into the container device.

Signed-off-by: cdoern <cdoern@redhat.com>
2021-09-20 23:22:43 -04:00
84005330aa Pod Volumes Support
added support for the --volume flag in pods using the new infra container design.
users can specify all volume options they can with regular containers

resolves #10379

Signed-off-by: cdoern <cdoern@redhat.com>
2021-09-14 08:32:07 -04:00
d28e85741f InfraContainer Rework
InfraContainer should go through the same creation process as regular containers. This change was from the cmd level
down, involving new container CLI opts and specgen creating functions. What now happens is that both container and pod
cli options are populated in cmd and used to create a podSpecgen and a containerSpecgen. The process then goes as follows

FillOutSpecGen (infra) -> MapSpec (podOpts -> infraOpts) -> PodCreate -> MakePod -> createPodOptions -> NewPod -> CompleteSpec (infra) -> MakeContainer -> NewContainer -> newContainer -> AddInfra (to pod state)

Signed-off-by: cdoern <cdoern@redhat.com>
2021-08-26 16:05:16 -04:00
4b2dc48d0b podman inspect show exposed ports
Podman inspect has to show exposed ports to match docker. This requires
storing the exposed ports in the container config.
A exposed port is shown as `"80/tcp": null` while a forwarded port is
shown as `"80/tcp": [{"HostIp": "", "HostPort": "8080" }]`.

Also make sure to add the exposed ports to the new image when the
container is commited.

Fixes #10777

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-24 15:44:26 +02:00
bef26f2582 rename oneshot initcontainers to once
after the init containers pr merged, it was suggested to use `once`
instead of `oneshot` containers as it is more aligned with other
terminiology used similarily.

[NO TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2021-08-12 12:57:15 -05:00
221b1add74 Add support for pod inside of user namespace.
Add the --userns flag to podman pod create and keep
track of the userns setting that pod was created with
so that all containers created within the pod will inherit
that userns setting.

Specifically we need to be able to launch a pod with
--userns=keep-id

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2021-08-09 15:17:22 -04:00