The scenario for inducing this is as follows:
1. Start a container with a long stop timeout and a PID1 that
ignores SIGTERM
2. Use `podman stop` to stop that container
3. Simultaneously, in another terminal, kill -9 `pidof podman`
(the container is now in ContainerStateStopping)
4. Now kill that container's Conmon with SIGKILL.
5. No commands are able to move the container from Stopping to
Stopped now.
The cause is a bug in our exit-file handling logic: Conmon being dead
without having written an exit file caused no change to the state.
Add handling for this case that tries to clean up, including
stopping the container if it still seems to be running.
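A rough, self-contained sketch of that handling (function names, the exit-file path, and the use of SIGKILL are illustrative assumptions, not the actual libpod change):

```go
// Sketch only: decide how to move a container out of the Stopping state
// when conmon died without writing an exit file.
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// processAlive uses kill(pid, 0), which performs error checking only and
// delivers no signal; ESRCH means the process is gone.
func processAlive(pid int) bool {
	err := syscall.Kill(pid, 0)
	return err == nil || errors.Is(err, syscall.EPERM)
}

func reconcileStopping(conmonPID, ctrPID int, exitFile string) (string, error) {
	if processAlive(conmonPID) {
		return "stopping", nil // conmon will still write the exit file
	}
	if _, err := os.Stat(exitFile); err == nil {
		return "stopped", nil // exit file present, normal cleanup applies
	}
	// Conmon is dead and never wrote an exit file: stop the container
	// ourselves if it is still running, then mark it stopped.
	if ctrPID > 0 && processAlive(ctrPID) {
		if err := syscall.Kill(ctrPID, syscall.SIGKILL); err != nil && !errors.Is(err, syscall.ESRCH) {
			return "", fmt.Errorf("killing container process %d: %w", ctrPID, err)
		}
	}
	return "stopped", nil
}

func main() {
	state, err := reconcileStopping(12345, 12346, "/run/libpod/exits/deadbeef")
	fmt.Println(state, err)
}
```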
Fixes #19629
Addresses: https://issues.redhat.com/browse/ACCELFIX-250
Signed-off-by: Matt Heon <mheon@redhat.com>
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
Cherry pick from #20329
Addresses: https://issues.redhat.com/browse/RHEL-14744 and
https://issues.redhat.com/browse/RHEL-14743
When containers are created with named volumes, creation can deadlock
because the create logic locks all volumes in a loop. This is fine if
only a single container is ever created at a time. However, because
multiple containers can be created at the same time, they can deadlock
on the volumes. This is because the order of the loop is not stable; it
is based on the order in which the volumes were specified on the CLI.
So if you create two containers at the same time, one with
`-v vol1:/dir1 -v vol2:/dir2` and the other with
`-v vol2:/dir2 -v vol1:/dir1`, then there is a chance of a deadlock.
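For illustration only (not podman code), here is how the unstable lock order bites and how a deterministic, sorted acquisition order would avoid it; the commit argues below that the lock is not needed at all:

```go
// Two goroutines mimic two `podman create` calls that lock the same two
// volumes in opposite (CLI) order. Locking in sorted-name order instead
// makes the deadlock impossible.
package main

import (
	"sort"
	"sync"
)

var volumeLocks = map[string]*sync.Mutex{
	"vol1": {},
	"vol2": {},
}

// lockVolumes acquires the locks in sorted-name order so two concurrent
// callers can never each hold one lock while waiting for the other.
func lockVolumes(names []string) (unlock func()) {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	for _, n := range sorted {
		volumeLocks[n].Lock()
	}
	return func() {
		for i := len(sorted) - 1; i >= 0; i-- {
			volumeLocks[sorted[i]].Unlock()
		}
	}
}

func main() {
	var wg sync.WaitGroup
	for _, cliOrder := range [][]string{{"vol1", "vol2"}, {"vol2", "vol1"}} {
		wg.Add(1)
		go func(names []string) {
			defer wg.Done()
			unlock := lockVolumes(names) // safe regardless of CLI order
			defer unlock()
		}(cliOrder)
	}
	wg.Wait()
}
```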
Now, one solution could be to lock the volumes in a stable order to
prevent the issue, but the reason for holding the lock is dubious in the
first place. The goal was to prevent the volume from being removed in
the meantime. However, that could still have happened before we acquired
the lock, so the lock did not protect against it.
Both boltdb and sqlite already prevent us from adding a container with
volumes that do not exist, due to their internal consistency checks.
Sqlite even uses FOREIGN KEY relationships, so the schema will prevent us
from doing anything wrong.
The create code currently first checks if the volume exists and, if not,
creates it. I have checked that the db guarantees that adding a
container with a missing volume will not work:
Boltdb: `no volume with name test2 found in database when adding container xxx: no such volume`
Sqlite: `adding container volume test2 to database: FOREIGN KEY constraint failed`
Keep in mind that this error is normally not seen; only if the volume is
removed between the volume-exists check and adding the container to the
db will this message be seen, which is an acceptable race and a
pre-existing condition anyway.
[NO NEW TESTS NEEDED] Race condition, hard to test in CI.
Fixes #20313
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
The update to runc broke creation of devices for containers in
the pod cgroup. We don't support the device cgroup for pods at
present, so just disable it for now, resolving the issue.
Thanks to Giuseppe for finding the fix.
[NO NEW TESTS NEEDED] fixes a test break
Signed-off-by: Matt Heon <mheon@redhat.com>
Particularly, fix an issue with SELinux where image volumes would
not mount properly.
Based (loosely) on 2ec11b16abe26297331333ceb9aea3b908dba7b0 which
intermingled these changes with other fixes not relevant to this
backport.
Backported to v4.4.1-rhel per RHBZ 2213843
[NO NEW TESTS NEEDED]
Signed-off-by: Matt Heon <mheon@redhat.com>
When a userns is set we set up the network after the bind mounts; at the
point where resolv.conf is generated we do not yet know the subnet.
Just like the other DNS servers for bridge networks, we need to add the
IP later in completeNetworkSetup().
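A rough sketch of the idea (hypothetical helper name and paths, not the actual completeNetworkSetup() code): once the network backend has allocated the subnet, append the network's DNS server IP to the resolv.conf generated earlier.

```go
// appendNameserver adds a "nameserver <ip>" line to an existing
// resolv.conf; in the real code this would run after network setup,
// when the subnet (and thus the DNS IP) is finally known.
package main

import (
	"fmt"
	"os"
)

func appendNameserver(resolvPath, ip string) error {
	f, err := os.OpenFile(resolvPath, os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintf(f, "nameserver %s\n", ip)
	return err
}

func main() {
	if err := appendNameserver("/tmp/resolv.conf", "10.89.0.1"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```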
Addresses: https://bugzilla.redhat.com/show_bug.cgi?id=2182492 and
https://bugzilla.redhat.com/show_bug.cgi?id=2182491
This is targeted to RHEL 8.8 and 9.2 ZeroDay
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
As described in #17777, the `restart` on-failure action did not behave
correctly when the health check is being run by a transient systemd
unit. It ran just fine when being executed outside such a unit, for
instance, manually or, as done in the system tests, in a scripted
fashion.
There were two issues causing the `restart` on-failure action to
misbehave:
1) The transient systemd units used the default `KillMode=cgroup` which
will nuke all processes in the specific cgroup including the recently
restarted container/conmon once the main `podman healthcheck run`
process exits.
2) Podman attempted to remove the transient systemd unit and timer
during restart. That is perfectly fine when manually restarting the
container but not when the restart itself is being executed inside
such a transient unit. Ultimately, Podman tried to shoot itself in
the foot.
Fix both issues by moving the restart logic into the cleanup process.
Instead of restarting the container, `healthcheck run` will just
stop the container, and the cleanup process will restart the container
once it has turned unhealthy.
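A minimal, self-contained illustration of that split (all names are made up, not the actual libpod code):

```go
// Inside `podman healthcheck run` we only stop an unhealthy container;
// restarting it there would tear down the transient unit's own cgroup
// (issue 1) and remove the unit currently executing us (issue 2), so the
// restart is deferred to the cleanup process running outside the unit.
package main

import "fmt"

type container struct{ state string }

func (c *container) stop()    { c.state = "stopped" }
func (c *container) restart() { c.state = "running" }

// handleUnhealthy runs inside the transient healthcheck unit.
func handleUnhealthy(c *container, onFailure string) {
	if onFailure == "restart" {
		c.stop() // do NOT restart from inside the transient unit
	}
}

// cleanup runs later, outside the transient unit, and does the restart.
func cleanup(c *container, onFailure string) {
	if onFailure == "restart" && c.state == "stopped" {
		c.restart()
	}
}

func main() {
	c := &container{state: "running"}
	handleUnhealthy(c, "restart")
	cleanup(c, "restart")
	fmt.Println(c.state) // running again, restarted by cleanup
}
```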
Backport of commit 95634154303f5b8c3d5c92820e2a3545c54f0bc8.
Fixes: #17777
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2180125
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2180126
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
* Utils must support a higher-level API to create a tar while chrooted
into a directory (see the sketch below).
* Volume export: use TarWithChroot instead of Tar so we can make sure no
symlink can be exported by tar if it exists outside of the source
directory.
* Container export: use chroot and Tar instead of Tar so we can make sure no
symlink can be exported by tar if it exists outside of the mountPoint.
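A hedged, simplified sketch of the chroot-then-tar idea (not the actual podman/containers-storage implementation; the real code re-execs a helper so the main process keeps its original root):

```go
// tarChrooted chroots into srcDir before archiving it, so a symlink
// pointing outside the directory cannot resolve to a host path.
// Requires root; handles only regular files, dirs, and symlinks.
package main

import (
	"archive/tar"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"syscall"
)

func tarChrooted(srcDir string, out io.Writer) error {
	if err := syscall.Chroot(srcDir); err != nil {
		return err
	}
	if err := os.Chdir("/"); err != nil {
		return err
	}
	tw := tar.NewWriter(out)
	defer tw.Close()
	return filepath.WalkDir("/", func(path string, d fs.DirEntry, err error) error {
		if err != nil || path == "/" {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		link := ""
		if info.Mode()&os.ModeSymlink != 0 {
			if link, err = os.Readlink(path); err != nil {
				return err
			}
		}
		hdr, err := tar.FileInfoHeader(info, link)
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(path[1:]) // strip leading "/"
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if info.Mode().IsRegular() {
			f, err := os.Open(path)
			if err != nil {
				return err
			}
			_, err = io.Copy(tw, f)
			f.Close()
			return err
		}
		return nil
	})
}

func main() {
	if len(os.Args) != 2 {
		os.Exit(2)
	}
	if err := tarChrooted(os.Args[1], os.Stdout); err != nil {
		os.Exit(1)
	}
}
```

Because the walk happens after the chroot, a symlink pointing at a host path such as /etc/passwd can only resolve inside the exported directory (or dangle) rather than escape it.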
[NO NEW TESTS NEEDED]
Race needs combination of external/in-container mechanism which is hard to repro in CI.
CVE: https://access.redhat.com/security/cve/CVE-2023-0778
Signed-off-by: Aditya R <arajan@redhat.com>
MH: Cherry-pick to v4.4.1-rhel per RHBZ 2169618
Signed-off-by: Matt Heon <mheon@redhat.com>
If the image being used has a user set that is an integer greater
than 0, then set the securityContext.runAsNonRoot to true for the
container in the generated kube yaml.
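A small sketch of the check this implies (illustrative, not the exact kube-generate code):

```go
// runAsNonRoot reports whether the image's configured user is a numeric
// UID greater than zero; only then can we assert non-root without
// resolving user names inside the image.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

func runAsNonRoot(imageUser string) bool {
	// Image user may be "uid", "uid:gid", or a name.
	uidStr := strings.SplitN(imageUser, ":", 2)[0]
	uid, err := strconv.Atoi(uidStr)
	return err == nil && uid > 0
}

func main() {
	fmt.Println(runAsNonRoot("1000:1000"), runAsNonRoot("root"), runAsNonRoot("0"))
}
```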
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
In the super rare case that there are two containers with the same ID
for two different users, podman logs with the journald driver would show
logs from both containers.
[NO NEW TESTS NEEDED] Impossible to reproduce.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
I noticed this while running some things in parallel: podman events
would show events from other users. Because all events are written to
the journal, everybody can see them. So when we read the journal we must
filter events for only the current UID.
To reproduce, run `podman events` as a user, then in another window create a
container as root, for example. After this patch it will correctly ignore
these events from other users.
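A hedged sketch of the filtering idea using the go-systemd journal bindings (the wrapper is illustrative, not the actual events backend):

```go
// openEventsJournal opens the journal and restricts reads to entries
// written by processes of the current UID, so another user's podman
// events are never returned.
package main

import (
	"fmt"
	"os"

	"github.com/coreos/go-systemd/v22/sdjournal"
)

func openEventsJournal() (*sdjournal.Journal, error) {
	j, err := sdjournal.NewJournal()
	if err != nil {
		return nil, err
	}
	// _UID is set by journald on every entry; match only our own.
	if err := j.AddMatch(fmt.Sprintf("_UID=%d", os.Getuid())); err != nil {
		j.Close()
		return nil, err
	}
	return j, nil
}

func main() {
	j, err := openEventsJournal()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer j.Close()
}
```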
[NO NEW TESTS NEEDED] I don't think we can test with two users at the same
time.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Loading container states speeds things up when listing all containers,
but it comes with a price tag for many other call paths. Hence, make
loading the state conditional to allow for keeping `podman ps` fast
without other commands regressing in performance.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Do not sync containers with the runtime and the database when listing
containers. It turns out to be extremely expensive and unnecessary.
The sync was needed since listing all containers from the database did
not populate their state. Doing that, however, is much faster since we
already have a connection to the database.
This change makes listing 200 containers 2 times faster than before.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Also do not return (and immediately suppress) an error if no health
check is defined for a given container.
Makes listing 100 containers around 10 percent faster.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
After https://github.com/containers/netavark/pull/452, `netavark` is
in charge of deciding `custom_dns_servers`, if any, so let's honor that;
libpod should not set these manually.
This also ensures docker parity.
Podman populates the container's `/etc/resolv.conf` with custom DNS servers (specified via `--dns` or `dns_server` in containers.conf)
even when the container is connected to a network where `dns_enabled` is `true`.
The current behavior does not match docker, hence the following commit ensures that podman only populates custom DNS servers when the container is not connected to any network where DNS is enabled; for the cases where `dns_enabled` is `true`,
resolution of the custom DNS servers will happen via `aardvark-dns` or `dnsname`.
Reference: https://docs.docker.com/config/containers/container-networking/#dns-services
Closes: containers#16172
Signed-off-by: Aditya R <arajan@redhat.com>
Aardvark-dns and netavark now accept custom DNS servers for containers
via the new config field `dns_servers`. The new field allows containers to use
custom resolvers instead of the host's default resolvers.
The following commit instruments libpod to pass these custom DNS servers, set
via `--dns` or the central config, to the network stack.
Depends-on:
* Common: containers/common#1189
* Netavark: containers/netavark#452
* Aardvark-dns: containers/aardvark-dns#240
Signed-off-by: Aditya R <arajan@redhat.com>
Kill is a fast syscall, so we can reduce the sleep time from 100ms to
10ms in the hope of speeding things up a bit.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Commit 067442b5701f improved stopping/killing a container by detecting
whether the cleanup process has already fired and changed the state of
the container. Further improve on that by returning early instead of
trying to wait for the PID to finish. At that point we know that the
container has exited but the previous PID may have been recycled
already by the kernel.
[NO NEW TESTS NEEDED] - the absence of the two flaking tests recorded
in #17142 will tell.
Fixes: #17142
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Add a comment when SIGKILL is being used. It may help future readers
better comprehend what's going on and why.
[NO NEW TESTS NEEDED] - cannot test a comment :^)
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Move the stopSignal decl into the branch where it's actually used.
[NO NEW TESTS NEEDED] as it's just a small refactor.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The code can be simplified by using a timer directly.
[NO NEW TESTS NEEDED] - should not change behavior.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Reserved annotations are used internally by Podman and would affect
nothing when run with Kubernetes, so we should not be generating these
annotations.
Fixes: https://github.com/containers/podman/issues/17105
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The container lock is released before stopping/killing which implies
certain race conditions with, for instance, the cleanup process changing
the container state to stopped, exited or other states.
The (remaining) flakes seen in #16142 and #15367 strongly indicate a
race between stopping/killing a container and the cleanup
process. To fix the flake, make sure to ignore invalid-state errors.
An alternative fix would be to change `KillContainer` to not return such
errors at all, but commit c77691f06f61 indicates an explicit desire to
have these errors reported in the sig proxy.
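A tiny, self-contained illustration of the chosen fix (the error value is a stand-in, not libpod's actual define errors):

```go
// forwardSignal treats "container is in the wrong state" errors as
// non-fatal: the cleanup process may have stopped or removed the
// container between our state check and the kill, which is not a
// failure of signal proxying.
package main

import (
	"errors"
	"fmt"
)

var errCtrStateInvalid = errors.New("container state improper")

func forwardSignal(kill func() error) error {
	if err := kill(); err != nil {
		if errors.Is(err, errCtrStateInvalid) {
			return nil // raced with cleanup, ignore
		}
		return err
	}
	return nil
}

func main() {
	fmt.Println(forwardSignal(func() error { return errCtrStateInvalid }))
}
```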
[NO NEW TESTS NEEDED] as it's a race already covered by the system
tests.
Fixes: #16142
Fixes: #15367
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Every time I look at a container-removal issue I wonder why the
container isn't locked directly here, so let's add a comment.
I am not sure whether it would be better if callers took care of
locking, but for now the comment will save the future me and probably
other readers some time.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This is a cleaner solution and guarantees the variables
will not be used before they are initialized.
[NO NEW TESTS NEEDED]
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The StoppedByUser variable indicates that the container was
requested to stop by a user. It's used to prevent restart policy
from firing (so that a restart=always container won't restart if
the user does a `podman stop`). The problem is we were setting it
*very* late in the stop() function. Originally, this was fine,
but after the changes to add the new Stopping state, the logic
that triggered restart policy was firing before StoppedByUser was
even set - so the container would still restart.
Setting it earlier shouldn't hurt anything and guarantees that
checks will see that the container was stopped manually.
Fixes #17069
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
While manually playing with --service-container, I encountered a number
of too verbose logs. For instance, there's no need to error-log when
the service-container has already been stopped.
For testing, add a new kube test with a multi-pod YAML which will
implicitly show that #17024 is now working.
Fixes: #17024
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Every podman command is paying the price for this compile even when it
doesn't use the regex; this will speed up podman startup a little.
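An illustrative sketch of the lazy-compilation pattern (made-up names, not the actual podman variables):

```go
// Compile the regex on first use instead of at package init, so podman
// invocations that never validate a name do not pay for the compile.
package main

import (
	"fmt"
	"regexp"
	"sync"
)

var (
	nameRegexOnce sync.Once
	nameRegex     *regexp.Regexp
)

func validName(s string) bool {
	nameRegexOnce.Do(func() {
		nameRegex = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9_.-]*$`)
	})
	return nameRegex.MatchString(s)
}

func main() {
	fmt.Println(validName("my-container"), validName("-bad"))
}
```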
[NO NEW TESTS NEEDED] Existing tests should catch issues.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Follow-up to 6886e80b45caae27dda81a9b44d8dd179c414580:
when "podman rm -f" is used on a container in "stopping" state, also
make sure it is terminated before removing it from the local storage.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
check that the container has a valid pid before attempting to use
kill($PID, 0) on it. If the PID==0, it means the container is already
stopped.
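A minimal sketch of that guard (illustrative, not the actual libpod code):

```go
// containerRunning: a recorded PID of 0 means the container already
// exited; probing with kill(0, 0) would target the caller's own process
// group and report success, making a dead container look alive.
package main

import (
	"errors"
	"fmt"
	"syscall"
)

func containerRunning(pid int) bool {
	if pid <= 0 {
		return false
	}
	// Signal 0 performs existence/permission checks only.
	err := syscall.Kill(pid, 0)
	return err == nil || errors.Is(err, syscall.EPERM)
}

func main() {
	fmt.Println(containerRunning(0), containerRunning(1))
}
```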
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Do not allow removing the service container unless all associated
pods have been removed. Previously, the service container could be
removed when all pods had exited, which can lead to a number of issues.
Now, the service container is treated like an infra container and can
only be removed along with the pods.
Also make sure that a pod is unlinked from the service container once
it's being removed.
Fixes: #16964
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
If the container has no PID namespace, the exec session processes are
not killed when the container process ends. In this case, attempt to
kill them in the same way.
The problem was noticed with toolbox where the exec'ed sessions are
not terminated when the container is stopped, blocking the system
shutdown.
[NO NEW TESTS NEEDED]
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
in several top-level API functions. Each is the first line of
the function that contains it, which makes sense; we want to
capture any error returned by the function. However, making this
the first defer means that it is the last thing to run after the
function returns - meaning that the container's
`defer c.lock.Unlock()` has already fired, leading to a chance we
modify the container without holding its lock.
We could move the function around so it's no longer the first
defer, but then we'd have to call it twice (immediately after
`defer c.lock.Unlock()` if the container is not batched, and a
second time in a new `else` block right after the lock/sync call
to make sure we handle batched containers). Seems simpler to just
leave it like this.
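A self-contained illustration of the defer-ordering pitfall (not podman code):

```go
// Deferred calls run last-in-first-out: a "record the error" defer
// registered before the unlock defer runs *after* the lock has already
// been released, so it touches shared state without the lock held.
package main

import (
	"fmt"
	"sync"
)

var mu sync.Mutex

func apiCall() (err error) {
	defer func() {
		// Registered first, therefore runs last: at this point the
		// mutex below has already been unlocked.
		fmt.Println("saving error state without the lock:", err)
	}()

	mu.Lock()
	defer mu.Unlock() // registered second, runs first

	return fmt.Errorf("something failed")
}

func main() {
	_ = apiCall()
}
```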
[NO NEW TESTS NEEDED] Can't really test for DB corruption easily.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When you use podman logs with --until and --follow it should exit after
the requested until time and not keep hanging forever.
This fixes the behavior for the k8s-file backend.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When you use podman logs with --until and --follow it should exit after
the requested until time and not keep hanging forever.
To make this work I reworked the code to use the better journald event
reading code for logs as well. This correctly uses the sd_journal API
without having to compare the cursors to find the EOF.
The same problem exists for the k8s-file driver; I will fix this in the
next commit.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Instead of reading the full journal, which can be expensive, we can seek
based on the time.
If you have a journald with many podman events just compare the time
`time podman events --since 1s --stream=false` with and without this
patch.
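A hedged sketch of the seek using the go-systemd bindings (the helper is illustrative, not the actual podman events code):

```go
// seekSince jumps the journal cursor directly to the --since timestamp
// instead of iterating every entry from the beginning.
package main

import (
	"fmt"
	"os"
	"time"

	"github.com/coreos/go-systemd/v22/sdjournal"
)

func seekSince(j *sdjournal.Journal, since time.Time) error {
	// The journal stores realtime timestamps in microseconds.
	return j.SeekRealtimeUsec(uint64(since.UnixNano() / int64(time.Microsecond)))
}

func main() {
	j, err := sdjournal.NewJournal()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer j.Close()
	if err := seekSince(j, time.Now().Add(-time.Second)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```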
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The `containerCouldBeLogging` bool should not be false by default: when
--since is used we seek in the journal and can miss the start event, so
that bool would stay false forever. This means that a running container
is not followed even when it should be.
To fix this we can just set the `containerCouldBeLogging` bool based on
the current container state.
Fixes #16950
Signed-off-by: Paul Holzinger <pholzing@redhat.com>