278 Commits

Author SHA1 Message Date
1950555b26 podman mount: some better error wrapping
As shown in #23671 these functions can return the raw error without any
useful context to the user which makes it hard to understand where
things went wrong. Simply add some context to some error paths here.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-27 15:15:23 +02:00
de856dab99 Add --health-max-log-count, --health-max-log-size, --health-log-destination flags
These flags can affect the output of the HealtCheck log. Currently, when a container is configured with HealthCheck, the output from the HealthCheck command is only logged to the container status file, which is accessible via `podman inspect`.
It is also limited to the last five executions and the first 500 characters per execution.

This makes debugging past problems very difficult, since the only information available about the failure of the HealthCheck command is the generic `healthcheck service failed` record.

- The `--health-log-destination` flag sets the destination of the HealthCheck log.
  - `none`: (default behavior) `HealthCheckResults` are stored in overlay containers. (For example: `$runroot/healthcheck.log`)
  - `directory`: creates a log file named `<container-ID>-healthcheck.log` with JSON `HealthCheckResults` in the specified directory.
  - `events_logger`: The log will be written with logging mechanism set by events_loggeri. It also saves the log to a default directory, for performance on a system with a large number of logs.

- The `--health-max-log-count` flag sets the maximum number of attempts in the HealthCheck log file.
  - A value of `0` indicates an infinite number of attempts in the log file.
  - The default value is `5` attempts in the log file.
- The `--health-max-log-size` flag sets the maximum length of the log stored.
  - A value of `0` indicates an infinite log length.
  - The default value is `500` log characters.

Add --health-max-log-count flag

Signed-off-by: Jan Rodák <hony.com@seznam.cz>

Add --health-max-log-size flag

Signed-off-by: Jan Rodák <hony.com@seznam.cz>

Add --health-log-destination flag

Signed-off-by: Jan Rodák <hony.com@seznam.cz>
2024-09-25 14:01:35 +02:00
432325236b libpod: convert owner IDs only with :idmap
convert the owner UID and GID into the user namespace only when
":idmap" mount is used.

This changes the behaviour of :idmap with an empty volume.  Now the
existing directory ownership is copied up as in the other case.

Closes: https://github.com/containers/podman/issues/23347

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-17 12:38:53 +02:00
62956ac192 libpod: first delete container then cidfile
I am seeing a weird flake in my parallel system test PR. The issue is
that system units generated by podman systemd generate leave a container
in the Removing state behind.

As far as I can tell the porblems seems to be that the cleanup process
is killed while it tries to remove the container from the db. Because
the cidfile was removed before the ExecStopPost=podman rm ... process no
longer had access to the cidfile and reported no error because it runs
with --ignore.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-05 10:27:42 +02:00
83dbbc3a51 Replace golang.org/x/exp/slices with slices from std
Use "slices" from the standard library, this package was added in go
1.21 so we can use it now.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-04-23 11:16:40 +02:00
b026e1d635 libpod: make healthcheck events more efficient
We already know the status of the healthcheck in the caller so calling
healthCheckStatus() just make the event code sync the container state
and reread the healthcheck file for no reason.

It is much better to directly pass the status down to the event call.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-04-19 17:24:44 +02:00
72f1617fac Bump Go module to v5
Moving from Go module v4 to v5 prepares us for public releases.

Move done using gomove [1] as with the v3 and v4 moves.

[1] https://github.com/KSubedi/gomove

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-08 09:35:39 -05:00
a8b2256cc1 Set interface name to the network_interface name for macvlan and ipvlan networks
When interface_name attribute in containers.conf file is set to "device", then set interface names inside containers same as the network_interface names of the respective network.

The change applies to macvlan and ipvlan networks only. The interface_name attribute value has no impact on any other types of networks.

If the interface name is set in the user request, then that takes precedence.

Fixes: #21313

Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
2024-02-06 19:05:16 -08:00
8bdf77aa20 Refactor: replace StringInSlice with slices.Contains
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-05 16:25:56 +02:00
2a2d0b0e18 chore: delete obsolete // +build lines
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-04 11:53:38 +02:00
bad25da92e libpod: add !remote tag
This should never be pulled into the remote client.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-24 12:11:34 +02:00
3cc9db8626 libpod: fix deadlock while parallel container create
When containers are created with a named volume it can deadlock because
the create logic tried to lock all volumes in a loop, this is fine if it
only ever creates a single container at any given time. However because
we multiple containers can be created at the same time they can cause a
deadlock between the volumes. This is because the order of the loop is
not stable, in fact it is based on the order of how the volumes were
specified on the cli.

So if you create two containers at the same time with
`-v vol1:/dir2 -v vol2:/dir2` and the other one with
`-v vol2:/dir2 -v vol1:/dir1` then there is chance for a deadlock.

Now one solution could be to order the volumes to prevent the issue but
the reason for holding the lock is dubious. The goal was to prevent the
volume from being removed in the meantime. However that could still
have happend before we acquired the lock so it didn't protect against
that.

Both boltdb and sqlite already prevent us from adding a container with
volumes that do not exists due their internal consistency checks.
Sqlite even uses FOREIGN KEY relationships so the schema will prevent us
from doing anything wrong.

The create code currently first checks if the volume exists and if not
creates it. I have checked that the db will guarantee that this will not
work:
Boltdb: `no volume with name test2 found in database when adding container xxx: no such volume`
Sqlite: `adding container volume test2 to database: FOREIGN KEY constraint failed`

Keep in mind that this error is normally not seen, only if the volume is
removed between the volume exists check and adding the container in the
db this messages will be seen wich is an acceptable race and a
pre-existing condition anyway.

[NO NEW TESTS NEEDED] Race condition, hard to test in CI.

Fixes #20313

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-11 11:40:35 +02:00
71549c642f Ignore spurious container-removal errors
When removing a container's dependency, getting an error that the
container has already been removed (ErrNoSuchCtr and
ErrCtrRemoved) should not be fatal. We wanted the container gone,
it's gone, no need to error out.

[NO NEW TESTS NEEDED] This is a race and thus hard to test for.

Fixes #18874

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-09-05 14:35:37 -04:00
27b41f0877 libpod: use /var/run instead of /run on FreeBSD
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Running Linux
containers on FreeBSD hosts continue to use /run for compatibility.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-08-17 14:04:53 +01:00
b5ce0ab2de Fix language, typos and markdown layout
[NO NEW TESTS NEEDED]

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2023-07-24 11:18:25 +02:00
f1c68b79eb add hostname to network alias
We use the name as alias but using the hostname makes also sense and
this is what docker does. We have to keep the short id as well for
docker compat.

While adding some tests I removed some duplicated tests that were
executed twice for nv for no reason.

Fixes #17370

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-11 15:38:24 +02:00
b6ec2127b8 libpod: set cid network alias in setupContainer()
Since we have sqlite there is no point in duplicating this acroos two db
backends. Just set earlier when we validate the networks anyway.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-11 15:16:11 +02:00
a750cd9876 Fix a race removing multiple containers in the same pod
If the first container to get the pod lock is the infra container
it's going to want to remove the entire pod, which will also
remove every other container in the pod. Subsequent containers
will get the pod lock and try to access the pod, only to realize
it no longer exists - and that, actually, the container being
removed also no longer exists.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:12 -04:00
0e47465e4a Discard errors when a pod is already removed
This was causing some CI flakes. I'm pretty sure that the pods
being removed already isn't a bug, but just the result of another
container in the pod removing it first - so no reason not to
ignore the errors.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:12 -04:00
2c9f18182a The removeContainer function now accepts a struct
We had something like 6 different boolean options (removing a
container turns out to be rather complicated, because there are a
million-odd things that want to do it), and the function
signature was getting unreasonably large. Change to a struct to
clean things up.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 16:27:27 -04:00
ef1a22cdea Fix a deadlock when removing pods
The infra container would try to remove the pod, despite the pod
already being in the process of being removed - oops. Add a check
to ensure we don't try and remove the pod when called by the
`podman pod rm` command.

Also, wire up noLockPod - it wasn't previously wired in, which is
concerning, and could be related?

Finally, make a few minor fixes to un-break lint.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 16:27:25 -04:00
8cb5d39d43 Pods now return what containers were removed with them
This probably should have been in the API since the beginning,
but it's not too late to start now.

The extra information is returned (both via the REST API, and to
the CLI handler for `podman rm`) but is not yet printed - it
feels like adding it to the output could be a breaking change?

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 16:24:59 -04:00
bc1a31ce6d Make RemoveContainer return containers and pods removed
This allows for accurate reporting of dependency removal, but the
work is still incomplete: pods can be removed, but do not report
the containers they removed as part of said removal. Will add
this in a subsequent commit.

Major note: I made ignoring no-such-container errors automatic
once it has been determined that a container did exist in the
first place. I can't think of any case where this would not be a
TOCTOU - IE, no reason not to ignore them. The `--ignore` option
to `podman rm` should still retain meaning as it will ignore
errors from containers that didn't exist in the first place.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 16:24:56 -04:00
e8d7456278 Add an API for removing a container and dependencies
This is the initial stage of implementation. The current API
functions but does not report the additional containers and pods
removed. This is necessary to properly display results to the
user after `podman rm --all`.

The existing remove-dependencies code has been removed in favor
of this more native solution.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-01 15:32:50 -04:00
08b0d93ea3 kube play: exit-code propagation
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd.  The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.

When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint.  In that
case, before stopping the service container, Podman will lookup the exit
codes of all non-infra containers.  The service will then behave
according to the following three exit-code policies:

 - `none`: exit 0 and ignore containers (default)
 - `any`: exit non-zero if _any_ container did
 - `all`: exit non-zero if _all_ containers did

The upper values can be passed via a hidden `kube play
--service-exit-code-propagation` flag which can be used by tests and
later on by Quadlet.

In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.

Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero.  The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero.  Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.

Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-25 14:46:34 +02:00
032d4a95f0 Consolidate error handling in Runtime.removeContainer
Use a helper to handle the cleanupErr logic instead of
copy&pasting it EIGHT times.

Also modifies the returned errors to be wrapped with a context,
and changes the text of the logged errors a bit.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2023-05-22 19:14:06 +02:00
f131eaa74a auto-update: stop+start instead of restart sytemd units
It turns out the restart is _not_ a stop+start but keeps certain
resources open and is subject to some timeouts that may differ across
distributions' default settings.

[NO NEW TESTS NEEDED] as I have absolutely no idea how to reliably cause
the failure/flake/race.

Also ignore ENOENTS of the CID file when removing a container which has
been identified of actually fixing #17607.

Fixes: #17607
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-29 11:31:35 +02:00
b5a99e0816 Must use mountlabel when creating builtin volumes
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-09 12:36:52 -05:00
e9942c61dd build(deps): bump github.com/container-orchestrated-devices/container-device-interface
Bumps [github.com/container-orchestrated-devices/container-device-interface](https://github.com/container-orchestrated-devices/container-device-interface) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/container-orchestrated-devices/container-device-interface/releases)
- [Commits](https://github.com/container-orchestrated-devices/container-device-interface/compare/v0.5.3...v0.5.4)

---
updated-dependencies:
- dependency-name: github.com/container-orchestrated-devices/container-device-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

[NO NEW TESTS NEEDED]

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-20 14:51:04 +01:00
21cddd7235 journald: remove initializeJournal()
This was added as hack in commit 6b06e9b77c because the journald logs
code was not able to handle an empty journal. But since commit
767947ab88 this is no longer the case, we correctly use the sd_journal
API and know when the journal is empty.

Therefore we no longer need this hack and it should be removed because
it just adds overhead and an empty journal entry for no good reason.

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-02-17 11:24:18 +01:00
08e13867a9 Fix typos. Improve language.
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2023-02-09 21:56:27 +01:00
29b346deab container rm: save once for exec removal and state change
Do not save the container each for changing the state and for removing
running exec sessions.  Saving the container is expensive and avoiding
the redundant save makes `container rm` 1.2 times faster on my
workstation.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-01-30 15:43:03 +01:00
c74bdae351 DB: make loading container states optional
Loading container states speed things up when listing all containers but
it comes with a price tag for many other call paths.  Hence, make
loading the state conditional to allow for keeping `podman ps` fast
without other commands regressing in performance.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-01-26 10:20:38 +01:00
6f919af78b add a comment to container removal
Every time I look at a container-removal issue I wonder why the
container isn't locked directly here, so let's add a comment here.
I am not sure whether I would be better if callers took care of
locking but for now the comment will safe the future me and probably
other readers some time.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-01-16 11:45:19 +01:00
4cf06fe7e0 podman: podman rm -f doesn't leave processes
follow-up to 6886e80b45caae27dda81a9b44d8dd179c414580

when "podman -rm -f" is used on a container in "stopping" state, also
make sure it is terminated before removing it from the local storage.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-01-09 21:01:32 +01:00
4a7a45f973 remove service container _after_ pods
Do not allow for removing the service container unless all associated
pods have been removed.  Previously, the service container could be
removed when all pods have exited which can lead to a number of issues.

Now, the service container is treated like an infra container and can
only be removed along with the pods.

Also make sure that a pod is unlinked from the service container once
it's being removed.

Fixes: #16964
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-01-09 13:44:51 +01:00
0c94f61852 Allow '/' to prefix container names to match Docker
Fixes: https://github.com/containers/podman/issues/16663

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-12-26 07:37:43 -05:00
a651cdfbc3 runtime: do not chown idmapped volumes
do not chown a volume when idmap is used.

Closes: https://github.com/containers/podman/issues/16724

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-12-05 14:10:43 +01:00
71f92d263c container create: add inspect data to event
When the new `events_container_create_inspect_data` option is enabled in
containers.conf set the `ContainersInspectData` event field for each
container-create event.

The data was requested for the purpose of auditing (e.g., intrusion
detection).

Jira: https://issues.redhat.com/browse/RUN-1702
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-11-29 10:40:24 +01:00
607cd39e15 Removing the RawInput from the API output
Including the RawInput in the API output is meaningless.

Fixes: #16497

[NO NEW TESTS NEEDED]

Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
2022-11-17 15:41:01 +09:00
5aa32e45ad Merge pull request #16377 from giuseppe/add-check-for-valid-pod-systemd-cgroup
runtime: add check for valid pod systemd cgroup
2022-11-04 07:59:22 -04:00
3fee351c35 remove container/pod id file along with container/pod
Remove the container/pod ID file along with the container/pod.  It's
primarily used in the context of systemd and are not useful nor needed
once a container/pod has ceased to exist.

Fixes: #16387
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-11-03 17:07:09 +01:00
cac4919bf6 runtime: add check for valid pod systemd cgroup
there is already the same check when using cgroupfs, but not when
using the systemd cgroup backend.  The check is needed to avoid a
confusing error from the OCI runtime.

Closes: https://github.com/containers/podman/issues/16376

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-11-03 12:23:44 +01:00
fc6dcd12b3 Add support for 'image' volume driver
We added the concept of image volumes in 2.2.0, to support
inspecting an image from within a container. However, this is a
strictly read-only mount, with no modification allowed.

By contrast, the new `image` volume driver creates a c/storage
container as its underlying storage, so we have a read/write
layer. This, in and of itself, is not especially interesting, but
what it will enable in the future is. If we add a new command to
allow these image volumes to be committed, we can now distribute
volumes - and changes to them - via a standard OCI image registry
(which is rather new and quite exciting).

Future work in this area:
- Add support for `podman volume push` (commit volume changes and
  push resulting image to OCI registry).
- Add support for `podman volume pull` (currently, we require
  that the image a volume is created from be already pulled; it
  would be simpler if we had a dedicated command that did the
  pull and made a volume from it)
- Add support for scratch images (make an empty image on demand
  to use as the base of the volume)
- Add UOR support to `podman volume push` and
  `podman volume pull` to enable both with non-image volume
  drivers

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-22 13:07:40 -04:00
e19e0de5fa Introduce graph-based pod container removal
Originally, during pod removal, we locked every container in the
pod at once, did a number of validity checks to ensure everything
was safe, and then removed all the containers in the pod.

A deadlock was recently discovered with this approach. In brief,
we cannot lock the entire pod (or much more than a single
container at a time) without causing a deadlock. As such, we
converted to an approach where we just looped over each container
in the pod, removing them individually. Unfortunately, this
removed a lot of the validity checking of the earlier approach,
allowing for a lot of unintended bad things. Infra containers
could be removed while containers in the pod still depended on
them, for example.

There's no easy way to do validity checks while in a simple loop,
so I implemented a version of our graph-traversal logic that
currently handles pod start. This version acts in the reverse
order of startup: startup starts from containers which depend on
nothing and moves outwards, while removal acts on containers which
have nothing depend on them and moves inwards. By doing graph
traversal, we can guarantee that nothing is removed while
something that depends on it still exists - so the infra
container should be the last thing in a pod that is removed, for
example.

In the (unlikely) case that a graph of the pod's containers
cannot be built (most likely impossible without database editing)
the old method of pod removal has been retained to ensure that
even misbehaving pods can be forcibly evicted from the state.

I'm fairly confident that this resolves the problem, but there
are a lot of assumptions around dependency structure built into
the original pod removal code and I am not 100% sure I have
captured all of them.

Fixes #15526

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-14 13:44:48 -04:00
75740be395 all: stop using deprecated GenerateNonCryptoID
In view of https://github.com/containers/storage/pull/1337, do this:

	for f in $(git grep -l stringid.GenerateNonCryptoID | grep -v '^vendor/'); do
		sed -i 's/stringid.GenerateNonCryptoID/stringid.GenerateRandomID/g' $f;
	done

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-09-13 16:26:26 -07:00
2c63b8439b Fix stutters
Podman adds an Error: to every error message.  So starting an error
message with "error" ends up being reported to the user as

Error: error ...

This patch removes the stutter.

Also ioutil.ReadFile errors report the Path, so wrapping the err message
with the path causes a stutter.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-09-10 07:52:00 -04:00
9585147e1e fix Pod removal after OS hard shutdown
In case of a hard OS shutdown, containers may have a "removing"
state after a reboot, and an attempt to remove Pods with such
containers is unsuccessful:

error freeing lock for container ...: no such file or directory

[NO NEW TESTS NEEDED]

Signed-off-by: Mikhail Khachayants <tyler92@inbox.ru>
2022-09-06 20:41:10 +03:00
d162285f34 libpod: Don't mount /dev/shm in containers on FreeBSD
This mount has never been standard on FreeBSD, preferring to use /tmp or
/var/tmp optionally with tmpfs to ensure data is lost on a reboot.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:20:50 +01:00
98169c20dd Add emptyDir volume support to kube play
When a kube yaml has a volume set as empty dir, podman
will create an anonymous volume with the empty dir name and
attach it to the containers running in the pod. When the pod
is removed, the empy dir volume created is also removed.

Add tests and docs for this as well.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2022-08-30 10:34:45 -04:00