3765 Commits

Author SHA1 Message Date
21febcb5cf docs: add starting to HealthCheckResults.Status
Signed-off-by: Alexis Couvreur <alexiscouvreur.pro@gmail.com>
2023-04-02 02:02:11 -04:00
4d56292e7a libpod: mount safely subpaths
add a function to securely mount a subpath inside a volume.  We cannot
trust that the subpath is safe since it is beneath a volume that could
be controlled by a separate container.  To avoid TOCTOU races between
when we check the subpath and when the OCI runtime mounts it, we open
the subpath, validate it, bind mount to a temporary directory and use
it instead of the original path.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-31 19:48:03 +02:00
f131eaa74a auto-update: stop+start instead of restart systemd units
It turns out the restart is _not_ a stop+start but keeps certain
resources open and is subject to some timeouts that may differ across
distributions' default settings.

[NO NEW TESTS NEEDED] as I have absolutely no idea how to reliably cause
the failure/flake/race.

Also ignore ENOENTs of the CID file when removing a container, which has
been identified as actually fixing #17607.

Fixes: #17607
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-29 11:31:35 +02:00
9369a3c336 Merge pull request #17963 from Luap99/slirp-dns-userns
fix slirp4netns resolv.conf ip with a userns
2023-03-28 21:57:03 +02:00
81e5bffc32 fix slirp4netns resolv.conf ip with a userns
When a userns is set we setup the network after the bind mounts, at the
point where resolv.conf is generated we do not yet know the subnet.
Just like the other dns servers for bridge networks we need to add the
ip later in completeNetworkSetup()

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2182052

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-03-28 15:52:33 +02:00
cdb5b3e990 sqlite: do not Ping() after connecting
`Ping()` requires the DB lock, so we had to move it into a transaction
to fix #17859. Since we try to access the DB directly afterwards, I
prefer to let that fail instead of paying the cost of a transaction
which would lock the DB for _all_ processes.

[NO NEW TESTS NEEDED] as it's a hard-to-reproduce race.

Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-28 11:27:43 +02:00
8bd9109fb8 Merge pull request #17917 from mheon/fix_17905
Ensure that SQLite state handles name-ID collisions
2023-03-27 07:48:37 -04:00
7daab31f1f Ensure that SQLite state handles name-ID collisions
If a container with an ID starting with "db1" exists, and a
container named "db1" also exists, and they are different
containers - if I run `podman inspect db1` the container named
"db1" should be inspected, and there should not be an error that
multiple containers matched the name or id "db1". This was
already handled by BoltDB, and now is properly managed by SQLite.

Fixes #17905

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-24 15:09:25 -04:00
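
The lookup rule described in the commit above can be sketched in plain Go. This is only an illustration of the precedence (exact name first, ID prefix second), not the SQLite-backed code from the commit; the `container` type, the error values, and the `lookup` helper are hypothetical names introduced for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// container is a hypothetical, stripped-down record used only for this sketch.
type container struct {
	ID   string
	Name string
}

var (
	errNoSuchCtr = errors.New("no such container")
	errAmbiguous = errors.New("more than one container matches the ID prefix")
)

// lookup resolves nameOrID. An exact name match always wins; ID prefixes are
// only considered when no container carries that exact name.
func lookup(ctrs []container, nameOrID string) (container, error) {
	for _, c := range ctrs {
		if c.Name == nameOrID {
			return c, nil
		}
	}
	var matches []container
	for _, c := range ctrs {
		if strings.HasPrefix(c.ID, nameOrID) {
			matches = append(matches, c)
		}
	}
	switch len(matches) {
	case 0:
		return container{}, errNoSuchCtr
	case 1:
		return matches[0], nil
	default:
		return container{}, errAmbiguous
	}
}

func main() {
	ctrs := []container{
		{ID: "db1a9f0c", Name: "web"}, // ID starts with "db1"
		{ID: "77c2e410", Name: "db1"}, // name is exactly "db1"
	}
	c, err := lookup(ctrs, "db1")
	fmt.Println(c.Name, c.ID, err)
}
```

With the two sample containers, `lookup(ctrs, "db1")` resolves to the container named "db1" instead of reporting an ambiguous match.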
e061cb968c Fix a race around SQLite DB config validation
The DB config is a single-row table, and the first Podman process
to run against the database creates it. However, there was a race
where multiple Podman processes, started simultaneously, could
try and write it. Only the first would succeed, with subsequent
processes failing once (and then running correctly once re-run),
but it was happening often in CI and deserves fixing.

[NO NEW TESTS NEEDED] It's a CI flake fix.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-23 19:48:27 -04:00
b31d9e15f2 sqlite: do not use shared cache
SQLite developers consider it a misfeature [1], and after turning it on,
we saw a new set of flakes.  Let's turn it off and trust the developers
[1] that WAL mode is sufficient for our purposes.

Turning the shared cache off also makes the DB smaller and faster.

[NO NEW TESTS NEEDED]

[1] https://sqlite.org/forum/forumpost/1f291cdca4

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-22 15:44:38 +01:00
6b9f3140fa Merge pull request #17874 from mheon/sqlite_fixes
Sqlite fixes
2023-03-22 08:13:29 -04:00
5f274e45f2 Run make codespell
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-21 16:00:54 -04:00
3925cd653b Drop SQLite max connections
The SQLite transaction lock Valentin found is (slightly) faster.
So let's go with that.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-21 14:20:34 -04:00
0fbc325156 sqlite: set connection attributes on open
The symptoms in #17859 indicate that setting the PRAGMAs in individual
EXECs outside of a transaction can lead to concurrency issues and
failures when the DB is locked.  Hence set all PRAGMAs when opening
the connection.  Move them into individual constants to improve
documentation and readability.

Further make transactions exclusive as #17859 also mentions an error
that the DB is locked during a transaction.

[NO NEW TESTS NEEDED] - existing tests cover the code.

Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>

<MH: Cherry-picked on top of my branch>

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-21 12:51:31 -04:00
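
A minimal sketch of the "set all PRAGMAs when opening the connection" idea from the commit above, assuming the go-sqlite3 driver and its DSN parameters (`_journal_mode`, `_busy_timeout`, `_foreign_keys`, `_txlock`). The constant names, the chosen values, and `buildDSN` are assumptions made for illustration; the commit's actual constants may differ. The example only builds the connection string and does not open a database.

```go
package main

import (
	"fmt"
	"strings"
)

// One constant per connection attribute, so each can carry its own comment.
// The values are illustrative assumptions, not the ones used by the commit.
const (
	optJournal     = "_journal_mode=WAL" // write-ahead logging
	optBusyTimeout = "_busy_timeout=5000" // wait up to 5s on a locked DB
	optForeignKeys = "_foreign_keys=1"    // enforce foreign-key constraints
	optTxLock      = "_txlock=exclusive"  // BEGIN EXCLUSIVE for transactions
)

// buildDSN assembles a DSN so that every attribute is applied by the driver
// when the connection is opened, instead of via separate EXEC statements.
func buildDSN(path string) string {
	opts := []string{optJournal, optBusyTimeout, optForeignKeys, optTxLock}
	return "file:" + path + "?" + strings.Join(opts, "&")
}

func main() {
	// The resulting string would be passed to sql.Open("sqlite3", dsn).
	fmt.Println(buildDSN("state.db"))
}
```

Putting the attributes in the DSN means every pooled connection receives them, which is the property that individual post-open EXECs do not guarantee.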
9f0e0e8331 Fix database locked errors with SQLite
I was searching the SQLite docs for a fix, but apparently that
was the wrong place; it's a common enough error with the Go
frontend for SQLite that the fix is prominently listed in the API
docs for go-sqlite3. Setting cache mode to 'shared' and using a
maximum of 1 simultaneous open connection should fix it.

Performance implications of this are unclear, but cache=shared
sounds like it will be a benefit, not a curse.

[NO NEW TESTS NEEDED] This fixes a flake with concurrent DB
access.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-21 09:57:56 -04:00
9563415430 fix --health-on-failure=restart in transient unit
As described in #17777, the `restart` on-failure action did not behave
correctly when the health check is being run by a transient systemd
unit.  It ran just fine when being executed outside such a unit, for
instance, manually or, as done in the system tests, in a scripted
fashion.

There were two issues causing the `restart` on-failure action to
misbehave:

1) The transient systemd units used the default `KillMode=cgroup` which
   will nuke all processes in the specific cgroup including the recently
   restarted container/conmon once the main `podman healthcheck run`
   process exits.

2) Podman attempted to remove the transient systemd unit and timer
   during restart.  That is perfectly fine when manually restarting the
   container but not when the restart itself is being executed inside
   such a transient unit.  Ultimately, Podman tried to shoot itself in
   the foot.

Fix both issues by moving the restart logic into the cleanup process.
Instead of restarting the container, the `healthcheck run` will just
stop the container and the cleanup process will restart the container
once it has turned unhealthy.

Fixes: #17777
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-20 13:56:00 +01:00
94f905a503 Fix SQLite DB schema migration code
It now can safely run on bare databases, before any tables are
created.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-17 13:24:53 -04:00
6142c16a9c Ensure SQLite uses the runroot in transient mode
Transient mode means the DB should not persist, so instead of
using the GraphRoot we should use the RunRoot instead.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-15 14:45:28 -04:00
2ec11b16ab Fix various integration test issues with SQLite state
Two main changes:
- The transient state tests relied on BoltDB paths, change to
  make them agnostic
- The volume code in SQLite wasn't retrieving and setting the
  volume plugin for volumes that used one.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-15 14:45:18 -04:00
6e0f11da5d Improve handling of existing container names in SQLite
Return more sensible errors than SQLite's embedded constraint
failure ones. Should fix a number of integration tests.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-15 14:44:47 -04:00
2718f54a29 Merge pull request #17729 from rhatdan/selinux
Support running nested SELinux container separation
2023-03-15 12:07:03 -04:00
408e764b94 events: no duplicates when streaming during a log rotation
When streaming events, prevent returning duplicates after a log rotation
by marking a beginning and an end for rotated events.  Before starting to
stream, get a timestamp while holding the event lock.  The timestamp
allows for detecting whether a rotation event happened while reading the
log file and to skip all events between the begin and end rotation
event.

In an ideal scenario, we could detect rotated events by enforcing a
chronological order when reading and skip those detected to not be more
recent than the last read event.  However, events are not always
_written_ in chronological order.  While this can be changed, existing
event files could not be read correctly anymore.

Fixes: #17665
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-15 10:28:16 +01:00
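
The skip logic described in the commit above can be sketched as a filter over the events read from the log file. This is a simplified model under stated assumptions: the marker names `rotate-begin` and `rotate-end`, the `event` type, and `filterRotated` are hypothetical, and the rotated file is assumed to contain the replayed events between the two markers.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical marker types that bracket events copied over by a rotation.
const (
	rotateBegin = "rotate-begin"
	rotateEnd   = "rotate-end"
)

type event struct {
	Time time.Time
	Type string
	Msg  string
}

// at is a small helper producing deterministic timestamps for the example.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// filterRotated drops the events between a begin and end marker when the
// rotation happened after streamStart, because a reader that started earlier
// has already returned those events once. Markers themselves are never returned.
func filterRotated(events []event, streamStart time.Time) []event {
	var out []event
	skipping := false
	for _, e := range events {
		switch e.Type {
		case rotateBegin:
			skipping = e.Time.After(streamStart)
			continue
		case rotateEnd:
			skipping = false
			continue
		}
		if !skipping {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	log := []event{
		{at(10), rotateBegin, ""},
		{at(1), "create", "old-1"}, // replayed by the rotation
		{at(2), "start", "old-2"},  // replayed by the rotation
		{at(10), rotateEnd, ""},
		{at(11), "stop", "new-1"},
	}
	// The stream started at t=5, before the rotation at t=10.
	for _, e := range filterRotated(log, at(5)) {
		fmt.Println(e.Type, e.Msg)
	}
}
```

Comparing against the rotation marker's timestamp, rather than against each event's own timestamp, sidesteps the problem noted in the commit that events are not always written in chronological order.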
2d1f4a8bff cgroupns: private cgroupns on cgroupv1 breaks --systemd
On cgroup v1 we need to mount only the systemd named hierarchy as
writeable, so we configure the OCI runtime to mount /sys/fs/cgroup as
read-only and on top of that bind mount /sys/fs/cgroup/systemd.

But when we use a private cgroupns, we cannot do that since we don't
know the final cgroup path.

Also, do not override the mount if there is already one for
/sys/fs/cgroup/systemd.

Closes: https://github.com/containers/podman/issues/17727

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-14 12:34:52 +01:00
01fd5bcc30 libpod: remove error stutter
the error is already clear.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-14 12:34:52 +01:00
ad8a96ab95 Support running nested SELinux container separation
Currently Podman prevents SELinux container separation,
when running within a container. This PR adds a new
--security-opt label=nested

When setting this option, Podman unmasks and mounts
/sys/fs/selinux into the containers making /sys/fs/selinux
fully exposed. Secondly Podman sets the attribute
run.oci.mount_context_type=rootcontext

This attribute tells crun to mount volumes with rootcontext=MOUNTLABEL
as opposed to context=MOUNTLABEL.

With these two settings Podman inside the container is allowed to set
its own SELinux labels on tmpfs file systems mounted into its parent
container, while still being confined by SELinux. Thus you can have
nested SELinux labeling inside of a container.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-13 14:21:12 -04:00
9a45503c80 Merge pull request #17249 from rhatdan/qm
Must use mountlabel when creating builtin volumes
2023-03-09 14:27:05 -05:00
b5a99e0816 Must use mountlabel when creating builtin volumes
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-09 12:36:52 -05:00
21651706e3 podman inspect list network when using --net=host or none
This will match Docker behaviour.

Fixes: https://github.com/containers/podman/issues/17385

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-08 17:27:08 -05:00
34ff27b813 libpod: avoid nil pointer dereference in (*Container).Cleanup
On FreeBSD, c.config.Spec.Linux is not populated - in this case, we can
assume that the container is not using a pid namespace.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-03-06 11:51:42 +00:00
e77f370f86 sqlite: add a hidden --db-backend flag
Add a hidden flag to set the database backend and plumb it into
podman-info.  Further add a system test to make sure the flag and the
info output are working properly.

Note that the test may need to be changed once we have settled on how
to test the sqlite backend in CI.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-02 13:43:11 +01:00
96d439913e Merge pull request #17658 from vrothberg/sqlite
sqlite updates
2023-03-02 07:55:04 +01:00
8457bb5542 Merge pull request #16717 from umohnani8/detach
play kube: Add --wait option
2023-03-01 16:46:54 +01:00
2c67ff5d40 sqlite: add container short ID to network aliases
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
38acab832d sqlite: remove dead code
Found by golangci-lint.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
2342d1a314 sqlite: addContainer: add named volume only once
There's a unique constraint in the table, so we shouldn't add the same
volume more than once to the same container.

[NO NEW TESTS NEEDED] as it fixes an existing one.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
86d12520e9 sqlite: implement RewriteVolumeConfig
[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
df88f546b6 sqlite: LookupVolume: fix partial name match
A partial name match is tricky as we want it to be fast but also make
sure there's only one partial match iff there's no full one.

[NO NEW TESTS NEEDED] as it fixes a system test.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
01359457c4 sqlite: LookupVolume: wrap error
Wrap the error with the message expected by the system tests.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
69ff04f736 sqlite: fix typo rewriting container config
It's `UPDATE $NAME` not `UPDATE TABLE $NAME`.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
e87014e444 sqlite: return correct error on pod-name conflict
I wasn't able to find a way to get error-checks working with the sqlite3
library with the time at hand.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
84b5c6c713 sqlite: RewritePodConfig: update error message
Use the same error message as the boltdb backend.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
02a77d27a2 Merge pull request #17450 from danishprakash/add-group-entry
create: add entry to /etc/group via `--group-entry`
2023-02-28 21:59:59 +01:00
20a42d0e4f play kube: Add --wait option
Add a way to keep play kube running in the foreground and terminate all pods
after receiving a SIGINT or SIGTERM signal. The pods will also be
cleaned up after the containers in them have exited.
If an error occurs during kube play, any resources created up to the
point of the error will be cleaned up as well.

Add tests for the various scenarios.

Fixes #14522

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2023-02-28 13:45:36 -05:00
9d93486d21 Vendor in latest containers/storage
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-02-24 08:25:04 -05:00
5d2d609be4 sqlite: fix volume lookups with partial names
Requires the trailing `%` to work correctly, see
        https://www.sqlitetutorial.net/sqlite-like/

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:56:58 +01:00
495314a16a sqlite: fix container lookups with partial IDs
Requires the trailing `%` to work correctly, see
	https://www.sqlitetutorial.net/sqlite-like/

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:47:32 +01:00
efe7aeb1da sqlite: fix LookupPod
To return the error message expected by the system tests.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:42:41 +01:00
19c2f37ba5 sqlite: fix pod create/rm
A number of fixes for pod creation and removal.

The important part is that matching partial IDs requires a trailing `%`
for SQL to interpret it as a wildcard.  More information at
	https://www.sqlitetutorial.net/sqlite-like/

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:38:17 +01:00
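
The trailing-`%` fix shared by the three commits above can be illustrated as follows. The query text, the `PodConfig` table name, and both helpers are assumptions made for the example, and `matchesLike` is a deliberately simplified model of `LIKE` (it handles only a trailing `%` and ignores `_` and case folding).

```go
package main

import (
	"fmt"
	"strings"
)

// lookupQuery is a hypothetical statement; the table and column are assumed.
const lookupQuery = "SELECT ID FROM PodConfig WHERE ID LIKE ?;"

// likePrefix builds the bound argument for the LIKE placeholder so that the
// partial ID is treated as a prefix rather than as a complete value.
func likePrefix(partial string) string {
	return partial + "%"
}

// matchesLike is a simplified stand-in for SQL LIKE used to show the effect
// of the wildcard: without a trailing % the pattern must equal the value.
func matchesLike(value, pattern string) bool {
	if strings.HasSuffix(pattern, "%") {
		return strings.HasPrefix(value, strings.TrimSuffix(pattern, "%"))
	}
	return value == pattern
}

func main() {
	id := "19c2f37ba5e4"
	fmt.Println(lookupQuery)
	fmt.Println(matchesLike(id, "19c2"))             // partial ID without wildcard
	fmt.Println(matchesLike(id, likePrefix("19c2"))) // partial ID with trailing %
}
```

In real code the pattern would be passed as a bound parameter to the query rather than concatenated into the SQL text.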
e32bea9378 sqlite: LookupContainer: update error message
As expected by the system tests.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 11:36:47 +01:00
565bb56454 sqlite: AddContainerExitCode: allow to replace
Allow to replace existing exit codes.  A container may be started and
stopped multiple times etc.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 11:30:46 +01:00