403 Commits

Author SHA1 Message Date
a9de888a15 libpod: do not resuse networking on start
If a container was stopped and we try to start it before we called
cleanup it tried to reuse the network which caused a panic as the pasta
code cannot deal with that. It is also never correct as the netns must
be created by the runtime in case of custom user namespaces used. As
such the proper thing is to clean the netns up first.

Also change a e2e test to report better errors. It is not directly
related to this chnage but it failed on v1 of this patch so we noticed
the ugly error message it produced. Thanks to Ed for the fix.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-07 17:50:28 +02:00
7243c7109c fix(libpod): add newline character to the end of container's hostname file
debian's man (5) hostname page states "The file should contain a single newline-terminated hostname
string."
[NO NEW TESTS NEEDED]

fix #22729

Signed-off-by: Bo Wang <wangbob@uniontech.com>
2024-06-04 15:20:04 +08:00
15b8bb72a8 libpod: restart always reconfigure the netns
Always teardown the network, trying to reuse the netns has caused
a significant amount of bugs in this code here. It also never worked
for containers with user namespaces. So once and for all simplify this
by never reusing the netns. Originally this was done to have a faster
restart of containers but with netavark now we are much faster so it
shouldn't be that noticeable in practice. It also makes more sense to
reconfigure the netns as it is likely that the container exited due
some broken network state in which case reusing would just cause more
harm than good.

The main motivation for this change was the pasta change to use
--dns-forward by default. As the restarted contianer had no idea what
nameserver to use as pasta just kept running.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-03-19 12:21:18 +01:00
dc1795b4b2 use new c/common pasta2 setup logic to fix dns
By default we just ignored any localhost reolvers, this is problematic
for anyone with more complicated dns setups, i.e. split dns with
systemd-reolved. To address this we now make use of the build in dns
proxy in pasta. As such we need to set the default nameserver ip now.

A second change is the option to exclude certain ips when generating the
host.containers.internal ip. With that we no longer set it to the same
ip as is used in the netns. The fix is not perfect as it could mean on a
system with a single ip we no longer add the entry, however given the
previous entry was incorrect anyway this seems like the better behavior.

Fixes #22044

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-03-19 12:09:31 +01:00
72f1617fac Bump Go module to v5
Moving from Go module v4 to v5 prepares us for public releases.

Move done using gomove [1] as with the v3 and v4 moves.

[1] https://github.com/KSubedi/gomove

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-08 09:35:39 -05:00
2a2d0b0e18 chore: delete obsolete // +build lines
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-04 11:53:38 +02:00
5c7f745468 Remove deprecated field ContainerState.NetworkStatusOld
This field drags in a dependency on CNI and thereby blocks us from disabling CNI
support via a build tag

[NO NEW TESTS NEEDED]

Signed-off-by: Dan Čermák <dcermak@suse.com>
2023-12-12 17:09:39 +01:00
a687c38860 use rootless netns from c/common
Use the new rootlessnetns logic from c/common, drop the podman code
here and make use of the new much simpler API.

ref: https://github.com/containers/common/pull/1761

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-12-07 11:24:46 +01:00
45e53ed7b0 libpod: Detect whether we have a private UTS namespace on FreeBSD
Right now, we always use a private UTS namespace on FreeBSD. This should
be made optional but implementing that cleanly needs a FreeBSD extension
to the OCI runtime config. The process for that is starting
(https://github.com/opencontainers/tob/pull/133) but in the meantime,
assume that the UTS namespace is private on FreeBSD.

This moves the Linux-specific namespace logic to
container_internal_linux.go and adds a FreeBSD stub.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-12-01 12:37:39 +00:00
4f6a8f0d50 Merge pull request #20483 from vrothberg/RUN-1934
container.conf: support attributed string slices
2023-10-27 17:49:13 +00:00
c6d410cc36 Do not add powercap mask if no paths are masked
This solves `--security-opt unmask=ALL` still masking the path.

[NO NEW TESTS NEEDED] Can't easily test this as we do not have
access to it in CI.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-10-27 09:55:12 -04:00
e966c86d98 container.conf: support attributed string slices
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-10-27 12:44:33 +02:00
be7dd128ef Mask /sys/devices/virtual/powercap
I don't really like this solution because it can't be undone by
`--security-opt unmask=all` but I don't see another way to make
this retroactive. We can potentially change things up to do this
the right way with 5.0 (actually have it in the list of masked
paths, as opposed to adding at spec finalization as now).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-10-26 18:24:25 -04:00
bad25da92e libpod: add !remote tag
This should never be pulled into the remote client.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-24 12:11:34 +02:00
27b41f0877 libpod: use /var/run instead of /run on FreeBSD
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Running Linux
containers on FreeBSD hosts continue to use /run for compatibility.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-08-17 14:04:53 +01:00
f256f4f954 Use constants for mount types
Inspired by https://github.com/containers/podman/pull/19238

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-14 07:17:21 -04:00
f8213a6d53 libpod: don't make a broken symlink for /etc/mtab on FreeBSD
This file has not been present in BSD systems since 2.9.1 BSD and as far
as I remember /proc/mounts has never existed on BSD systems.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-07-10 12:41:41 +01:00
614c962c23 use libnetwork/slirp4netns from c/common
Most of the code moved there so if from there and remove it here.

Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.

[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-22 11:16:13 +02:00
f07aa1bfdc make lint: enable wastedassign
Because we shouldn't waste assigns.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-19 14:14:48 +02:00
4d56292e7a libpod: mount safely subpaths
add a function to securely mount a subpath inside a volume.  We cannot
trust that the subpath is safe since it is beneath a volume that could
be controlled by a separate container.  To avoid TOCTOU races between
when we check the subpath and when the OCI runtime mounts it, we open
the subpath, validate it, bind mount to a temporary directory and use
it instead of the original path.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-31 19:48:03 +02:00
2718f54a29 Merge pull request #17729 from rhatdan/selinux
Support running nested SELinux container separation
2023-03-15 12:07:03 -04:00
2d1f4a8bff cgroupns: private cgroupns on cgroupv1 breaks --systemd
On cgroup v1 we need to mount only the systemd named hierarchy as
writeable, so we configure the OCI runtime to mount /sys/fs/cgroup as
read-only and on top of that bind mount /sys/fs/cgroup/systemd.

But when we use a private cgroupns, we cannot do that since we don't
know the final cgroup path.

Also, do not override the mount if there is already one for
/sys/fs/cgroup/systemd.

Closes: https://github.com/containers/podman/issues/17727

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-14 12:34:52 +01:00
01fd5bcc30 libpod: remove error stutter
the error is already clear.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-14 12:34:52 +01:00
ad8a96ab95 Support running nested SELinux container separation
Currently Podman prevents SELinux container separation,
when running within a container. This PR adds a new
--security-opt label=nested

When setting this option, Podman unmasks and mountsi
/sys/fs/selinux into the containers making /sys/fs/selinux
fully exposed. Secondly Podman sets the attribute
run.oci.mount_context_type=rootcontext

This attribute tells crun to mount volumes with rootcontext=MOUNTLABEL
as opposed to context=MOUNTLABEL.

With these two settings Podman inside the container is allowed to set
its own SELinux labels on tmpfs file systems mounted into its parents
container, while still being confined by SELinux. Thus you can have
nested SELinux labeling inside of a container.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-13 14:21:12 -04:00
0999991b20 add support for limiting tmpfs size for systemd-specific mnts
* add tests
* add documentation for --shm-size-systemd
* add support for both pod and standalone run

Signed-off-by: danishprakash <danish.prakash@suse.com>
2023-02-14 14:56:09 +05:30
0bc3d35791 libpod: move NetNS into state db instead of extra bucket
This should simplify the db logic. We no longer need a extra db bucket
for the netns, it is still supported in read only mode for backwards
compat. The old version required us to always open the netns before we
could attach it to the container state struct which caused problem in
some cases were the netns was no longer valid.

Now we use the netns as string throughout the code, this allow us to
only open it when needed reducing possible errors.

[NO NEW TESTS NEEDED] Existing tests should cover it and it is only a
flake so hard to reproduce the error.

Fixes #16140

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-12-16 18:30:12 +01:00
51c376c8a1 libpod: Factor out the call to PidFdOpen from (*Container).WaitForExit
This allows us to add a simple stub for FreeBSD which returns -1,
leading WaitForExit to fall back to the sleep loop approach.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-14 13:24:32 +01:00
36cfd05a7d libpod: Move platform-specific bind mounts to a per-platform method
This adds a new per-platform method makePlatformBindMounts and moves the
/etc/hostname mount. This file is only needed on Linux.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-12 16:11:25 +01:00
2c63b8439b Fix stutters
Podman adds an Error: to every error message.  So starting an error
message with "error" ends up being reported to the user as

Error: error ...

This patch removes the stutter.

Also ioutil.ReadFile errors report the Path, so wrapping the err message
with the path causes a stutter.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-09-10 07:52:00 -04:00
f75c3181bf podman: skip /sys/fs/cgroup/systemd if not present
skip adding the /sys/fs/cgroup/systemd bind mount if it is not already
present on the host.

[NO NEW TESTS NEEDED] requires a system without systemd.

Closes: https://github.com/containers/podman/issues/15647

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-09-07 15:33:08 +02:00
a3aecf0f26 libpod: Factor out setting volume atime to container_internal_linux.go
It turns out that field names in syscall.Stat_t are platform-specific.
An alternative to this could change fixVolumePermissions to use
unix.Lstat since unix.Stat_t uses the same mmember name for Atim on both
Linux and FreeBSD.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:20:50 +01:00
7a1abd03c5 libpod: Move miscellaneous file handlling to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:20:50 +01:00
212b11c34c libpod: Factor out handling of slirp4netns and net=none
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:20:50 +01:00
eab4291d99 libpod: Move functions related to /etc bind mounts to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:20:50 +01:00
b3989be768 libpod: Move getRootNetNsDepCtr to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:50 +01:00
7518a9136a libpod: Move functions related to checkpoints to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
be5d1261b4 libpod: Move mountNotifySocket to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
71e2074e83 libpod: Move getUserOverrides, lookupHostUser to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
232eea5a00 libpod: Move isWorkDirSymlink, resolveWorkDir to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
0889215d83 libpod: Use platform-specific mount type for volume mounts
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
c1a86a8c4c libpod: Factor out platform-specific sections from generateSpec
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
e101f4350b libpod: Move getOverlayUpperAndWorkDir and generateSpec to container_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-05 10:17:49 +01:00
d82a41687e Add container GID to additional groups
Mitigates a potential permissions issue. Mirrors Buildah PR #4200
and CRI-O PR #6159.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-09-02 15:51:36 -04:00
1572420c3f libpod: Move uses of unix.O_PATH to container_internal_linux.go
The O_PATH flag is a recent addition to the open syscall and is not
present in darwin or in FreeBSD releases before 13.1. The constant is
not present in the FreeBSD version of x/sys/unix since that package
supports FreeBSD 12.3 and later.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-08-17 12:55:41 +01:00
5d7778411a libpod: Move rootless network setup details to container_internal_linux.go
This removes a use of state.NetNS which is a linux-specific field defined
in container_linux.go from the generic container_internal.go, allowing
that to build on non-linux platforms.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-08-17 12:55:32 +01:00
92bbae40de Merge pull request #15248 from vrothberg/RUN-1606
kube play: sd-notify integration
2022-08-11 15:44:55 +00:00
3fc126e152 libpod: allow the notify socket to be passed programatically
The notify socket can now either be specified via an environment
variable or programatically (where the env is ignored).  The
notify mode and the socket are now also displayed in `container inspect`
which comes in handy for debugging and allows for propper testing.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-08-10 21:10:17 +02:00
658960c97b build(deps) bump CDI dependency from 0.4.0 to 0.5.0
bump github.com/container-orchestrated-devices/container-device-interface from 0.4.0 to 0.5.0

This requires that the cdi.Registry be instantiated with AutoRefresh disabled for CLI clients.

[NO NEW TESTS NEEDED]

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2022-08-10 10:49:42 +02:00
dd2b794061 libpod: create /etc/passwd if missing
create the /etc/passwd and /etc/group files if they are missing in the
image.

Closes: https://github.com/containers/podman/issues/14966

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-07-21 17:58:16 +02:00
377057b400 [CI:DOCS] Improve language. Fix spelling and typos.
* Correct spelling and typos.

* Improve language.

Co-authored-by: Ed Santiago <santiago@redhat.com>
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2022-07-11 21:59:32 +02:00