Use the new rootlessnetns logic from c/common, drop the podman code
here and make use of the new much simpler API.
ref: https://github.com/containers/common/pull/1761
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Right now, we always use a private UTS namespace on FreeBSD. This should
be made optional but implementing that cleanly needs a FreeBSD extension
to the OCI runtime config. The process for that is starting
(https://github.com/opencontainers/tob/pull/133) but in the meantime,
assume that the UTS namespace is private on FreeBSD.
This moves the Linux-specific namespace logic to
container_internal_linux.go and adds a FreeBSD stub.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
This solves `--security-opt unmask=ALL` still masking the path.
[NO NEW TESTS NEEDED] Can't easily test this as we do not have
access to it in CI.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
I don't really like this solution because it can't be undone by
`--security-opt unmask=all` but I don't see another way to make
this retroactive. We can potentially change things up to do this
the right way with 5.0 (actually have it in the list of masked
paths, as opposed to adding at spec finalization as now).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Linux
containers running on FreeBSD hosts continue to use /run for compatibility.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
This file has not been present in BSD systems since 2.9.1 BSD and as far
as I remember /proc/mounts has never existed on BSD systems.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
Most of the code moved there, so use it from there and remove it here.
Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.
[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add a function to securely mount a subpath inside a volume. We cannot
trust that the subpath is safe since it is beneath a volume that could
be controlled by a separate container. To avoid TOCTOU races between
when we check the subpath and when the OCI runtime mounts it, we open
the subpath, validate it, bind mount to a temporary directory and use
it instead of the original path.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
On cgroup v1 we need to mount only the systemd named hierarchy as
writeable, so we configure the OCI runtime to mount /sys/fs/cgroup as
read-only and on top of that bind mount /sys/fs/cgroup/systemd.
But when we use a private cgroupns, we cannot do that since we don't
know the final cgroup path.
Also, do not override the mount if there is already one for
/sys/fs/cgroup/systemd.
Closes: https://github.com/containers/podman/issues/17727
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Currently Podman prevents SELinux container separation when running
within a container. This PR adds a new option:
--security-opt label=nested
When this option is set, Podman unmasks and mounts
/sys/fs/selinux into the container, making /sys/fs/selinux
fully exposed. Secondly, Podman sets the attribute
run.oci.mount_context_type=rootcontext
This attribute tells crun to mount volumes with rootcontext=MOUNTLABEL
as opposed to context=MOUNTLABEL.
With these two settings, Podman inside the container is allowed to set
its own SELinux labels on tmpfs file systems mounted into its parent
container, while still being confined by SELinux. Thus you can have
nested SELinux labeling inside of a container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* add tests
* add documentation for --shm-size-systemd
* add support for both pod and standalone run
Signed-off-by: danishprakash <danish.prakash@suse.com>
This should simplify the db logic. We no longer need an extra db bucket
for the netns; it is still supported in read-only mode for backwards
compatibility. The old version required us to always open the netns
before we could attach it to the container state struct, which caused
problems in some cases where the netns was no longer valid.
Now we use the netns as a string throughout the code. This allows us to
open it only when needed, reducing possible errors.
[NO NEW TESTS NEEDED] Existing tests should cover it, and since it is only
a flake the error is hard to reproduce.
Fixes #16140
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This allows us to add a simple stub for FreeBSD which returns -1,
leading WaitForExit to fall back to the sleep loop approach.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
This adds a new per-platform method makePlatformBindMounts and moves the
/etc/hostname mount. This file is only needed on Linux.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
Podman adds an Error: to every error message. So starting an error
message with "error" ends up being reported to the user as
Error: error ...
This patch removes the stutter.
Also ioutil.ReadFile errors report the Path, so wrapping the err message
with the path causes a stutter.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Skip adding the /sys/fs/cgroup/systemd bind mount if it is not already
present on the host.
[NO NEW TESTS NEEDED] requires a system without systemd.
Closes: https://github.com/containers/podman/issues/15647
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
It turns out that field names in syscall.Stat_t are platform-specific.
An alternative would be to change fixVolumePermissions to use
unix.Lstat, since unix.Stat_t uses the same member name for Atim on both
Linux and FreeBSD.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
The O_PATH flag is a recent addition to the open syscall and is not
present in darwin or in FreeBSD releases before 13.1. The constant is
not present in the FreeBSD version of x/sys/unix since that package
supports FreeBSD 12.3 and later.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
This removes a use of state.NetNS, which is a Linux-specific field defined
in container_linux.go, from the generic container_internal.go, allowing
that to build on non-Linux platforms.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
The notify socket can now either be specified via an environment
variable or programmatically (where the env is ignored). The
notify mode and the socket are now also displayed in `container inspect`,
which comes in handy for debugging and allows for proper testing.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
bump github.com/container-orchestrated-devices/container-device-interface from 0.4.0 to 0.5.0
This requires that the cdi.Registry be instantiated with AutoRefresh disabled for CLI clients.
[NO NEW TESTS NEEDED]
Signed-off-by: Evan Lezar <elezar@nvidia.com>
* Correct spelling and typos.
* Improve language.
Co-authored-by: Ed Santiago <santiago@redhat.com>
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.
[NO NEW TESTS NEEDED]
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Currently, setting any sort of resource limit in a pod does nothing. With
the newly refactored creation process in c/common, podman can now set
resources at a pod level, meaning that resource-related flags can now be
exposed to podman pod create.
cgroupfs and systemd are both supported, with varying completeness.
cgroupfs is a much simpler process and one that is virtually complete for
all resource types; the flags now just need to be added. systemd, on the
other hand, has to be handled via the dbus API, meaning that the limits
need to be passed as recognized properties to systemd. The properties
added so far are the ones that podman pod create supports, as well as
`cpuset-mems`, as this will be the next flag I work on.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
* Replace "setup", "lookup", "cleanup", "backup" with
"set up", "look up", "clean up", "back up"
when used as verbs. Replace also variations of those.
* Improve language in a few places.
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>