104 Commits

Author SHA1 Message Date
a1e6603133 libpod: make use of new pasta option from c/common
pasta added a new --map-guest-addr to option that maps a to the actual
host ip. This is exactly what we need for host.containers.internal
entry. So we now make use of this option by default but still have to
keep the exclude fallback because the option is very new and some
users/distros will not have it yet.

This also fixes an issue where the --dns-forward ip were not used when
using the bridge network mode, only useful when not using aardvark-dns
as this used the proper ips there already from the rootless netns
resolv.conf file.

Fixes #19213

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-06 14:43:18 +02:00
def182d396 restore: fix missing network setup
The restore code path never called completeNetworkSetup() and this means
that hosts/resolv.conf files were not populated. This fix is simply to
call this function. There is a big catch here. Technically this is
suposed to be called after the container is created but before it is
started. There is no such thing for restore, the container runs right
away. This means that if we do the call afterwards there is a short
interval where the file is still empty. Thus I decided to call it
before which makes it not working with PostConfigureNetNS (userns) but
as this does not work anyway today so  I don't see it as problem.

Fixes #22901

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-24 18:52:02 +02:00
bf2de4177b Merge pull request #23064 from giuseppe/podman-pass-timeout-stop-to-systemd
container: pass StopTimeout to the systemd slice
2024-06-23 14:57:55 +00:00
49eb5af301 libpod: intermediate mount if UID not mapped into the userns
if the current user is not mapped into the new user namespace, use an
intermediate mount to allow the mount point to be accessible instead
of opening up all the parent directories for the mountpoint.

Closes: https://github.com/containers/podman/issues/23028

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
08a8429459 libpod: avoid chowning the rundir to root in the userns
so it is possible to remove the code to make the entire directory
world accessible.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
c81f075f43 libpod: do not chmod bind mounts
with the new mount API is available, the OCI runtime doesn't require
that each parent directory for a bind mount must be accessible.
Instead it is opened in the initial user namespace and passed down to
the container init process.

This requires that the kernel supports the new mount API and that the
OCI runtime uses it.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
7d22f04f56 container: pass KillSignal and StopTimeout to the systemd scope
so that they are honored when systemd terminates the scope.

Closes: https://issues.redhat.com/browse/RHEL-16375

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 13:46:08 +02:00
046c0e5fc2 Only stop chowning volumes once they're not empty
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount into a
dozen containers, but never add content, the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, despite the volume now not being empty.
This PR changes our logic to (mostly) match Docker's.

For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.

In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.

Fixes #22571

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2024-05-22 17:47:01 -04:00
fb2ab832a7 fix incorrect host.containers.internal entry for rootless bridge mode
We have to exclude the ips in the rootless netns as they are not the
host. Now that fix only works if there are more than one ip one the
host available, if there is only one we do not set the entry at all
which I consider better as failing to resolve this name is a much better
error for users than connecting to a wrong ip. It also matches what
--network pasta already does.

The test is bit more compilcated as I would like, however it must deal
with both cases one ip, more than one so there is no way around it I
think.

Fixes #22653

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-17 12:28:44 +02:00
693ae0ebc6 Add support for image volume subpaths
Image volumes (the `--mount type=image,...` kind, not the
`podman volume create --driver image ...` kind - it's strange
that we have two) are needed for our automount scheme, but the
request is that we mount only specific subpaths from the image
into the container. To do that, we need image volume subpath
support. Not that difficult code-wise, mostly just plumbing.

Also, add support to the CLI; not strictly necessary, but it
doesn't hurt anything and will make testing easier.

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-04-25 14:12:27 -04:00
83dbbc3a51 Replace golang.org/x/exp/slices with slices from std
Use "slices" from the standard library, this package was added in go
1.21 so we can use it now.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-04-23 11:16:40 +02:00
5656ad40b1 libpod: use fileutils.(Le|E)xists
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 09:52:14 +02:00
a40cf3195a Bump tags.cncf.io/container-device-interface to v0.7.1
This includes migrating from cdi.GetRegistry() to cdi.Configure() and
cdi.GetDefaultCache() as applicable.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-06 12:25:26 +02:00
519a66c6a9 container: do not chown to dest target with U
if the 'U' option is provided, do not chown the destination target to
the existing target in the image.

Closes: https://github.com/containers/podman/issues/22224

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-03 14:41:33 +02:00
d81319eb71 libpod: use original IDs if idmap is provided
if the volume is mounted with "idmap", there should not be any mapping
using the user namespace mappings since this is done at runtime using
the "idmap" kernel feature.

Closes: https://github.com/containers/podman/issues/22228

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-03-31 23:46:17 +02:00
dc1795b4b2 use new c/common pasta2 setup logic to fix dns
By default we just ignored any localhost reolvers, this is problematic
for anyone with more complicated dns setups, i.e. split dns with
systemd-reolved. To address this we now make use of the build in dns
proxy in pasta. As such we need to set the default nameserver ip now.

A second change is the option to exclude certain ips when generating the
host.containers.internal ip. With that we no longer set it to the same
ip as is used in the netns. The fix is not perfect as it could mean on a
system with a single ip we no longer add the entry, however given the
previous entry was incorrect anyway this seems like the better behavior.

Fixes #22044

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-03-19 12:09:31 +01:00
f59a5f1351 Fix running container from docker client with rootful in rootless podman.
This effectively fix errors like "unable to upgrade to tcp, received
409" like #19930 in the special case where podman itself is running
rootful but inside a container which itself is rootless.

[NO NEW TESTS NEEDED]

Signed-off-by: Romain Geissler <romain.geissler@amadeus.com>
2024-02-18 15:09:14 +00:00
c29fde2656 libpod: correctly map UID/GID for existing dirs
if the target mount path already exists and the container uses a user
namespace, correctly map the target UID/GID to the host values before
attempting a chown.

Closes: https://github.com/containers/podman/issues/21608

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-02-12 23:04:24 +01:00
72f1617fac Bump Go module to v5
Moving from Go module v4 to v5 prepares us for public releases.

Move done using gomove [1] as with the v3 and v4 moves.

[1] https://github.com/KSubedi/gomove

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-08 09:35:39 -05:00
fcae702205 Reuse timezone code from containers/common
Replaces: https://github.com/containers/podman/pull/21077

[NO NEW TESTS NEEDED] Existing tests should handle this.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-02-06 07:09:16 -05:00
9fb57d346f Cease using deprecated runc userlookup
Instead switch to github.com/moby/sys/user, which we already had
as an indirect dependency.

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-02 11:02:43 -05:00
83f89db6c8 Merge pull request #20961 from karuboniru/patch-1
fix checking of relative idmapped mount
2024-01-11 17:20:56 +00:00
380fa1c836 Remove redundant code in generateSpec()
Conditional expression duplicates the
code above, therefore, remove it

Found by Linux Verification Center (linuxtesting.org) with SVACE.

[NO NEW TESTS NEEDED]

Signed-off-by: Egor Makrushin <emakrushin@astralinux.ru>
2024-01-09 17:26:03 +03:00
8bdf77aa20 Refactor: replace StringInSlice with slices.Contains
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-05 16:25:56 +02:00
2a2d0b0e18 chore: delete obsolete // +build lines
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-04 11:53:38 +02:00
db68764d8b Fix Docker API compatibility with network alias (#17167)
* Add BaseHostsFile to container configuration
* Do not copy /etc/hosts file from host when creating a container using Docker API

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
2023-12-14 23:31:44 -05:00
e7eb97b84a fix checking of relative idmapped mount
Like stated in [PR for crun](https://github.com/containers/crun/pull/1372)

that HostID is what being mapped here, so we should be checking `HostID` instead of `ContainerID`. `v.ContainerID` here is the id of owner of files on filesystem, that can be totally unrelated to the uid maps.

Signed-off-by: Karuboniru <yanqiyu01@gmail.com>
2023-12-09 20:16:38 +00:00
c8f262fec9 Use idtools.SafeChown and SafeLchown everywhere
If we get an error chowning a file or directory to a UID/GID pair
for something like ENOSUP or EPERM, then we should ignore as long as the UID/GID
pair on disk is correct.

Fixes: https://github.com/containers/podman/issues/20801

[NO NEW TESTS NEEDED]

Since this is difficult to test and existing tests should be sufficient
to ensure no regression.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-11-27 20:41:56 -05:00
ddd6cdfd77 Ignore SELinux relabel on unsupported file systems
We were ignoreing relabel requests on certain unsupported
file systems and not on others, this changes to consistently
logrus.Debug ENOTSUP file systems.

Fixes: https://github.com/containers/podman/discussions/20745

Still needs some work on the Buildah side.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-11-22 09:25:38 -05:00
e40d70cecc new 'no-dereference' mount option
Add a new `no-dereference` mount option supported by crun 1.11+ to
re-create/copy a symlink if it's the source of a mount.  By default the
kernel will resolve the symlink on the host and mount the target.
As reported in #20098, there are use cases where the symlink structure
must be preserved by all means.

Fixes: #20098
Fixes: issues.redhat.com/browse/RUN-1935
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-11-21 13:17:58 +01:00
7d107b9892 Merge pull request #19879 from rhatdan/ulimits
Support passing of Ulimits as -1 to mean max
2023-11-10 10:47:43 +00:00
942bcf34b8 Update container-device-interface (CDI) to v0.6.2
This updates the container-device-interface dependency to v0.6.2 and renames the import to
tags.cncf.io/container-device-interface to make use of the new vanity URL.

[NO NEW TESTS NEEDED]

Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-04 01:12:06 +01:00
18d6bb40d5 Support passing of Ulimits as -1 to mean max
Docker allows the passing of -1 to indicate the maximum limit
allowed for the current process.

Fixes: https://github.com/containers/podman/issues/19319

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-11-01 08:46:55 -04:00
4f6a8f0d50 Merge pull request #20483 from vrothberg/RUN-1934
container.conf: support attributed string slices
2023-10-27 17:49:13 +00:00
e966c86d98 container.conf: support attributed string slices
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-10-27 12:44:33 +02:00
be7dd128ef Mask /sys/devices/virtual/powercap
I don't really like this solution because it can't be undone by
`--security-opt unmask=all` but I don't see another way to make
this retroactive. We can potentially change things up to do this
the right way with 5.0 (actually have it in the list of masked
paths, as opposed to adding at spec finalization as now).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-10-26 18:24:25 -04:00
bad25da92e libpod: add !remote tag
This should never be pulled into the remote client.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-24 12:11:34 +02:00
29273cda10 lint: fix warnings found by perfsprint
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-20 16:27:46 +02:00
9b17d6cb06 vendor: update checkpointctl to v1.1.0
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2023-09-12 08:41:02 +01:00
19bd9b33dd libpod: move oom_score_adj clamp to init
commit 8b4a79a744ac3fd176ca4dc0e3ae40f75159e090 introduced
oom_score_adj clamping when the container oom_score_adj value is lower
than the current one in a rootless environment.  Move the check to
init() time so it is performed every time the container starts and not
only when it is created.  It is more robust if the oom_score_adj value
is changed for the current user session.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-09-11 17:04:37 +02:00
702709a916 libpod: do not parse --hostuser in base 8
fix the parsing of --hostuser to treat the input in base 10.

Closes: https://github.com/containers/podman/issues/19800

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-08-31 12:34:58 +02:00
6f284dbd46 podman exec should set umask to match container
Fixes: https://github.com/containers/podman/issues/19713

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-08-24 13:20:06 -04:00
27b41f0877 libpod: use /var/run instead of /run on FreeBSD
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Running Linux
containers on FreeBSD hosts continue to use /run for compatibility.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-08-17 14:04:53 +01:00
22a8b68866 make /dev & /dev/shm read/only when --read-only --read-only-tmpfs=false
The intention of --read-only-tmpfs=fals when in --read-only mode was to
not allow any processes inside of the container to write content
anywhere, unless the caller also specified a volume or a tmpfs. Having
/dev and /dev/shm writable breaks this assumption.

Fixes: https://github.com/containers/podman/issues/12937

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-30 06:09:30 -04:00
a224ff7313 Should be checking tmpfs versus type not source
Paul found this in https://github.com/containers/podman/pull/19241

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-15 07:57:37 -04:00
f256f4f954 Use constants for mount types
Inspired by https://github.com/containers/podman/pull/19238

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-14 07:17:21 -04:00
2a9b9bb53f netavark: macvlan networks keep custom nameservers
The change to use the custom dns server in aardvark-dns caused a
regression here because macvlan networks never returned the nameservers
in netavark and it also does not make sense to do so.

Instead check here if we got any network nameservers, if not we then use
the ones from the config if set otherwise fallback to host servers.

Fixes #19169

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-12 14:07:34 +02:00
dee94ea699 bugfix: do not try to parse empty ranges
An empty range caused a panic as parseOptionIDs tried to check further
down for an @ at index 0 without taking into account that the splitted
out string could be empty.

Signed-off-by: Simon Brakhane <simon@brakhane.net>
2023-07-06 11:16:34 +02:00
39624473b0 pasta: Create /etc/hosts entries for pods using pasta networking
For pods with bridged and slirp4netns networking we create /etc/hosts
entries to make it more convenient for the containers to address each
other.  We omitted to do this for pasta networking, however.  Add the
necessary code to do this.

Closes: https://github.com/containers/podman/issues/17922

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-30 13:04:02 +10:00
614c962c23 use libnetwork/slirp4netns from c/common
Most of the code moved there so if from there and remove it here.

Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.

[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-22 11:16:13 +02:00