uid/gid mapping flags

Motivation
===========

This feature aims to make --uidmap and --gidmap easier to use, especially in rootless podman setups.

(I will focus here on the --gidmap option, although the same applies to --uidmap.)

In rootless podman, the user namespace mapping happens in two steps, through an intermediate mapping.

See https://docs.podman.io/en/latest/markdown/podman-run.1.html#uidmap-container-uid-from-uid-amount
for further details. Here is a summary:

First, the user's GID is mapped to 0 (root), and all subordinate GIDs (defined in /etc/subgid, and
usually >100000) are mapped starting at 1.

One way to customize the mapping is through the `--gidmap` option, which maps that intermediate mapping
to the final mapping that will be seen by the container.

As an example, let's say our main GID is 1000 and we also belong to the additional group with GID 2000,
which we want to make accessible inside the container.

We first ask the sysadmin to subordinate the group to us, by adding "$user:2000:1" to /etc/subgid.

Then we need to use --gidmap to specify that we want to map GID 2000 into some GID inside the container.

Here is the first problem:

Since the --gidmap option operates on the intermediate mapping, we first need to figure out where
podman has placed our GID 2000 in that intermediate mapping, using:

    podman unshare cat /proc/self/gid_map

Then, we may see that GID 2000 was mapped to intermediate GID 5. So our --gidmap option should include:

    --gidmap 20000:5:1

This intermediate mapping may change in the future if further groups are subordinated to us (or if a
group stops being subordinated to us), so we are forced to check the mapping with
`podman unshare cat /proc/self/gid_map` every time, and to parse its output if we want to script it.
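
A minimal shell sketch of that lookup follows. It assumes the three-column layout of `/proc/self/gid_map` (ID inside the namespace, ID outside it, range length). The function name `lookup_intermediate_id` and the sample table are made up for illustration; real output depends on the host's /etc/subgid.

```shell
# Print the intermediate ID for host ID $1.
# Reads a gid_map (or uid_map) table on stdin with the columns:
#   intermediate_start host_start length
# Exits non-zero if the host ID is not covered by any range.
lookup_intermediate_id() {
    awk -v host="$1" '
        host >= $2 && host < $2 + $3 { print $1 + (host - $2); found = 1; exit }
        END { exit !found }
    '
}

# Hypothetical table in which host GID 2000 sits at intermediate GID 5.
sample_map='0 1000 1
1 100000 4
5 2000 1
6 100004 15000'

printf '%s\n' "$sample_map" | lookup_intermediate_id 2000   # prints 5
```

In a real session the table would come from podman itself: `podman unshare cat /proc/self/gid_map | lookup_intermediate_id 2000`.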

**The first usability improvement** we agreed on in #18333 is to be able to use:

    --gidmap 20000:@2000:1

so podman does this lookup in the parent user namespace for us.

But this is only part of the problem. We must specify a **full** gidmap and not only what we want:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1

This is becoming complicated. We had to break the gidmap at 5, because the intermediate 5 had to
be mapped to another value (20000), and then we had to keep mapping all other subordinate ids... up to
close to the maximum number of subordinate ids that we have (or some reasonable value). This is hard
to explain to someone who does not understand how the mappings work internally.
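
To make the bookkeeping concrete, here is a small sketch that assembles such a full gidmap by hand. The helper name `build_gidmap` and its parameters are hypothetical, and it assumes that the chosen container GID lies outside the range filled with the remaining intermediate IDs.

```shell
# build_gidmap CONTAINER_GID INTERMEDIATE_GID REST
# Maps INTERMEDIATE_GID to CONTAINER_GID and fills container GIDs from 0
# upwards with the other intermediate IDs: those below INTERMEDIATE_GID first,
# then REST of the IDs that come after it.
# Assumption: CONTAINER_GID does not fall inside the filled range.
build_gidmap() {
    container_gid=$1 intermediate_gid=$2 rest=$3
    head=""
    if [ "$intermediate_gid" -gt 0 ]; then
        head="--gidmap 0:0:${intermediate_gid} "
    fi
    printf '%s--gidmap %s:%s:%s --gidmap %s:%s:1\n' "$head" \
        "$intermediate_gid" "$((intermediate_gid + 1))" "$rest" \
        "$container_gid" "$intermediate_gid"
}

build_gidmap 20000 5 15000
# prints: --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1
```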

To simplify this, **the second usability improvement** is to be able to use:

    --gidmap "+20000:@2000:1"

where the plus flag (`+`) states that the given mapping should extend any previous/default mapping,
overriding any previous conflicting assignment.

Podman will set that mapping and fill the rest of mapped gids with all other subordinated gids, leading
to the same (or an equivalent) full gidmap that we were specifying before.

One final usability improvement related to this is the following:

By default, when podman gets a --gidmap argument but not a --uidmap argument, it copies the gidmap to the uidmap.
This is convenient in many scenarios, since usually subordinated uids and gids are assigned in chunks
simultaneously, and the subordinated IDs in /etc/subuid and /etc/subgid for a given user match.

For scenarios with additional subordinated GIDs, this map copying is annoying, since it forces the user
to provide a --uidmap to prevent the copy from being made. This means that when the user wants:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1

The user has to include a uidmap as well:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1 --uidmap 0:0:65000

making everything even harder to understand without proper context.

For this reason, besides the "+" flag, we introduce the "u" and "g" flags. When applied to a mapping,
these flags tell podman that the mapping should apply only to users or only to groups, respectively,
and be ignored otherwise.

Therefore we can use:

    --gidmap "+g20000:@2000:1"

This way the mapping only applies to groups and is ignored for uidmaps. If neither the "u" nor the "g"
flag is given, podman assumes the mapping applies to both users and groups, as before, so we preserve
backwards compatibility.
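
The syntax described above can be illustrated with a short sketch that splits a mapping into its flags and fields. This is not podman's parser: the function name `parse_mapping` and its output format are invented for the example, flags are simply taken to be the characters before the first digit, and a missing amount defaults to 1 as the documentation in this commit states.

```shell
# Split a mapping such as "+g20000:@2000:1" into its parts.
# Illustrative only: this is not podman's parser.
parse_mapping() {
    spec=$1
    flags=${spec%%[0-9]*}     # everything before the first digit, e.g. "+g"
    rest=${spec#"$flags"}     # e.g. "20000:@2000:1"

    case $flags in *+*) extend=yes ;; *) extend=no ;; esac
    case $flags in
        *u*g*|*g*u*) echo "combining u and g is not covered by this sketch" >&2; return 1 ;;
        *u*) applies_to=uid ;;
        *g*) applies_to=gid ;;
        *)   applies_to=both ;;
    esac

    old_ifs=$IFS; IFS=:
    set -- $rest              # word-split on ":" into $1 $2 $3
    IFS=$old_ifs

    printf 'extend=%s applies_to=%s container=%s from=%s amount=%s\n' \
        "$extend" "$applies_to" "$1" "$2" "${3:-1}"
}

parse_mapping '+g20000:@2000:1'
# prints: extend=yes applies_to=gid container=20000 from=@2000 amount=1
parse_mapping '20000:5'
# prints: extend=no applies_to=both container=20000 from=5 amount=1
```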

Co-authored-by: Tom Sweeney <tsweeney@redhat.com>
Signed-off-by: Sergio Oller <sergioller@gmail.com>
This commit is contained in:
Sergio Oller
2023-05-14 10:44:44 +02:00
parent 18c2a2be87
commit 91b8bc7f13
7 changed files with 1157 additions and 22 deletions

@@ -2,11 +2,11 @@
####> podman create, run
####> If file is edited, make sure the changes
####> are applicable to all of those.
#### **--gidmap**=*container_gid:host_gid:amount*
#### **--gidmap**=*[flags]container_gid:from_gid[:amount]*
Run the container in a new user namespace using the supplied GID mapping. This
option conflicts with the **--userns** and **--subgidname** options. This
option provides a way to map host GIDs to container GIDs in the same way as
__--uidmap__ maps host UIDs to container UIDs. For details see __--uidmap__.
Note: the **--gidmap** flag cannot be called in conjunction with the **--pod** flag as a gidmap cannot be set on the container level when in a pod.
Note: the **--gidmap** option cannot be called in conjunction with the **--pod** option as a gidmap cannot be set on the container level when in a pod.

@@ -2,16 +2,23 @@
####> podman create, run
####> If file is edited, make sure the changes
####> are applicable to all of those.
#### **--uidmap**=*container_uid:from_uid:amount*
#### **--uidmap**=*[flags]container_uid:from_uid[:amount]*
Run the container in a new user namespace using the supplied UID mapping. This
option conflicts with the **--userns** and **--subuidname** options. This
option provides a way to map host UIDs to container UIDs. It can be passed
several times to map different ranges.
The possible values of the optional *flags* are discussed further down on this page.
The *amount* value is optional and assumed to be **1** if not given.
The *from_uid* value is based upon the user running the command, either rootful or rootless users.
* rootful user: *container_uid*:*host_uid*:*amount*
* rootless user: *container_uid*:*intermediate_uid*:*amount*
* rootful user: [*flags*]*container_uid*:*host_uid*[:*amount*]
* rootless user: [*flags*]*container_uid*:*intermediate_uid*[:*amount*]
`Rootful mappings`
When **podman <<subcommand>>** is called by a privileged user, the option **--uidmap**
works as a direct mapping between host UIDs and container UIDs.
@@ -28,6 +35,8 @@ If for example _amount_ is **4** the mapping looks like:
| *from_uid* + 2 | *container_uid* + 2 |
| *from_uid* + 3 | *container_uid* + 3 |
`Rootless mappings`
When **podman <<subcommand>>** is called by an unprivileged user (i.e. running rootless),
the value *from_uid* is interpreted as an "intermediate UID". In the rootless
case, host UIDs are not mapped directly to container UIDs. Instead the mapping
@@ -76,13 +85,112 @@ Every additional range is added sequentially afterward:
| 1 | $FIRST_RANGE_ID | $FIRST_RANGE_LENGTH |
| 1+$FIRST_RANGE_LENGTH | $SECOND_RANGE_ID | $SECOND_RANGE_LENGTH|
By default, providing either **--uidmap** or **--gidmap** replaces the
whole mapping. If only one of those two options is given, the other one is
copied by default. If only one value of the two needs to be changed,
both values should be provided.
`Referencing a host ID from the parent namespace`
As a rootless user, the given host ID in **--uidmap** or **--gidmap**
is mapped from the *intermediate namespace* generated by Podman. Sometimes
it is desirable to refer directly to the *host namespace*. It is possible
to do so manually, by running `podman unshare cat /proc/self/gid_map`,
finding the desired host id in the second column of the output, and getting
the corresponding intermediate id from the first column.
Podman can perform this lookup itself when the host id in the mapping
is preceded by the `@` symbol. For instance, by specifying `--gidmap 100000:@2000:1`,
podman will look up the intermediate id corresponding to host id `2000` and
it will map the found intermediate id to the container id `100000`. The
given host id must have been subordinated (otherwise it would not be mapped
into the intermediate space in the first place).
If the length is greater than one, for instance with `--gidmap 100000:@2000:2`,
Podman will map host ids `2000` and `2001` to `100000` and `100001`, respectively,
regardless of how the intermediate mapping is defined.
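
The following sketch shows one way such a range could be resolved: each host id is looked up on its own, so the result does not depend on the ids being adjacent in the intermediate mapping. It is an illustration under assumptions, not Podman's implementation; the function name `resolve_host_range` and the sample table are hypothetical, and the table uses the three-column layout of `/proc/self/gid_map`.

```shell
# resolve_host_range CONTAINER_ID HOST_ID AMOUNT
# Reads a gid_map table on stdin (intermediate_start host_start length) and
# prints one "container:intermediate:1" mapping per host id in the range.
resolve_host_range() {
    awk -v cid="$1" -v host="$2" -v amount="$3" '
        { start[NR] = $1 + 0; hstart[NR] = $2 + 0; len[NR] = $3 + 0 }
        END {
            for (k = 0; k < amount; k++) {
                h = host + k; found = 0
                for (r = 1; r <= NR; r++) {
                    if (h >= hstart[r] && h < hstart[r] + len[r]) {
                        printf "%d:%d:1\n", cid + k, start[r] + (h - hstart[r])
                        found = 1
                        break
                    }
                }
                if (!found) {
                    printf "host id %d is not subordinated\n", h > "/dev/stderr"
                    exit 1
                }
            }
        }
    '
}

# Hypothetical table: host ids 2000 and 2001 are far apart in the
# intermediate namespace (5 and 106).
sample_map='0 1000 1
1 100000 4
5 2000 1
6 100004 100
106 2001 1'

printf '%s\n' "$sample_map" | resolve_host_range 100000 2000 2
# prints:
#   100000:5:1
#   100001:106:1
```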
`Extending previous mappings`
Some mapping modifications may be cumbersome. For instance, a user
starts with a mapping such as `--gidmap="0:0:65000"` that needs to be
changed so that the parent id `1` is mapped to container id `100000`
instead, leaving container id `1` unassigned. The corresponding `--gidmap`
becomes `--gidmap="0:0:1" --gidmap="2:2:64998" --gidmap="100000:1:1"`.
This notation can be simplified using the `+` flag, which takes care of
breaking up previous mappings and removing any assignment that conflicts
with the given mapping. The flag is given before the container id
as follows: `--gidmap="0:0:65000" --gidmap="+100000:1:1"`
Flag | Example | Description
-----------|---------------|-------------
`+` | `+100000:1:1` | Extend the previous mapping
This notation leads to gaps in the assignment, so it may be convenient to
fill those gaps afterwards: `--gidmap="0:0:65000" --gidmap="+100000:1:1" --gidmap="1:65001:1"`
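
As a rough sketch of what "breaking previous mappings" means, the function below removes from each previous range every id that conflicts with the new mapping, either on the container side or on the parent side, and then appends the new mapping. The name `extend_mapping` is hypothetical and the code only illustrates the idea; it makes no claim about how Podman implements the `+` flag.

```shell
# extend_mapping NEW_MAPPING
# NEW_MAPPING and the previous mappings (one per line on stdin) use the
# form container:parent:amount, without flags.
extend_mapping() {
    awk -F: -v new="$1" '
        BEGIN { split(new, m, ":"); C = m[1] + 0; P = m[2] + 0; N = m[3] + 0 }
        {
            c = $1 + 0; p = $2 + 0; n = $3 + 0; run = -1
            # Walk the offsets of the previous range and emit every maximal
            # run of offsets that does not conflict with the new mapping.
            for (o = 0; o <= n; o++) {
                conflict = (o == n) ||
                           (c + o >= C && c + o < C + N) ||
                           (p + o >= P && p + o < P + N)
                if (!conflict && run < 0) run = o
                if (conflict && run >= 0) {
                    printf "%d:%d:%d\n", c + run, p + run, o - run
                    run = -1
                }
            }
        }
        END { printf "%d:%d:%d\n", C, P, N }
    '
}

printf '0:0:65000\n' | extend_mapping 100000:1:1
# prints:
#   0:0:1
#   2:2:64998
#   100000:1:1
```

Starting from a range of 65000 ids, taking parent id `1` out leaves one id before the gap and 64998 ids after it.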
One specific use case for this flag is in the context of rootless
users. A rootless user may specify mappings with the `+` flag as
in `--gidmap="+100000:1:1"`. Podman will then "fill the gaps" starting
from zero with all the remaining intermediate ids. This is convenient when
a user wants to map a specific intermediate id to a container id, leaving
the rest of subordinate ids to be mapped by Podman at will.
`Passing only one of --uidmap or --gidmap`
Usually, subordinated user and group ids are assigned simultaneously, and
for any user the subordinated user ids match the subordinated group ids.
For convenience, if only one of **--uidmap** or **--gidmap** is given,
podman assumes the mapping refers to both UIDs and GIDs and applies the
given mapping to both. If only one value of the two needs to be changed,
the mappings should include the `u` or the `g` flags to specify that
they only apply to UIDs or GIDs and should not be copied over.
Flag | Example | Description
---------|-----------------|-----------------
`u` | `u20000:2000:1` |The mapping only applies to UIDs
`g` | `g10000:1000:1` |The mapping only applies to GIDs
For instance, consider the command:
podman <<subcommand>> --gidmap "0:0:1000" --gidmap "g2000:2000:1"
Since no **--uidmap** is given, the **--gidmap** is copied to **--uidmap**,
giving a command equivalent to:
podman <<subcommand>> --gidmap "0:0:1000" --gidmap "2000:2000:1" --uidmap "0:0:1000"
The `--gidmap "g2000:2000:1"` used the `g` flag and therefore was
not copied to **--uidmap**.
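
The copy rule can be sketched as a small filter over the given **--gidmap** values. The function name `copied_uidmaps` is made up for this example, flags are taken to be the characters before the first digit, and the code is only an illustration of the rule described here.

```shell
# Print the --uidmap options that would be derived from the given --gidmap
# values when no --uidmap is passed: mappings carrying the g flag are skipped.
copied_uidmaps() {
    for spec in "$@"; do
        flags=${spec%%[0-9]*}
        case $flags in
            *g*) ;;   # group-only mapping: not copied
            *)   printf '%s\n' "--uidmap \"$spec\"" ;;
        esac
    done
}

copied_uidmaps "0:0:1000" "g2000:2000:1"
# prints: --uidmap "0:0:1000"
```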
`Rootless mapping of additional host GIDs`
A rootless user may desire to map a specific host group that has already been
subordinated within _/etc/subgid_ without specifying the rest of the mapping.
This can be done with **--gidmap "+g*container_gid*:@*host_gid*"**
Where:
- The host GID is given through the `@` symbol
- The mapping of this GID is not copied over to **--uidmap** thanks to the `g` flag.
- The rest of the container IDs will be mapped starting from 0 to n,
with all the remaining subordinated GIDs, thanks to the `+` flag.
For instance, if a user belongs to the group `2000` and that group is
subordinated to that user (with `usermod --add-subgids 2000-2000 $USER`),
the user can map the group into the container with: **--gidmap=+g100000:@2000**.
If this mapping is combined with the option **--group-add=keep-groups**, the
process in the container will belong to group `100000`, and files belonging
to group `2000` in the host will appear as being owned by group `100000`
inside the container.
podman run --group-add=keep-groups --gidmap="+g100000:@2000" ...
`No subordinate UIDs`
Even if a user does not have any subordinate UIDs in _/etc/subuid_,
**--uidmap** can be used to map the normal UID of the user to a
container UID by running `podman <<subcommand>> --uidmap $container_uid:0:1 --user $container_uid ...`.
Note: the **--uidmap** flag cannot be called in conjunction with the **--pod** flag as a uidmap cannot be set on the container level when in a pod.
`Pods`
The **--uidmap** option cannot be called in conjunction with the **--pod** option as a uidmap cannot be set on the container level when in a pod.