126 Commits

Author SHA1 Message Date
e88d8dbeae fix rootless port forwarding with network dis-/connect
The rootlessport forwarder requires a child IP to be set. This must be a
valid ip in the container network namespace. The problem is that after a
network disconnect and connect the eth0 ip changed. Therefore the
packages are dropped since the source ip does no longer exists in the
netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]

To fix this we have to recreate the ports after network connect and
disconnect. To make this work the rootlessport process exposes a socket
where podman network connect/disconnect connect to and send to new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.

Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().

Fixes #10052

[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-03 16:29:09 +02:00
d24fc6b843 Merge pull request #10939 from Luap99/rootless-cni
Fix race conditions in rootless cni setup
2021-07-15 11:11:10 -04:00
0007c98ddb Fix race conditions in rootless cni setup
There was an race condition when calling `GetRootlessCNINetNs()`. It
created the rootless cni directory before it got locked. Therefore
another process could have called cleanup and removed this directory
before it was used resulting in errors. The lockfile got moved into the
XDG_RUNTIME_DIR directory to prevent a panic when the parent dir was
removed by cleanup.

Fixes #10930
Fixes #10922

To make this even more robust `GetRootlessCNINetNs()` will now return
locked. This guarantees that we can run `Do()` after `GetRootlessCNINetNs()`
before another process could have called `Cleanup()` in between.

[NO TESTS NEEDED] CI is flaking, hopefully this will fix it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-07-15 14:33:56 +02:00
e73d482990 CNI-in-slirp4netns: fix bind-mount for /run/systemd/resolve/stub-resolv.conf
Fix issue 10929 : `[Regression in 3.2.0] CNI-in-slirp4netns DNS gets broken when running a rootful container after running a rootless container`

When /etc/resolv.conf on the host is a symlink to /run/systemd/resolve/stub-resolv.conf,
we have to mount an empty filesystem on /run/systemd/resolve in the child namespace,
so as to isolate the directory from the host mount namespace.

Otherwise our bind-mount for /run/systemd/resolve/stub-resolv.conf is unmounted
when systemd-resolved unlinks and recreates /run/systemd/resolve/stub-resolv.conf on the host.

[NO TESTS NEEDED]

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-15 17:25:09 +09:00
2c7c679584 Make rootless-cni setup more robust
The rootless cni namespace needs a valid /etc/resolv.conf file. On some
distros is a symlink to somewhere under /run. Because the kernel will
follow the symlink before mounting, it is not possible to mount a file
at exactly /etc/resolv.conf. We have to ensure that the link target will
be available in the rootless cni mount ns.

Fixes #10855

Also fixed a bug in the /var/lib/cni directory lookup logic. It used
`filepath.Base` instead of `filepath.Dir` and thus looping infinitely.

Fixes #10857

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-07-06 18:40:03 +02:00
e159eb892b Merge pull request #10754 from Luap99/sync-lock
getContainerNetworkInfo: lock netNsCtr before sync
2021-06-23 04:25:44 -04:00
a84fa194b7 getContainerNetworkInfo: lock netNsCtr before sync
`syncContainer()` requires the container to be locked, otherwise we can
end up with undefined behavior.

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-06-22 16:51:21 +02:00
e014608539 Do not use inotify for OCICNI
Podman does not need to watch the cni config directory. If a network is
not found in the cache, OCICNI will reload the networks anyway and thus
even podman system service should work as expected.
Also include a change to not mount a "new" /var by default in the
rootless cni ns, instead try to use /var/lib/cni first and then the
parent dir. This allows users to store cni configs under /var/... which
is the case for the CI compose test.

[NO TESTS NEEDED]

Fixes #10686

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-06-22 16:00:47 +02:00
44d9c453d3 Fix network connect race with docker-compose
Network connect/disconnect has to call the cni plugins when the network
namespace is already configured. This is the case for `ContainerStateRunning`
and `ContainerStateCreated`. This is important otherwise the network is
not attached to this network namespace and libpod will throw errors like
`network inspection mismatch...` This problem happened when using
`docker-compose up` in attached mode.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-06-11 16:00:12 +02:00
7ef3981abe Enable port forwarding on host
Using the gvproxy application on the host, we can now port forward from
the machine vm on the host.  It requires that 'gvproxy' be installed in
an executable location.  gvproxy can be found in the
containers/gvisor-tap-vsock github repo.

[NO TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2021-06-01 10:13:18 -05:00
62a7d4b61e Merge pull request #9972 from bblenard/issue-5651-hostname-for-container-gateway
Add host.containers.internal entry into container's etc/hosts
2021-05-17 10:45:23 -04:00
c8dfcce6db Add host.containers.internal entry into container's etc/hosts
This change adds the entry `host.containers.internal` to the `/etc/hosts`
file within a new containers filesystem. The ip address is determined by
the containers networking configuration and points to the gateway address
for the containers networking namespace.

Closes #5651

Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
2021-05-17 08:21:22 -05:00
4462113c5e podman network reload add rootless support
Allow podman network reload to be run as rootless user. While it is
unlikely that the iptable rules are flushed inside the rootless cni
namespace, it could still happen. Also fix podman network reload --all
to ignore errors when a container does not have the bridge network mode,
e.g. slirp4netns.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-05-17 10:55:02 +02:00
f99b7a314b Fix rootlesskit port forwarder with custom slirp cidr
The source ip for the rootlesskit port forwarder was hardcoded to the
standard slirp4netns ip. This is incorrect since users can change the
subnet used by slirp4netns with `--network slirp4netns:cidr=10.5.0.0/24`.
The container interface ip is always the .100 in the subnet. Only when
the rootlesskit port forwarder child ip matches the container interface
ip the port forwarding will work.

Fixes #9828

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-23 11:12:49 +02:00
0a39ad196c podman unshare: add --rootless-cni to join the ns
Add a new --rootless-cni option to podman unshare to also join the
rootless-cni network namespace. This is useful if you want to connect
to a rootless container via IP address. This is only possible from the
rootless-cni namespace and not from the host namespace. This option also
helps to debug problems in the rootless-cni namespace.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-07 15:54:12 +02:00
f230214db1 rootless cni add /usr/sbin to PATH if not present
The CNI plugins need access to iptables in $PATH. On debian /usr/sbin
is not added to $PATH for rootless users. This will break rootless
cni completely. To prevent breaking existing users add /usr/sbin to
$PATH in podman if needed.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-06 23:55:05 +02:00
973807092d Use the slrip4netns dns in the rootless cni ns
If a user only has a local dns server in the resolv.conf file the dns
resolution will fail. Instead we create a new resolv.conf which will use
the slirp4netns dns.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
6cd807e3b7 Cleanup the rootless cni namespace
Delte the network namespace and kill the slirp4netns process when it is
no longer needed.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
d7e003f362 Remove unused rootless-cni-infra container files
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
db19224b6d Only use rootless RLK when the container has ports
Do not invoke the rootlesskit port forwarder when the container has no
ports.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
294c90b05e Enable rootless network connect/disconnect
With the new rootless cni supporting network connect/disconnect is easy.
Combine common setps into extra functions to prevent code duplication.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
94e67ba9a2 Move slirp4netns functions into an extra file
This should make maintenance easier.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
00b2ec5e6f Add rootless support for cni and --uidmap
This is supported with the new rootless cni logic.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
54b588c07d rootless cni without infra container
Instead of creating an extra container create a network and mount
namespace inside the podman user namespace. This ns is used to
for rootless cni operations.
This helps to align the rootless and rootful network code path.
If we run as rootless we just have to set up a extra net ns and
initialize slirp4netns in it. The ocicni lib will be called in
that net ns.

This design allows allows easier maintenance, no extra container
with pause processes, support for rootless cni with --uidmap
and possibly more.

The biggest problem is backwards compatibility. I don't think
live migration can be possible. If the user reboots or restart
all cni containers everything should work as expected again.
The user is left with the rootless-cni-infa container and image
but this can safely be removed.

To make the existing cni configs work we need execute the cni plugins
in a extra mount namespace. This ensures that we can safely mount over
/run and /var which have to be writeable for the cni plugins without
removing access to these files by the main podman process. One caveat
is that we need to keep the netns files at `XDG_RUNTIME_DIR/netns`
accessible.

`XDG_RUNTIME_DIR/rootless-cni/{run,var}` will be mounted to `/{run,var}`.
To ensure that we keep the netns directory we bind mount this relative
to the new root location, e.g. XDG_RUNTIME_DIR/rootless-cni/run/user/1000/netns
before we mount the run directory. The run directory is mounted recursive,
this makes the netns directory at the same path accessible as before.

This also allows iptables-legacy to work because /run/xtables.lock is
now writeable.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
c5f9819dac Silence podman network reload errors with iptables-nft
Make sure we do not display the expected error when using podman network
reload. This is already done for iptables-legacy however iptables-nft
creates a slightly different error message so check for this as well.
The error is logged at info level.

[NO TESTS NEEDED] The test VMs do not use iptables-nft so there is no
way to test this. It is already tested for iptables-legacy.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-03-30 10:48:26 +02:00
aa0a57f095 Fix cni teardown errors
Make sure to pass the cni interface descriptions to cni teardowns.
Otherwise cni cannot find the correct cache files because the
interface name might not match the networks. This can only happen
when network disconnect was used.

Fixes #9602

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-03-04 11:43:59 +01:00
f152f9cf09 Network connect error if net mode is not bridge
Only the the network mode bridge supports cni networks.
Other network modes cannot use network connect/disconnect
so we should throw a error.

Fixes #9496

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-23 22:30:04 +01:00
9d818be732 Fix podman network IDs handling
The libpod network logic knows about networks IDs but OCICNI
does not. We cannot pass the network ID to OCICNI. Instead we
need to make sure we only use network names internally. This
is also important for libpod since we also only store the
network names in the state. If we would add a ID there the
same networks could accidentally be added twice.

Fixes #9451

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-22 15:51:49 +01:00
5dded6fae7 bump go module to v3
We missed bumping the go module, so let's do it now :)

* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-02-22 09:03:51 +01:00
69ab67bf90 Enable golint linter
Use the golint linter and fix the reported problems.

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-11 23:01:49 +01:00
5c6ab3075e Fix podman network disconnect wrong NetworkStatus number
The allocated `tmpNetworkStatus` must be allocated with the length 0.
Otherwise append would add new elements to the end of the slice and
not at the beginning of the allocated memory.

This caused inspect to fail since the number of networks did not
matched the number of network statuses.

Fixes #9234

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-04 19:41:30 +01:00
0959196807 Make slirp MTU configurable (network_cmd_options)
The mtu default value is currently forced to 65520.
This let the user control it using the config key network_cmd_options,
i.e.: network_cmd_options=["mtu=9000"]

Signed-off-by: bitstrings <pino.silvaggio@gmail.com>
2021-02-02 13:50:26 -05:00
02ec5299f6 Add default net info in container inspect
when inspecting a container that is only connected to the default
network, we should populate the default network in the container inspect
information.

Fixes: #6618

Signed-off-by: baude <bbaude@redhat.com>

MH: Small fixes, added another test

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2021-01-26 16:00:06 -05:00
0ba1942f26 networking: lookup child IP in networks
if a CNI network is added to the container, use the IP address in that
network instead of hard-coding the slirp4netns default.

commit 5e65f0ba30f3fca73f8c207825632afef08378c1 introduced this
regression.

Closes: https://github.com/containers/podman/issues/9065

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-23 18:28:56 +01:00
ef654941d1 libpod: move slirp magic IPs to consts
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-22 08:08:27 +01:00
5e65f0ba30 rootlessport: set source IP to slirp4netns device
set the source IP to the slirp4netns address instead of 127.0.0.1 when
using rootlesskit.

Closes: https://github.com/containers/podman/issues/5138

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-22 08:08:26 +01:00
d9ebbbfe5b Switch references of /var/run -> /run
Systemd is now complaining or mentioning /var/run as a legacy directory.
It has been many years where /var/run is a symlink to /run on all
most distributions, make the change to the default.

Partial fix for https://github.com/containers/podman/issues/8369

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-07 05:37:24 -05:00
25b7198441 The slirp4netns sandbox requires pivot_root
Disable the sandbox, when running on rootfs

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
2020-12-29 18:03:49 +01:00
4fa1fce930 Spelling
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-12-22 13:34:31 -05:00
46337b4708 Make podman stats slirp check more robust
Just checking for `rootless.IsRootless()` does not catch all the
cases where slirp4netns is in use - we actually allow it to be
used as root as well. Fortify the conditional here so we don't
fail in the root + slirp case.

Fixes #7883

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-12-08 09:59:00 -05:00
9b3a81a002 Merge pull request #8571 from Luap99/podman-network-reload
Implement pod-network-reload
2020-12-08 06:15:40 -05:00
b0286d6b43 Implement pod-network-reload
This adds a new command, 'podman network reload', to reload the
networks of existing containers, forcing recreation of firewall
rules after e.g. `firewall-cmd --reload` wipes them out.

Under the hood, this works by calling CNI to tear down the
existing network, then recreate it using identical settings. We
request that CNI preserve the old IP and MAC address in most
cases (where the container only had 1 IP/MAC), but there will be
some downtime inherent to the teardown/bring-up approach. The
architecture of CNI doesn't really make doing this without
downtime easy (or maybe even possible...).

At present, this only works for root Podman, and only locally.
I don't think there is much of a point to adding remote support
(this is very much a local debugging command), but I think adding
rootless support (to kill/recreate slirp4netns) could be
valuable.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2020-12-07 19:26:23 +01:00
d6d3af9e8e Add ability to set system wide options for slirp4netns
Wire in containers.conf options for slirp

Signed-off-by: Ashley Cui <acui@redhat.com>
2020-12-04 13:37:22 -05:00
7d43cc06dc network connect disconnect on non-running containers
a container can connect and disconnet to networks even when not in a
running state.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-30 16:10:01 -06:00
1ddb19bc8e update container status with new results
a bug was being caused by the fact that the container network results
were not being updated properly.

given that jhon is on PTO, this PR will replace #8362

Signed-off-by: baude <bbaude@redhat.com>
2020-11-23 15:20:39 -06:00
44da01f45c Refactor compat container create endpoint
* Make endpoint compatibile with docker-py network expectations
* Update specgen helper when called from compat endpoint
* Update godoc on types
* Add test for network/container create using docker-py method
* Add syslog logging when DEBUG=1 for tests

Fixes #8361

Signed-off-by: Jhon Honce <jhonce@redhat.com>
2020-11-23 15:20:39 -06:00
ce775248ad Make c.networks() list include the default network
This makes things a lot more clear - if we are actually joining a
CNI network, we are guaranteed to get a non-zero length list of
networks.

We do, however, need to know if the network we are joining is the
default network for inspecting containers as it determines how we
populate the response struct. To handle this, add a bool to
indicate that the network listed was the default network, and
only the default network.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-11-20 14:03:24 -05:00
a3e0b7d117 add network connect|disconnect compat endpoints
this enables the ability to connect and disconnect a container from a
given network. it is only for the compatibility layer. some code had to
be refactored to avoid circular imports.

additionally, tests are being deferred temporarily due to some
incompatibility/bug in either docker-py or our stack.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-19 08:16:19 -06:00
d3e794bda3 add network connect|disconnect compat endpoints
this enables the ability to connect and disconnect a container from a
given network. it is only for the compatibility layer. some code had to
be refactored to avoid circular imports.

additionally, tests are being deferred temporarily due to some
incompatibility/bug in either docker-py or our stack.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-17 14:22:39 -06:00
a65ecc70c2 Merge pull request #8304 from rhatdan/error
Cleanup error reporting
2020-11-12 22:33:25 +01:00