177 Commits

Author SHA1 Message Date
ffce869daa Revert "Exec: use ErrorConmonRead"
This reverts commit d3d97a25e8c87cf741b2e24ac01ef84962137106.

This does not resolve the issues we expected it would, and has
some unexpected side effects with the upcoming exec rework.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:40 -04:00
d3d97a25e8 Exec: use ErrorConmonRead
Before, we were using -1 as a bogus value in podman to signify something went wrong when reading from a conmon pipe. However, conmon uses negative values to indicate the runtime failed, and return the runtime's exit code.

instead, we should use a bogus value that is actually bogus. Define that value in the define package as MinInt32 (-1<< 31 - 1), which is outside of the range of possible pids (-1 << 31)

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:43:31 -05:00
4b72f9e401 exec: get the exit code from sync pipe instead of file
Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there

Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:35:35 -05:00
b41c864d56 Ensure that exec sessions inherit supplemental groups
This corrects a regression from Podman 1.4.x where container exec
sessions inherited supplemental groups from the container, iff
the exec session did not specify a user.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-28 11:32:56 -05:00
170fd7b038 rootless: fix a regression when using -d
when using -d and port mapping, make sure the correct fd is injected
into conmon.

Move the pipe creation earlier as the fd must be known at the time we
create the container through conmon.

Closes: https://github.com/containers/libpod/issues/5167

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-02-18 15:33:38 +01:00
9f146b1b54 Merge pull request #4861 from giuseppe/add-cgroups-disabled-conmon
oci_conmon: do not create a cgroup under systemd
2020-01-22 17:00:48 +01:00
1951ff168a oci_conmon: do not create a cgroup under systemd
Detect whether we are running under systemd (if the INVOCATION_ID is
set).  If Podman is running under a systemd service, we do not need to
create a cgroup for conmon.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-16 20:29:28 +01:00
ac47e80b07 Add an API for Attach over HTTP API
The new APIv2 branch provides an HTTP-based remote API to Podman.
The requirements of this are, unfortunately, incompatible with
the existing Attach API. For non-terminal attach, we need append
a header to what was copied from the container, to multiplex
STDOUT and STDERR; to do this with the old API, we'd need to copy
into an intermediate buffer first, to handle the headers.

To avoid this, provide a new API to handle all aspects of
terminal and non-terminal attach, including closing the hijacked
HTTP connection. This might be a bit too specific, but for now,
it seems to be the simplest approach.

At the same time, add a Resize endpoint. This needs to be a
separate endpoint, so our existing channel approach does not work
here.

I wanted to rework the rest of attach at the same time (some
parts of it, particularly how we start the Attach session and how
we do resizing, are (in my opinion) handled much better here.
That may still be on the table, but I wanted to avoid breaking
existing APIs in this already massive change.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-16 13:49:21 -05:00
ba0a6f34e3 podman: add new option --cgroups=no-conmon
it allows to disable cgroups creation only for the conmon process.

A new cgroup is created for the container payload.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-16 18:56:51 +01:00
68185048cf oci_conmon: not make accessible dirs if not needed
do not change the permissions mask for the rundir and the tmpdir when
running a container with a user namespace and the current user is
mapped inside the user namespace.

The change was introduced with
849548ffb8e958e901317eceffdcc2d918cafd8d, that dropped the
intermediate mount namespace in favor of allowing root into the user
namespace to access these directories.

Closes: https://github.com/containers/libpod/issues/4846

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-14 14:45:14 +01:00
67165b7675 make lint: enable gocritic
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-13 14:27:02 +01:00
0e9c208d3f Merge pull request #4805 from giuseppe/log-tag
log: support --log-opt tag=
2020-01-10 23:55:45 +01:00
71341a1948 log: support --log-opt tag=
support a custom tag to add to each log for the container.

It is currently supported only by the journald backend.

Closes: https://github.com/containers/libpod/issues/3653

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-10 10:35:19 +01:00
154b5ca85d Merge pull request #4818 from haircommander/piped-exec-fix
exec: fix pipes
2020-01-09 14:01:42 +01:00
776eb64ab2 exec: fix pipes
In a largely anticlimatic solution to the saga of piped input from conmon, we come to this solution.

When we pass the Stdin stream to the exec.Command structure, it's immediately consumed and lost, instead of being consumed through CopyDetachable().

When we don't pass -i in, conmon is not told to create a masterfd_stdin, and won't pass anything to the container.

With both, we can do

echo hi | podman exec -til cat

and get the expected hi

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-01-08 11:09:08 -05:00
8ae672632b fix lint: correct func identifier in comment
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-08 13:52:52 +01:00
da7595a69f rootless: use RootlessKit port forwarder
RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder:

* Very high throughput.
  Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps
  (https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377)

* Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace.
  No UDP issue (#4586)

* No tcp_rmem issue (#4537)

* Probably works with IPv6. Even if not, it is trivial to support IPv6.  (#4311)

* Easily extensible for future support of SCTP

* Easily extensible for future support of `lxc-user-nic` SUID network

RootlessKit port forwarder has been already adopted as the default port forwarder by Rootless Docker/Moby,
and no issue has been reported AFAIK.

As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman.

Fix #4586
May-fix #4559
Fix #4537
May-fix #4311

See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-08 19:35:17 +09:00
bd44fd5c81 Reap exec sessions on cleanup and removal
We currently rely on exec sessions being removed from the state
by the Exec() API itself, on detecting the session stopping. This
is not a reliable method, though. The Podman frontend for exec
could be killed before the session ended, or another Podman
process could be holding the lock and prevent update (most
notable in `run --rm`, when a container with an active exec
session is stopped).

To resolve this, add a function to reap active exec sessions from
the state, and use it on cleanup (to clear sessions after the
container stops) and remove (to do the same when --rm is passed).
This is a bit more complicated than it ought to be because Kata
and company exist, and we can't guarantee the exec session has a
PID on the host, so we have to plumb this through to the OCI
runtime.

Fixes #4666

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-12 16:35:37 -05:00
3e2d9f8662 Merge pull request #4352 from vrothberg/config-package
refactor libpod config into libpod/config
2019-10-31 19:21:46 +01:00
11c282ab02 add libpod/config
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config.  Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.

Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-31 17:42:37 +01:00
381fa4df87 Merge pull request #4380 from giuseppe/rootless-create-cgroup-for-conmon
libpod, rootless: create cgroup for conmon
2019-10-30 21:42:47 +01:00
78e2a31943 libpod, rootless: create cgroup for conmon
always create a new cgroup for conmon also when running as rootless.
We were previously creating one only when necessary, but that behaves
differently than root containers.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-30 17:04:05 +01:00
0b9e07f7f2 Processes execed into container should match container label
Processes execed into a container were not being run with the correct label.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-10-29 16:05:42 -04:00
06850ea2c0 exec: remove unused var
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-10-21 17:04:27 -04:00
cab7bfbb21 Add a MissingRuntime implementation
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).

Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-15 15:59:20 -04:00
2d2646883f change error wording when conmon fails without logs
In some cases, conmon can fail without writing logs.  Change the wording
of the error message from

	"error reading container (probably exited) json message"
to
	"container create failed (no logs from conmon)"

to have a more helpful error message that is more consistent with other
errors at that stage of execution.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-14 13:46:10 +02:00
6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00