251 Commits

Author SHA1 Message Date
118e78c5d6 Add structure for new exec session tracking to DB
As part of the rework of exec sessions, we need to address them
independently of containers. In the new API, we need to be able
to fetch them by their ID, regardless of what container they are
associated with. Unfortunately, our existing exec sessions are
tied to individual containers; there's no way to tell what
container a session belongs to and retrieve it without getting
every exec session for every container.

This adds a pointer to the container an exec session is
associated with to the database. The sessions themselves are
still stored in the container.

Exec-related APIs have been restructured to work with the new
database representation. The originally monolithic API has been
split into a number of smaller calls to allow more fine-grained
control of lifecycle. Support for legacy exec sessions has been
retained, but in a deprecated fashion; we should remove this in
a few releases.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-18 11:02:14 -04:00
521ff14d83 Revert "exec: get the exit code from sync pipe instead of file"
This reverts commit 4b72f9e4013411208751df2a92ab9f322d4da5b2.

Continues what began with revert of
d3d97a25e8c87cf741b2e24ac01ef84962137106 in previous commit.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:55 -04:00
ffce869daa Revert "Exec: use ErrorConmonRead"
This reverts commit d3d97a25e8c87cf741b2e24ac01ef84962137106.

This does not resolve the issues we expected it would, and has
some unexpected side effects with the upcoming exec rework.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:40 -04:00
6be87b2186 Revert "exec: fix error code when conmon fails"
This reverts commit 4632b81c81a73025a960e339f40bc805f8a6c70a.

We are reverting #5373 as well, which lays the foundation for
this commit, so it has to go as well.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:25:56 -04:00
ac354ac94a Fix spelling mistakes in code found by codespell
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-07 10:30:44 -05:00
4632b81c81 exec: fix error code when conmon fails
this is a cosmetic change that makes sure podman returns a sane error code when conmon dies underneath it

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-04 17:10:14 -05:00
d3d97a25e8 Exec: use ErrorConmonRead
Before, we were using -1 as a bogus value in podman to signify something went wrong when reading from a conmon pipe. However, conmon uses negative values to indicate the runtime failed, and return the runtime's exit code.

instead, we should use a bogus value that is actually bogus. Define that value in the define package as MinInt32 (-1<< 31 - 1), which is outside of the range of possible pids (-1 << 31)

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:43:31 -05:00
4b72f9e401 exec: get the exit code from sync pipe instead of file
Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there

Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:35:35 -05:00
b163640c61 Allow devs to set labels in container images for default capabilities.
This patch allows users to specify the list of capabilities required
to run their container image.

Setting a image/container label "io.containers.capabilities=setuid,setgid"
tells podman that the contained image should work fine with just these two
capabilties, instead of running with the default capabilities, podman will
launch the container with just these capabilties.

If the user or image specified capabilities that are not in the default set,
the container will print an error message and will continue to run with the
default capabilities.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-02 16:37:32 -05:00
47c4ea3919 Merge pull request #5347 from baude/apiv2wait
rework apiv2 wait endpoint|binding
2020-03-02 20:23:26 +01:00
b41c864d56 Ensure that exec sessions inherit supplemental groups
This corrects a regression from Podman 1.4.x where container exec
sessions inherited supplemental groups from the container, iff
the exec session did not specify a user.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-28 11:32:56 -05:00
0904873100 rework apiv2 wait endpoint|binding
added the ability to wait on a condition (stopped, running, paused...) for a container.  if a condition is not provided, wait will default to the stopped condition which uses the original wait code paths.  if the condition is stopped, the container exit code will be returned.

also, correct a mux issue we discovered.

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-02-28 09:36:53 -06:00
156ce5cd7d add pkg/capabilities
Add pkg/capabibilities to deal with capabilities.  The code has been
copied from Docker (and attributed with the copyright) but changed
significantly to only do what we really need.  The code has also been
simplified and will perform better due to removed redundancy.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-02-14 12:00:45 +01:00
ac47e80b07 Add an API for Attach over HTTP API
The new APIv2 branch provides an HTTP-based remote API to Podman.
The requirements of this are, unfortunately, incompatible with
the existing Attach API. For non-terminal attach, we need append
a header to what was copied from the container, to multiplex
STDOUT and STDERR; to do this with the old API, we'd need to copy
into an intermediate buffer first, to handle the headers.

To avoid this, provide a new API to handle all aspects of
terminal and non-terminal attach, including closing the hijacked
HTTP connection. This might be a bit too specific, but for now,
it seems to be the simplest approach.

At the same time, add a Resize endpoint. This needs to be a
separate endpoint, so our existing channel approach does not work
here.

I wanted to rework the rest of attach at the same time (some
parts of it, particularly how we start the Attach session and how
we do resizing, are (in my opinion) handled much better here.
That may still be on the table, but I wanted to avoid breaking
existing APIs in this already massive change.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-16 13:49:21 -05:00
123b8c627d if container is not in a pid namespace, stop all processes
When a container is in a PID namespace, it is enought to send
the stop signal to the PID 1 of the namespace, only send signals
to all processes in the container when the container is not in
a pid namespace.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-12-19 13:33:17 -05:00
bd44fd5c81 Reap exec sessions on cleanup and removal
We currently rely on exec sessions being removed from the state
by the Exec() API itself, on detecting the session stopping. This
is not a reliable method, though. The Podman frontend for exec
could be killed before the session ended, or another Podman
process could be holding the lock and prevent update (most
notable in `run --rm`, when a container with an active exec
session is stopped).

To resolve this, add a function to reap active exec sessions from
the state, and use it on cleanup (to clear sessions after the
container stops) and remove (to do the same when --rm is passed).
This is a bit more complicated than it ought to be because Kata
and company exist, and we can't guarantee the exec session has a
PID on the host, so we have to plumb this through to the OCI
runtime.

Fixes #4666

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-12 16:35:37 -05:00
25cc43c376 Add ContainerStateRemoving
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.

When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.

The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.

We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This will allow Remove to be run on
these containers again, which should successfully remove storage
if it fails.

Fixes #3906

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-11-19 15:38:03 -05:00
2497b6c77b podman: add support for specifying MAC
I basically copied and adapted the statements for setting IP.

Closes #1136

Signed-off-by: Jakub Filak <jakub.filak@sap.com>
2019-11-06 16:22:19 +01:00
1df4dba0a0 Switch to bufio Reader for exec streams
There were many situations that made exec act funky with input. pipes didn't work as expected, as well as sending input before the shell opened.
Thinking about it, it seemed as though the issues were because of how os.Stdin buffers (it doesn't). Dropping this input had some weird consequences.
Instead, read from os.Stdin as bufio.Reader, allowing the input to buffer before passing it to the container.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-10-31 11:20:12 -04:00
5f8bf3d07d Add ensureState helper for checking container state
We have a lot of checks for container state scattered throughout
libpod. Many of these need to ensure the container is in one of a
given set of states so an operation may safely proceed.
Previously there was no set way of doing this, so we'd use unique
boolean logic for each one. Introduce a helper to standardize
state checks.

Note that this is only intended to replace checks for multiple
states. A simple check for one state (ContainerStateRunning, for
example) should remain a straight equality, and not use this new
helper.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-10-28 13:09:01 -04:00
cab7bfbb21 Add a MissingRuntime implementation
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).

Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-15 15:59:20 -04:00
6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00
82ac0d8925 Podman-remote run should wait for exit code
This change matches what is happening on the podman local side
and should eliminate a race condition.

Also exit commands on the server side should start to return to client.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-09-12 16:20:01 -04:00
535111b5d5 Use exit code constants
We have leaked the exit number codess all over the code, this patch
removes the numbers to constants.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-09-12 16:20:01 -04:00
8e337aff5a libpod: avoid polling container status
use the inotify backend to be notified on the container exit instead
of polling continuosly the runtime.  Polling the runtime slowns
significantly down the podman execution time for short lived
processes:

$ time bin/podman run --rm -ti fedora true

real	0m0.324s
user	0m0.088s
sys	0m0.064s

from:

$ time podman run --rm -ti fedora true

real	0m4.199s
user	0m5.339s
sys	0m0.344s

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-09-04 19:55:54 +02:00
cc3d8da968 exec: run with user specified on container start
Before, if the container was run with a specified user that wasn't root, exec would fail because it always set to root unless respecified by user.
instead, inherit the user from the container start.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-20 11:44:27 -04:00
337358ae63 Merge pull request #3690 from adrianreber/ignore-static-ip
restore: added --ignore-static-ip option
2019-08-05 16:11:50 +02:00
c23b92b409 restore: added --ignore-static-ip option
If a container is restored multiple times from an exported checkpoint
with the help of '--import --name', the restore will fail if during
'podman run' a static container IP was set with '--ip'. The user can
tell the restore process to ignore the static IP with
'--ignore-static-ip'.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-08-02 10:10:54 +02:00
9dcd76e369 Ensure we generate a 'stopped' event on force-remove
When forcibly removing a container, we are initiating an explicit
stop of the container, which is not reflected in 'podman events'.
Swap to using our standard 'stop()' function instead of a custom
one for force-remove, and move the event into the internal stop
function (so internal calls also register it).

This does add one more database save() to `podman remove`. This
should not be a terribly serious performance hit, and does have
the desirable side effect of making things generally safer.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:29:14 -04:00
479eeac62c move editing of exitCode to runtime
There's no way to get the error if we successfully get an exit code (as it's just printed to stderr instead).
instead of relying on the error to be passed to podman, and edit based on the error code, process it on the varlink side instead

Also move error codes to define package

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-07-23 13:29:33 -04:00
a1a79c08b7 Implement conmon exec
This includes:
	Implement exec -i and fix some typos in description of -i docs
	pass failed runtime status to caller
	Add resize handling for a terminal connection
	Customize exec systemd-cgroup slice
	fix healthcheck
	fix top
	add --detach-keys
	Implement podman-remote exec (jhonce)
	* Cleanup some orphaned code (jhonce)
	adapt remote exec for conmon exec (pehunt)
	Fix healthcheck and exec to match docs
		Introduce two new OCIRuntime errors to more comprehensively describe situations in which the runtime can error
		Use these different errors in branching for exit code in healthcheck and exec
	Set conmon to use new api version

Signed-off-by: Jhon Honce <jhonce@redhat.com>

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-07-22 15:57:23 -04:00
db826d5d75 golangci-lint round #3
this is the third round of preparing to use the golangci-lint on our
code base.

Signed-off-by: baude <bbaude@redhat.com>
2019-07-21 14:22:39 -05:00
deb087d7b1 Merge pull request #3443 from adrianreber/rootfs-changes-migration
Include changes to the container's root file-system in the checkpoint archive
2019-07-19 02:38:26 +02:00
5bbede9d9f Remove exec PID files after use to prevent memory leaks
We have another patch running to do the same for exit files, with
a much more in-depth explanation of why it's necessary. Suffice
to say that persistent files in tmpfs tied to container CGroups
lead to significant memory allocations that last for the lifetime
of the file.

Based on a patch by Andrea Arcangeli (aarcange@redhat.com).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-18 09:06:11 -04:00
a78c885397 golangci-lint pass number 2
clean up and prepare to migrate to the golangci-linter

Signed-off-by: baude <bbaude@redhat.com>
2019-07-11 09:13:06 -05:00
05549e8b29 Add --ignore-rootfs option for checkpoint/restore
The newly added functionality to include the container's root
file-system changes into the checkpoint archive can now be explicitly
disabled. Either during checkpoint or during restore.

If a container changes a lot of files during its runtime it might be
more effective to migrated the root file-system changes in some other
way and to not needlessly increase the size of the checkpoint archive.

If a checkpoint archive does not contain the root file-system changes
information it will automatically be skipped. If the root file-system
changes are part of the checkpoint archive it is also possible to tell
Podman to ignore these changes.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-07-11 14:43:35 +02:00
1a32074884 Fix typo in checkpoint/restore related texts
Signed-off-by: Adrian Reber <areber@redhat.com>
2019-07-11 14:43:35 +02:00
150778820f Merge pull request #3324 from marcov/detach-keys-configurable
libpod: specify a detach keys sequence in libpod.conf
2019-07-01 15:54:27 +02:00
8561b99644 libpod removal from main (phase 2)
this is phase 2 for the removal of libpod from main.

Signed-off-by: baude <bbaude@redhat.com>
2019-06-27 07:56:24 -05:00
4f56964d55 libpod: fix hang on container start and attach
When a container is attached upon start, the WaitGroup counter may
never be decremented if an error is raised before start, causing
the caller to hang.
Synchronize with the start & attach goroutine using a channel, to be
able to detect failures before start.

Signed-off-by: Marco Vedovati <mvedovati@suse.com>
2019-06-26 10:17:29 +02:00
dd81a44ccf remove libpod from main
the compilation demands of having libpod in main is a burden for the
remote client compilations.  to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.

this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).

Signed-off-by: baude <bbaude@redhat.com>
2019-06-25 13:51:24 -05:00
de75b1a277 Fix a segfault in 'podman ps --sync'
We weren't properly populating the container's OCI Runtime in
Batch(), causing segfaults on attempting to access it. Add a test
to make sure we actually catch cases like this in the future.

Fixes #3411

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-24 09:26:03 -04:00
92bae8d308 Begin adding support for multiple OCI runtimes
Allow Podman containers to request to use a specific OCI runtime
if multiple runtimes are configured. This is the first step to
properly supporting containers in a multi-runtime environment.

The biggest changes are that all OCI runtimes are now initialized
when Podman creates its runtime, and containers now use the
runtime requested in their configuration (instead of always the
default runtime).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-19 17:08:43 -04:00
04858a218f stop/kill: inproper state errors: s/in state/is in state/
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-06-17 14:31:55 +02:00
0f75410e1c kill: print ID and state for non-running containers
Extend kill's error message to include the container's ID and state.
This address cases where error messages caused by other containers
may confuse users.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-06-17 10:55:54 +02:00
3bbb692d80 If container is not in correct state podman exec should exit with 126
This way a tool can determine if the container exists or not, but is in the
wrong state.

Since 126 is documeted as:
**_126_** if the **_contained command_** cannot be invoked

It makes sense that the container would exit with this state.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-06-12 05:15:58 -04:00
39f5ea4c04 Merge pull request #3180 from mheon/inspect_volumes
Begin to break up pkg/inspect
2019-06-08 14:45:24 +02:00
bef83c42ea migration: add possibility to restore a container with a new name
The option to restore a container from an external checkpoint archive
(podman container restore -i /tmp/checkpoint.tar.gz) restores a
container with the same name and same ID as id had before checkpointing.

This commit adds the option '--name,-n' to 'podman container restore'.
With this option the restored container gets the name specified after
'--name,-n' and a new ID. This way it is possible to restore one
container multiple times.

If a container is restored with a new name Podman will not try to
request the same IP address for the container as it had during
checkpointing. This implicitly assumes that if a container is restored
from a checkpoint archive with a different name, that it will be
restored multiple times and restoring a container multiple times with
the same IP address will fail as each IP address can only be used once.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-04 14:02:51 +02:00
0028578b43 Added support to migrate containers
This commit adds an option to the checkpoint command to export a
checkpoint into a tar.gz file as well as importing a checkpoint tar.gz
file during restore. With all checkpoint artifacts in one file it is
possible to easily transfer a checkpoint and thus enabling container
migration in Podman. With the following steps it is possible to migrate
a running container from one system (source) to another (destination).

 Source system:
  * podman container checkpoint -l -e /tmp/checkpoint.tar.gz
  * scp /tmp/checkpoint.tar.gz destination:/tmp

 Destination system:
  * podman pull 'container-image-as-on-source-system'
  * podman container restore -i /tmp/checkpoint.tar.gz

The exported tar.gz file contains the checkpoint image as created by
CRIU and a few additional JSON files describing the state of the
checkpointed container.

Now the container is running on the destination system with the same
state just as during checkpointing. If the container is kept running
on the source system with the checkpoint flag '-R', the result will be
that the same container is running on two different hosts.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-03 22:05:12 +02:00
a05cfd24bb Added helper functions for container migration
This adds a couple of function in structure members needed in the next
commit to make container migration actually work. This just splits of
the function which are not modifying existing code.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-03 22:05:12 +02:00