149 Commits

8489dc4345 move go module to v2
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules.  While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.

Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`.  The renaming of the imports
was done via `gomove` [1].

[1] https://github.com/KSubedi/gomove
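
For illustration, a minimal sketch of what an outside consumer's import looks like after the bump; the `define` package and `ErrNoSuchCtr` are just example symbols:

```go
package main

import (
	"fmt"

	// With the major-version bump, go.mod declares
	// `module github.com/containers/libpod/v2`, so outside consumers
	// vendor the /v2 import path; package names are unchanged.
	"github.com/containers/libpod/v2/libpod/define"
)

func main() {
	fmt.Println(define.ErrNoSuchCtr) // example use of a v2 package
}
```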

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-07-06 15:50:12 +02:00
c2a0ccd394 Merge pull request #6747 from giuseppe/fix-user-volumes
container: move volume chown after spec generation
2020-06-30 12:01:40 -04:00
b32172e20b container: move volume chown after spec generation
move the chown for newly created volumes after the spec generation so
the correct UID/GID are known.
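
A minimal sketch of the fixed ordering, using the real runtime-spec types but hypothetical volume plumbing - the chown only happens once the generated spec's user is known:

```go
package main

import (
	"fmt"
	"os"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// volume is a hypothetical stand-in for a newly created named volume.
type volume struct{ mountPoint string }

// chownNewVolumes runs after spec generation, so the UID/GID resolved
// into the spec's process user can be applied to fresh volumes.
func chownNewVolumes(s *specs.Spec, vols []volume) error {
	uid := int(s.Process.User.UID)
	gid := int(s.Process.User.GID)
	for _, v := range vols {
		if err := os.Chown(v.mountPoint, uid, gid); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	s := &specs.Spec{Process: &specs.Process{User: specs.User{UID: 1000, GID: 1000}}}
	fmt.Println(chownNewVolumes(s, nil))
}
```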

Closes: https://github.com/containers/libpod/issues/5698

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-06-29 17:58:50 +02:00
6ee5f740a4 podman: add new cgroup mode split
When running under systemd there is no need to create yet another
cgroup for the container.

With conmon-delegated the current cgroup will be split into two
sub-cgroups:

- supervisor
- container

The supervisor cgroup will hold conmon and the podman process, while
the container cgroup is used by the OCI runtime (using the cgroupfs
backend).
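
A minimal sketch of what split mode does at the cgroup level, assuming pure cgroup v2 - this is not libpod's actual implementation:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// splitCurrentCgroup reads the unit's current cgroup and creates the
// two sub-cgroups that split mode uses.
func splitCurrentCgroup() error {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		return err
	}
	// On pure cgroup v2 this is one line: "0::/system.slice/foo.service".
	parts := strings.SplitN(strings.TrimSpace(string(data)), "::", 2)
	if len(parts) != 2 {
		return fmt.Errorf("unexpected cgroup content: %q", data)
	}
	for _, sub := range []string{"supervisor", "container"} {
		if err := os.MkdirAll(filepath.Join("/sys/fs/cgroup", parts[1], sub), 0o755); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := splitCurrentCgroup(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```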

Closes: https://github.com/containers/libpod/issues/6400

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-06-25 17:16:12 +02:00
7fe4c5204e Set stop signal to 15 when not explicitly set
When going through the output of `podman inspect` to try and
identify another issue, I noticed that Podman 2.0 was setting
StopSignal to 0 on containers by default. After chasing it
through the command line and SpecGen, I determined that we were
actually not setting a default in Libpod, which is strange
because I swear we used to do that. I re-added the disappeared
default and now all is well again.

Also, while I was looking for the bug in SpecGen, I found a bunch
of TODOs that have already been done. Eliminate the comments for
these.
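
A sketch of the restored default, with a hypothetical config type standing in for libpod's:

```go
package main

import (
	"fmt"
	"syscall"
)

// ContainerConfig is a hypothetical slice of libpod's container config.
type ContainerConfig struct {
	StopSignal uint
}

// withDefaults restores the missing default: SIGTERM (15) as the stop
// signal when the frontend did not set one.
func withDefaults(cfg *ContainerConfig) {
	if cfg.StopSignal == 0 {
		cfg.StopSignal = uint(syscall.SIGTERM)
	}
}

func main() {
	cfg := &ContainerConfig{}
	withDefaults(cfg)
	fmt.Println(cfg.StopSignal) // 15
}
```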

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-06-24 09:27:20 -04:00
0e171b7b33 Do not share container log driver for exec
When the container uses journald logging, we don't want to
automatically use the same driver for its exec sessions. If we do
we will pollute the journal (particularly in the case of
healthchecks) with large amounts of undesired logs. Instead,
force exec sessions logs to file for now; we can add a log-driver
flag later (we'll probably want to add a `podman logs` command
that reads exec session logs at the same time).

As part of this, add support for the new 'none' logs driver in
Conmon. It will be the default log driver for exec sessions, and
can be optionally selected for containers.

Great thanks to Joe Gooch (mrwizard@dok.org) for adding support
to Conmon for a null log driver, and wiring it in here.
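
A sketch of the resulting driver-selection policy, with hypothetical names - the container's driver is deliberately never consulted, so journald cannot leak in:

```go
package main

import "fmt"

const (
	logDriverJournald = "journald"
	logDriverNone     = "none" // conmon's new null driver
)

// execLogDriver picks a driver for an exec session: an explicit request
// other than journald is honored; everything else falls back to "none".
func execLogDriver(requested string) string {
	if requested != "" && requested != logDriverJournald {
		return requested
	}
	return logDriverNone
}

func main() {
	fmt.Println(execLogDriver("")) // none
}
```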

Fixes #6555

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-06-17 11:11:46 -04:00
200cfa41a4 Turn on More linters
- misspell
- prealloc
- unparam
- nakedret

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-06-15 07:05:56 -04:00
9d964ffb9f Ensure Conmon is alive before waiting for exit file
This came out of a conversation with Valentin about
systemd-managed Podman. He discovered that unit files did not
properly handle cases where Conmon was dead - the ExecStopPost
`podman rm --force` line was not actually removing the container,
but interestingly, adding a `podman cleanup --rm` line would
remove it. Both of these commands do the same thing (minus the
`podman cleanup --rm` command not force-removing running
containers).

Without a running Conmon instance, the container process is still
running (assuming you killed Conmon with SIGKILL and it had no
chance to kill the container it managed), but you can still kill
the container itself with `podman stop` - Conmon is not involved,
only the OCI Runtime. (`podman rm --force` and `podman stop` use
the same code to kill the container). The problem comes when we
want to get the container's exit code - we expect Conmon to make
us an exit file, which it's obviously not going to do, being
dead. The first `podman rm` would fail because of this, but
importantly, it would (after failing to retrieve the exit code
correctly) set container status to Exited, so that the second
`podman cleanup` process would succeed.

To make sure the first `podman rm --force` succeeds, we need to
catch the case where Conmon is already dead, and instead of
waiting for an exit file that will never come, immediately set
the Stopped state and return an error that can be caught and
handled.
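
A minimal sketch of the liveness probe this implies, assuming conmon's PID is known from container state:

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// conmonAlive probes with signal 0, which tests for existence without
// delivering a signal; EPERM still means the process exists.
func conmonAlive(pid int) bool {
	err := syscall.Kill(pid, 0)
	return err == nil || errors.Is(err, syscall.EPERM)
}

func main() {
	fmt.Println(conmonAlive(os.Getpid())) // trivially true: ourselves
	// If conmon is gone, skip waiting for the exit file, mark the
	// container Stopped, and return a catchable error instead.
}
```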

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-06-08 13:48:29 -04:00
6b9e9610d8 Enable cleanup processes for detached exec
The cleanup command creation logic is made public as part of this
and wired such that we can call it both within SpecGen (to make
container exit commands) and from the ABI detached exec handler.
Exit commands are presently only used for detached exec, but
theoretically could be turned on for all exec sessions if we
wanted (I'm declining to do this because of potential overhead).

I also forgot to copy the exit command from the exec config into
the ExecOptions struct used by the OCI runtime, so it was not
being added.

There are also two significant bugfixes for exec in here. One is
for updating the status of running exec sessions - this was
always failing as I had coded it to remove the exit file *before*
reading it, instead of after (oops). The second was that removing
a running exec session would always fail because I inverted the
check to see if it was running.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-05-20 16:11:05 -04:00
892d81685c Ensure that cleanup runs before we set Removing state
Cleaning up the OCI runtime is not allowed in the Removing state.
To ensure it is actually cleaned up, when calling cleanup() as
part of removing a container, do so before we set the Removing
state, so we can successfully remove.
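
A toy sketch of the ordering constraint, with hypothetical state names:

```go
package main

import "fmt"

type state int

const (
	stateRunning state = iota
	stateRemoving
)

type ctr struct{ st state }

func (c *ctr) cleanupRuntime() error {
	if c.st == stateRemoving {
		return fmt.Errorf("cleanup not allowed in Removing state")
	}
	return nil // tear down OCI runtime resources here
}

// remove mirrors the fix: clean up first, then mark Removing.
func (c *ctr) remove() error {
	if err := c.cleanupRuntime(); err != nil {
		return err
	}
	c.st = stateRemoving
	return nil
}

func main() {
	c := &ctr{st: stateRunning}
	fmt.Println(c.remove()) // <nil>
}
```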

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-05-14 11:58:02 -04:00
c3c030f550 Enable prune integration test. Fixes container prune.
Fixes container prune to prune created and configured containers.
Disables a couple of system prune tests that are not yet in v2.

Signed-off-by: Sujil02 <sushah@redhat.com>
2020-04-30 12:03:09 -04:00
ba430bfe5e podman v2 remove bloat v2
rid ourselves of libpod references in the v2 client

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-04-16 12:04:46 -05:00
ec4060aef6 Ability to prune container in api V2
Adds ability to prune containers for v2.
Adds a client-side prompt with the force flag and filter options to prune.

Signed-off-by: Sujil02 <sushah@redhat.com>
2020-04-15 11:17:33 -04:00
4352d58549 Add support for containers.conf
vendor in c/common config pkg for containers.conf

Signed-off-by: Qi Wang <qiwan@redhat.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-27 14:36:03 -04:00
0c40b62c77 Implement APIv2 Exec Create and Inspect Endpoints
Start and Resize require further implementation work.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-23 16:20:42 -04:00
aa6c8c2e55 Merge pull request #5088 from mheon/begin_exec_rework
Begin exec rework
2020-03-19 22:09:40 +01:00
118e78c5d6 Add structure for new exec session tracking to DB
As part of the rework of exec sessions, we need to address them
independently of containers. In the new API, we need to be able
to fetch them by their ID, regardless of what container they are
associated with. Unfortunately, our existing exec sessions are
tied to individual containers; there's no way to tell what
container a session belongs to and retrieve it without getting
every exec session for every container.

This adds a pointer to the container an exec session is
associated with to the database. The sessions themselves are
still stored in the container.

Exec-related APIs have been restructured to work with the new
database representation. The originally monolithic API has been
split into a number of smaller calls to allow more fine-grained
control of lifecycle. Support for legacy exec sessions has been
retained, but in a deprecated fashion; we should remove this in
a few releases.
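
A minimal sketch of the indirection in bolt (libpod's DB backend), with a hypothetical bucket name:

```go
package main

import (
	"fmt"
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	db, err := bolt.Open("state.db", 0o600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Register a session in a flat bucket keyed by session ID,
	// pointing at the owning container.
	if err := db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("exec-sessions"))
		if err != nil {
			return err
		}
		return b.Put([]byte("sessionID"), []byte("containerID"))
	}); err != nil {
		log.Fatal(err)
	}

	// Resolve a session to its container without scanning every
	// container's own exec sessions.
	if err := db.View(func(tx *bolt.Tx) error {
		ctr := tx.Bucket([]byte("exec-sessions")).Get([]byte("sessionID"))
		fmt.Printf("session belongs to container %s\n", ctr)
		return nil
	}); err != nil {
		log.Fatal(err)
	}
}
```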

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-18 11:02:14 -04:00
f4e873c4e1 auto updates
Add support to auto-update containers running in systemd units as
generated with `podman generate systemd --new`.

`podman auto-update` looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).

If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated.  We
consider an image to be updated if the digest in the local storage is
different than the one of the remote image.  If an image must be
updated, Podman pulls it down and restarts the container.  Note that the
restarting sequence relies on systemd.

At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT"
environment variable and stores it verbatim in the container's labels.
This variable is set by all systemd units generated by
`podman-generate-systemd` to `%n` (i.e., the name of the systemd
unit starting the container).  This data is then used in the
auto-update sequence to instruct systemd (via DBUS) to restart the unit
and hence to restart the container.

Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container.  This enforcement is necessary to know which image to
actually check and pull.  If we used an image ID, we would not know
which image to check/pull anymore.
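
A hedged sketch of the per-container decision, with hypothetical helper names and a deliberately crude fully-qualified check:

```go
package main

import (
	"fmt"
	"strings"
)

const autoUpdateLabel = "io.containers.autoupdate"

// shouldAttemptUpdate decides whether a container takes part in
// auto-updates and whether its image reference is usable for the check.
func shouldAttemptUpdate(labels map[string]string, rawImage string) (bool, error) {
	if labels[autoUpdateLabel] != "image" {
		return false, nil // no policy, or a policy other than "image"
	}
	slash := strings.IndexByte(rawImage, '/')
	if slash < 0 {
		return false, fmt.Errorf("auto-update requires a fully-qualified image reference, got %q", rawImage)
	}
	if first := rawImage[:slash]; !strings.ContainsAny(first, ".:") && first != "localhost" {
		return false, fmt.Errorf("auto-update requires a fully-qualified image reference, got %q", rawImage)
	}
	// An update is warranted when local and remote digests differ; the
	// unit named in the PODMAN_SYSTEMD_UNIT label is then restarted.
	return true, nil
}

func main() {
	ok, err := shouldAttemptUpdate(
		map[string]string{autoUpdateLabel: "image"},
		"registry.example.com/app:latest")
	fmt.Println(ok, err) // true <nil>
}
```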

Fixes: #3575
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-03-17 17:18:56 +01:00
8b5e2a6297 add default network for apiv2 create
During container creation, if no network is provided, we need to add a default value so the container can be started later.

Use apiv2 container creation for RunTopContainer instead of an exec to the system podman. RunTopContainer now also returns the container ID and an error.

Added a libpod commit endpoint.

Also, changed the use of the connections and bindings slightly to make it more convenient to write tests.

Fixes: #5366
Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-03-06 14:31:45 -06:00
e45456223c Add validate() for containers
Until now, we've been validating every part of container
configuration through the With... functions that set the options.
This if fine when we are just validating the options to an
individual function, but things get complicated once we need to
validate conflicts between different options. We don't know the
order in which things were passed, so we need the validation on
both of the potential options that can conflict, resulting in
significant code duplication. To solve this, add a validate()
function for containers, and use this to check whether everything
is in a good state.

We can probably move more into this function (there are other
parts of container creation that also do validation of a sort)
but this is a good start to simplifying our options.
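
A minimal sketch of such a validate() function, with hypothetical config fields and conflicts:

```go
package main

import (
	"errors"
	"fmt"
)

// ContainerConfig is a hypothetical slice of libpod's container config.
type ContainerConfig struct {
	CreateNetNS  bool
	NetNSCtr     string // join another container's network namespace
	PortMappings []int
}

// validate checks conflicts between options in one place, regardless of
// the order in which the With... options were applied.
func (cfg *ContainerConfig) validate() error {
	if cfg.CreateNetNS && cfg.NetNSCtr != "" {
		return errors.New("cannot both create a network namespace and join another container's")
	}
	if len(cfg.PortMappings) > 0 && cfg.NetNSCtr != "" {
		return errors.New("cannot publish ports when joining another container's network namespace")
	}
	return nil
}

func main() {
	cfg := &ContainerConfig{CreateNetNS: true, NetNSCtr: "abc"}
	fmt.Println(cfg.validate())
}
```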

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-02 10:58:11 -05:00
4004f646cd Add basic deadlock detection for container start/remove
We can easily tell if we're going to deadlock by comparing lock
IDs before actually taking the lock. Add a few checks for this in
common places where deadlocks might occur.

This does not yet cover pod operations, where detection is more
difficult (and costly) due to the number of locks being involved
being higher than 2.

Also, add some error wrapping on the Podman side, so we can tell
people to use `system renumber` when it occurs.
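
A sketch of the cheap pre-lock comparison, with hypothetical types:

```go
package main

import (
	"errors"
	"fmt"
)

// ctr is a hypothetical stand-in carrying only the allocated lock ID.
type ctr struct {
	id     string
	lockID uint32
}

// checkNoDeadlock fails fast: if a container and its dependency resolved
// to the same lock, taking both locks would self-deadlock.
func checkNoDeadlock(c, dep *ctr) error {
	if c.lockID == dep.lockID {
		return errors.New("lock conflict detected - please run `podman system renumber`")
	}
	return nil
}

func main() {
	a := &ctr{id: "a", lockID: 7}
	b := &ctr{id: "b", lockID: 7}
	fmt.Println(checkNoDeadlock(a, b))
}
```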

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-24 09:29:34 -05:00
5d93c731af Deprecate & remove IsCtrSpecific in favor of IsAnon
In Podman 1.6.3, we added support for anonymous volumes - fixing
our old, broken support for named volumes that were created with
containers. Unfortunately, this reused the database field we used
for the old implementation, and toggled volume removal on for
`podman run --rm` - so now, we were removing *named* volumes
created with older versions of Podman.

We can't modify these old volumes in the DB, so the next-safest
thing to do is swap to a new field to indicate volumes should be
removed. Problem: Volumes created with 1.6.3 and up until this
lands, even anonymous volumes, will not be removed. However, this
is safer than removing too many volumes, as we were doing before.

Fixes #5009

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-29 14:04:51 -05:00
cf7be58b2c Review corrections pass #2
Add API review comments to correct documentation and endpoints.  Also, add a libpod prune method to reduce code duplication.  Only used right now for the API but when the remote client is wired, we will switch over there too.

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-01-23 11:58:26 -06:00
67165b7675 make lint: enable gocritic
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-13 14:27:02 +01:00
8bc394ce6e Add the pod name when we use podman ps -p
The pod name does not appear when doing `podman ps -p`,
even though the documentation says:
-p, --pod              Print the ID and name of the pod the containers are associated with

The pod name is now added to the ps output and checked in unit tests.

Closes #4703

Signed-off-by: NevilleC <neville.cain@qonto.eu>
2019-12-28 00:03:57 +01:00
e33d7e9fab Merge pull request #4727 from rhatdan/pidns
if container is not in a pid namespace, stop all processes
2019-12-20 12:13:22 +01:00
123b8c627d if container is not in a pid namespace, stop all processes
When a container is in a PID namespace, it is enough to send
the stop signal to PID 1 of the namespace. Only send signals
to all processes in the container when the container is not in
a PID namespace.
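
A toy sketch of the rule, names hypothetical:

```go
package main

import "fmt"

// killScope: with a private PID namespace, signalling the namespace's
// PID 1 is enough; otherwise every process in the container must be
// signalled (the OCI runtime's "all" semantics).
func killScope(hasPidNamespace bool) string {
	if hasPidNamespace {
		return "pid1"
	}
	return "all"
}

func main() {
	fmt.Println(killScope(true), killScope(false)) // pid1 all
}
```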

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-12-19 13:33:17 -05:00
88917e4a93 Remove volumes after containers in pod remove
When trying to reproduce #4704 I noticed that the named volumes
from the Postgres containers in the reproducer weren't being
removed by `podman pod rm -f` saying that the container they were
attached to was still in use. This was rather odd, considering
they were only in use by one container, and that container was in
the process of being removed with the pod.

After a bit of tracing, I realized that the cause is the ordering
of container removal when we remove a pod. Normally, it's done
in removeContainer() before volume removal (which is the last
thing in that function). However, when we are removing a pod, we
remove containers all at once, after removeContainer has already
finished - meaning the container still exists when we try to
remove its volumes, and thus the volume can't be removed.

Solution: collect a list of all named volumes in use by the pod,
and remove them all at once after every container in the pod is
gone. This ensures that there are no dependency issues.
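
A minimal sketch of the fixed ordering, with hypothetical pod/container shapes:

```go
package main

import "fmt"

type container struct {
	id           string
	namedVolumes []string
}

// removePodVolumes collects every named volume first, removes all
// containers, and only then removes the volumes, so no volume is ever
// removed while a container still uses it.
func removePodVolumes(ctrs []container, removeCtr, removeVol func(string)) {
	vols := map[string]struct{}{}
	for _, c := range ctrs {
		for _, v := range c.namedVolumes {
			vols[v] = struct{}{}
		}
	}
	for _, c := range ctrs {
		removeCtr(c.id)
	}
	for v := range vols {
		removeVol(v)
	}
}

func main() {
	ctrs := []container{{id: "db", namedVolumes: []string{"pgdata"}}}
	removePodVolumes(ctrs,
		func(id string) { fmt.Println("removed container", id) },
		func(v string) { fmt.Println("removed volume", v) })
}
```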

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-17 21:41:31 -05:00
25cc43c376 Add ContainerStateRemoving
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.

When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.

The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.

We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This allows Remove to be run on
these containers again, so a retry can successfully remove
storage if the first attempt failed.
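
A toy sketch of the state's effect, names loosely following libpod's define package but otherwise hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

type ctrState int

const (
	stateRunning ctrState = iota
	stateRemoving // set at the start of removal, kept in the DB until done
)

// checkUsable: every operation except removal itself rejects a container
// that is being removed, so a failed removal can simply be retried.
func checkUsable(st ctrState, op string) error {
	if st == stateRemoving && op != "remove" {
		return errors.New("container is being removed and cannot be used")
	}
	return nil
}

func main() {
	fmt.Println(checkUsable(stateRemoving, "start"))  // error
	fmt.Println(checkUsable(stateRemoving, "remove")) // <nil>
}
```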

Fixes #3906

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-11-19 15:38:03 -05:00
11c282ab02 add libpod/config
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config.  Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and
less prone to regressions in the long run.

Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-31 17:42:37 +01:00
0d623914d0 Add support for anonymous volumes to podman run -v
Previously, when `podman run` encountered a volume mount without
separate source and destination (e.g. `-v /run`) we would assume
that both were the same - a bind mount of `/run` on the host to
`/run` in the container. However, this does not match Docker's
behavior - in Docker, this makes an anonymous named volume that
will be mounted at `/run`.

We already have (more limited) support for these anonymous
volumes in the form of image volumes. Extend this support to
allow it to be used with user-created volumes coming in from the
`-v` flag.

This change also affects how volumes created with the
container but given explicit names are treated by `podman run --rm`
and `podman rm -v`.
container in these cases, but this did not match Docker's
behaviour. Docker only removed anonymous volumes. With this patch
we move to that model as well; `podman run -v testvol:/test` will
not have `testvol` survive the container being removed by `podman
rm -v`.

The sum total of these changes let us turn on volume removal in
`--rm` by default.
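
A sketch of the resulting `-v` classification, deliberately simplified (real parsing handles mount options and more forms):

```go
package main

import (
	"fmt"
	"strings"
)

// classifyVolumeFlag: a lone destination becomes an anonymous named
// volume (removed with the container under --rm), matching Docker
// rather than the old same-path bind mount.
func classifyVolumeFlag(arg string) string {
	parts := strings.SplitN(arg, ":", 2)
	if len(parts) == 1 {
		return "anonymous volume at " + parts[0]
	}
	if strings.HasPrefix(parts[0], "/") {
		return "bind mount of " + parts[0] + " at " + parts[1]
	}
	return "named volume " + parts[0] + " at " + parts[1] + " (survives rm -v)"
}

func main() {
	fmt.Println(classifyVolumeFlag("/run"))            // anonymous volume
	fmt.Println(classifyVolumeFlag("testvol:/test"))   // named volume
	fmt.Println(classifyVolumeFlag("/tmp/cfg:/etc/x")) // bind mount
}
```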

Fixes: #4276

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-17 13:18:17 -04:00
b6a7d88397 When restoring containers, reset cgroup path
Previously, `podman container restore` with exported containers,
when told to create a new container based on the exported
checkpoint, would create a new container, with a new container
ID, but did not reset the cgroup path - which contained the ID of
the original container.

If this was done multiple times, the result was two containers
with the same cgroup paths. Operations on these containers would
thus have a chance of crossing over to affect the other one; the
most notable was `podman rm` once it was changed to use the --all
flag when stopping the container: all processes in the cgroup,
including the ones in the other container, would be stopped.

Reset cgroups on restore to ensure that the path matches the ID
of the container actually being run.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 14:53:29 -04:00
6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).
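
A hedged, cut-down sketch of what such an interface boundary can look like - not libpod's actual definition:

```go
package main

// ExecOptions replaces the old long argument list for exec.
type ExecOptions struct {
	Cmd        []string
	Env        map[string]string
	Terminal   bool
	User       string
	WorkingDir string
}

// OCIRuntime abstracts the runtime: today a conmon-wrapped,
// runc-compatible CLI, later possibly other implementations. Note the
// `all` flags on Kill/Stop for signalling every container process.
type OCIRuntime interface {
	CreateContainer(id string) error
	KillContainer(id string, signal uint, all bool) error
	StopContainer(id string, timeout uint, all bool) error
	ExecContainer(id string, opts *ExecOptions) (pid int, err error)
	DeleteContainer(id string) error
}

func main() {}
```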

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00
bb803b8f7a When evicting containers, perform a normal remove first
This ensures that containers that didn't require an evict will be
dealt with normally, and we only break out evict for containers
that refuse to be removed by normal means.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-04 11:04:43 -04:00
dacbc5beb2 rm: add containers eviction with rm --force
Add ability to evict a container when it becomes unusable. This may
happen when the host setup changes after a container was created, making
it impossible for that container to be used or removed.
Evicting a container is done using the `rm --force` command.

Signed-off-by: Marco Vedovati <mvedovati@suse.com>
2019-09-25 19:44:38 +02:00
7ac6ed3b4b Merge pull request #3581 from mheon/no_cgroups
Support running containers without CGroups
2019-09-11 00:58:46 +02:00
c2284962c7 Add support for launching containers without CGroups
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit files.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-10 10:52:37 -04:00
b6106341fb When first mounting any named volume, copy up
Previously, we only did this for volumes created at the same time
as the container. However, this is not correct behavior - Docker
does so for all named volumes, even those made with
'podman volume create' and mounted into a container later.
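
A minimal sketch of the trigger for copy-up - an empty volume directory on first mount (the actual copy helper is omitted):

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// isEmptyDir reports whether a directory has no entries - the condition
// under which a named volume's first mount triggers copy-up.
func isEmptyDir(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	if _, err := f.Readdirnames(1); err == io.EOF {
		return true, nil
	} else if err != nil {
		return false, err
	}
	return false, nil
}

func main() {
	empty, err := isEmptyDir(os.TempDir())
	fmt.Println(empty, err)
	// If empty, copy-up would now copy the image's contents at the
	// mount destination into the volume, preserving ownership.
}
```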

Fixes #3945

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-09 17:17:39 -04:00
e563f41116 Re-add locks to volumes.
This will require a 'podman system renumber' after being applied
to get lock numbers for existing volumes.

Add the DB backend code for rewriting volume configs and use it
for updating lock numbers as part of 'system renumber'.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 11:35:00 -04:00
a891b84528 Fix up ConmonPidFile after restore
After restoring a container with a different name (ID) the ConmonPidFile
was still pointing to the path of the original container.

This means that the last restored container will overwrite the
ConmonPidFile of the original container. It was also not possible to
restore a container with a new name (ID) if the original container was
not running.

The ConmonPidFile is only changed if it starts with the
value of RunRoot. This assumes that if RunRoot is part of ConmonPidFile,
the user did not specify `--conmon-pidfile` during run or create.
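
A sketch of that guard, with a hypothetical on-disk layout:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// rewriteConmonPidFile: only a pidfile under RunRoot (one libpod placed
// there itself) is rewritten for the restored container's new ID; an
// explicit --conmon-pidfile path is kept verbatim.
func rewriteConmonPidFile(current, runRoot, newID string) string {
	if !strings.HasPrefix(current, runRoot) {
		return current // user-specified; keep verbatim
	}
	return filepath.Join(runRoot, newID, "conmon.pid")
}

func main() {
	fmt.Println(rewriteConmonPidFile("/run/containers/storage/old/conmon.pid",
		"/run/containers/storage", "newid"))
	fmt.Println(rewriteConmonPidFile("/custom/conmon.pid",
		"/run/containers/storage", "newid"))
}
```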

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-08-09 19:26:56 +02:00
82b586349c restore: correctly set StartedTime
A container restored from an exported checkpoint did not have its
StartedTime set, which resulted in a status like 'Up 292 years ago'
after the restore.

This just sets the StartedTime to time.Now() if a container is restored
from an exported checkpoint.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-08-05 14:29:07 +02:00
9dcd76e369 Ensure we generate a 'stopped' event on force-remove
When forcibly removing a container, we are initiating an explicit
stop of the container, which is not reflected in 'podman events'.
Swap to using our standard 'stop()' function instead of a custom
one for force-remove, and move the event into the internal stop
function (so internal calls also register it).

This does add one more database save() to `podman remove`. This
should not be a terribly serious performance hit, and does have
the desirable side effect of making things generally safer.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:29:14 -04:00
db826d5d75 golangci-lint round #3
this is the third round of preparing to use golangci-lint on our
code base.

Signed-off-by: baude <bbaude@redhat.com>
2019-07-21 14:22:39 -05:00
8713483362 Fix a bug where ctrs could not be removed from pods
Removing the pod worked, but removing an individual container was
missing the most critical step - the actual removal. It must have
been accidentally dropped during a refactor.

Fixes #3556

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-11 10:17:33 -04:00
18c4d73867 runtime: drop spurious message log
fix a regression introduced by 1d36501f961889f554daf3c696fe95443ef211b6

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-10 15:47:38 +02:00
fce2e6577e Merge pull request #3497 from QazerLab/bugfix/systemd-generate-pidfile
Use conmon pidfile in generated systemd unit as PIDFile.
2019-07-08 23:39:42 +02:00
edc7f52c95 Merge pull request #3425 from adrianreber/restore-mount-label
Set correct SELinux label on restored containers
2019-07-08 20:31:59 +02:00
1d36501f96 code cleanup
clean up code identified as problematic by GoLand's inspection

Signed-off-by: baude <bbaude@redhat.com>
2019-07-08 09:18:11 -05:00
37b134054e Use default conmon pidfile location for root containers.
The conmon pidfile is crucial for podman-generated systemd units, because
these units rely on it to determine the service's main process ID.

With this change, every container has ConmonPidFile set (at least to
the default value).

Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-07-04 21:08:06 +03:00
e92de11a69 Ensure locks are freed when ctr/pod creation fails
If we don't do this, we can leak locks on every failure, and that
is very, very bad - can render Podman unusable without a 'system
renumber' being run.
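
A minimal sketch of the fix's shape - free the lock on any later failure in the creation path:

```go
package main

import "fmt"

// lock is a hypothetical stand-in for libpod's allocated lock handle.
type lock struct{ id uint32 }

func (l *lock) free() { fmt.Println("freed lock", l.id) }

// createCtr: once a lock is allocated, any later failure on the
// creation path must free it, otherwise the lock leaks until
// `podman system renumber` is run.
func createCtr(allocate func() (*lock, error), finishSetup func() error) (retErr error) {
	l, err := allocate()
	if err != nil {
		return err
	}
	defer func() {
		if retErr != nil {
			l.free() // do not leak on failure
		}
	}()
	return finishSetup()
}

func main() {
	alloc := func() (*lock, error) { return &lock{id: 42}, nil }
	fail := func() error { return fmt.Errorf("creation failed") }
	fmt.Println(createCtr(alloc, fail)) // frees lock 42, returns error
}
```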

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-02 12:51:39 -04:00