98 Commits

Author SHA1 Message Date
d7c367aa61 Address review comments on restart policy
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-03 10:36:16 -04:00
7ba1b609aa Move to using constants for valid restart policy types
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-03 10:36:16 -04:00
f4db6d5cf6 Add support for retry count with --restart flag
The on-failure restart option supports restarting only a given
number of times. To do this, we need one additional field in the
DB to track restart count (which conveniently fills a field in
Inspect we weren't populating), plus some plumbing logic.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-03 10:36:16 -04:00
dc42304f38 Sending signals to containers prevents restart policy
Noticed this when testing some behavior with Docker.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-03 10:36:16 -04:00
0d73ee40b2 Add container restart policy to Libpod & Podman
This initial version does not support restart count, but it works
as advertised otherwise.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-03 10:36:16 -04:00
0b2c9c2acc Add basic structure of podman init command
As part of this, rework the number of workers used by various
Podman tasks to match original behavior - need an explicit
fallthrough in the switch statement for that block to work as
expected.

Also, trivial change to Podman cleanup to work on initialized
containers - we need to reset to a different state after cleaning
up the OCI runtime.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-01 11:12:24 -04:00
09ff62429a Implement podman-remote rm
* refactor command output to use one function
* Add new worker pool parallel operations
* Implement podman-remote umount
* Refactored podman wait to use printCmdOutput()

Signed-off-by: Jhon Honce <jhonce@redhat.com>
2019-04-09 11:55:26 -07:00
99318b0894 Remove wait event
It's not necessary to log an event for a read-only operation like
wait.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-03-29 14:50:43 -04:00
22fc5a3e57 Merge pull request #2621 from mheon/event_on_death
Add event on container death
2019-03-13 12:03:07 -07:00
8f418f1568 Vendor docker/docker, fsouza and more #2
Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>

Vendors in fsouza/docker-client, docker/docker and
a few more related. Of particular note, changes to the TweakCapabilities()
function from docker/docker along with the parse.IDMappingOptions() function
from Buildah. Please pay particular attention to the related changes in
the call from libpod to those functions during the review.

Passes baseline tests.
2019-03-13 11:40:39 -04:00
3b5805d521 Add event on container death
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-03-13 10:18:51 -04:00
ca1e76ff63 Add event logging to libpod, even display to podman
In lipod, we now log major events that occurr.  These events
can be displayed using the `podman events` command. Each
event contains:

* Type (container, image, volume, pod...)
* Status (create, rm, stop, kill, ....)
* Timestamp in RFC3339Nano format
* Name (if applicable)
* Image (if applicable)

The format of the event and the varlink endpoint are to not
be considered stable until cockpit has done its enablement.

Signed-off-by: baude <bbaude@redhat.com>
2019-03-11 15:08:59 -05:00
0b34327ad4 exec: support --preserve-fds
Allow to pass additional FDs to the process being executed.

Closes: https://github.com/containers/libpod/issues/2372

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-02 11:45:42 +01:00
d780e69559 Allow Exec API user to override streams
Allow passing in of AttachStreams to libpod.Exec() for usage in podman healthcheck. An API caller can now specify different streams for stdout, stderr and stdin, or no streams at all.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-02-28 14:55:11 -05:00
7141f97270 OpenTracing support added to start, stop, run, create, pull, and ps
Drop context.Context field from cli.Context

Signed-off-by: Sebastian Jug <sejug@redhat.com>
2019-02-18 09:57:08 -05:00
81804fc464 pod infra container is started before a container in a pod is run, started, or attached.
Prior, a pod would have to be started immediately when created, leading to confusion about what a pod state should be immediately after creation. The problem was podman run --pod ... would error out if the infra container wasn't started (as it is a dependency). Fix this by allowing for recursive start, where each of the container's dependencies are started prior to the new container. This is only applied to the case where a new container is attached to a pod.

Also rework container_api Start, StartAndAttach, and Init functions, as there was some duplicated code, which made addressing the problem easier to fix.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-02-15 16:39:24 -05:00
43c6da22b9 Add darwin support for remote-client
Add the ability to cross-compile podman remote for OSX.

Also, add image exists and tag to remote-client.

Signed-off-by: baude <bbaude@redhat.com>
2019-01-11 11:30:28 -06:00
867669374c Add a --workdir option to 'podman exec'
Signed-off-by: Debarshi Ray <rishi@fedoraproject.org>
2019-01-08 17:42:37 +01:00
b945d9128a Add locking to Sync() on containers
Previously not needed as it only worked inside of Batch(), but
now that it can be called anywhere we need to add mutual
exclusion on its config changes.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2018-12-06 09:10:45 -05:00
a0c9be2061 Add --sync option to podman rm
With the changes made recently to ensure Podman does not hit the
OCI runtime as often to sync state, we can find ourselves in a
situation where the runtime's state does not match ours.

Add a --sync flag to podman rm to ensure we can still remove
containers when this happens.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2018-12-06 09:10:45 -05:00
61d4db4806 Fix golang formatting issues
Whe running unittests on newer golang versions, we observe failures with some
formatting types when no declared correctly.

Signed-off-by: baude <bbaude@redhat.com>
2018-11-28 09:26:24 -06:00
effd63d6d5 Merge pull request #1848 from adrianreber/master
Add tcp-established to checkpoint/restore
2018-11-28 07:00:24 -08:00
03c88a3deb Added tcp-established to checkpoint/restore
CRIU can checkpoint and restore processes/containers with established
TCP connections if the correct option is specified. To implement
checkpoint and restore with support for established TCP connections with
Podman this commit adds the necessary options to runc during checkpoint
and also tells conmon during restore to use 'runc restore' with
'--tcp-established'.

For this Podman feature to work a corresponding conmon change is
required.

Example:

$ podman run --tmpfs /tmp --name podman-criu-test -d docker://docker.io/yovfiatbeb/podman-criu-test
$ nc `podman inspect -l | jq -r '.[0].NetworkSettings.IPAddress'` 8080
GET /examples/servlets/servlet/HelloWorldExample
Connection: keep-alive

1
GET /examples/servlets/servlet/HelloWorldExample
Connection: keep-alive

2
$ # Using HTTP keep-alive multiple requests are send to the server in the container
$ # Different terminal:
$ podman container checkpoint -l
criu failed: type NOTIFY errno 0
$ # Looking at the log file would show errors because of established TCP connections
$ podman container checkpoint -l --tcp-established
$ # This works now and after the restore the same connection as above can be used for requests
$ podman container restore -l --tcp-established

The restore would fail without '--tcp-established' as the checkpoint image
contains established TCP connections.

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-11-28 08:00:38 +01:00
0592558289 Use also a struct to pass options to Restore()
This is basically the same change as

 ff47a4c2d5485fc49f937f3ce0c4e2fd6bdb1956 (Use a struct to pass options to Checkpoint())

just for the Restore() function. It is used to pass multiple restore
options to the API and down to conmon which is used to restore
containers. This is for the upcoming changes to support checkpointing
and restoring containers with '--tcp-established'.

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-11-28 08:00:37 +01:00
070ce0c855 exec: don't wait for pidfile when the runtime exited
don't wait for the timeout to expire if the runtime process exited.
I've noticed podman to hang on exit and keeping the container lock
taken when the OCI runtime already exited.

Additionally, it reduces the waiting time as we won't hit the 25
milliseconds waiting time in the worst case.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-11-27 12:34:11 +01:00
b0572d6229 Added option to keep containers running after checkpointing
CRIU supports to leave processes running after checkpointing:

  -R|--leave-running    leave tasks in running state after checkpoint

runc also support to leave containers running after checkpointing:

   --leave-running      leave the process running after checkpointing

With this commit the support to leave a container running after
checkpointing is brought to Podman:

   --leave-running, -R  leave the container running after writing checkpoint to disk

Now it is possible to checkpoint a container at some point in time
without stopping the container. This can be used to rollback the
container to an early state:

$ podman run --tmpfs /tmp --name podman-criu-test -d docker://docker.io/yovfiatbeb/podman-criu-test
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
3
$ podman container checkpoint -R -l
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
4
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
5
$ podman stop -l
$ podman container restore -l
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
4

So after checkpointing the container kept running and was stopped after
some time. Restoring this container will restore the state right at the
checkpoint.

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-11-20 17:25:44 +01:00
ff47a4c2d5 Use a struct to pass options to Checkpoint()
For upcoming changes to the Checkpoint() functions this commit switches
checkpoint options from a boolean to a struct, so that additional
options can be passed easily to Checkpoint() without changing the
function parameters all the time.

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-11-20 17:25:44 +01:00
c3d8328150 Increase pidWaitTimeout to 60s
At scale, it appears that we sometimes hit the 1000ms timeout to create
the PID file when a container is created or executed.
Increasing the value to 60s should help when running a lot of containers
in heavy-loaded environment.

Related #1495
Fixes #1816
Signed-off-by: Emilien Macchi <emilien@redhat.com>
2018-11-15 10:58:27 -05:00
1370c311f5 Merge pull request #1771 from baude/prepare
move defer'd function declaration ahead of prepare error return
2018-11-07 10:55:51 -08:00
e022efa0f8 move defer'd function declaration ahead of prepare error return
Signed-off-by: baude <bbaude@redhat.com>
2018-11-07 10:44:33 -06:00
140f87c474 EXPERIMENTAL: Do not call out to runc for sync
When syncing container state, we normally call out to runc to see
the container's status. This does have significant performance
implications, though, and we've seen issues with large amounts of
runc processes being spawned.

This patch attempts to use stat calls on the container exit file
created by Conmon instead to sync state. This massively decreases
the cost of calling updateContainer (it has gone from an
almost-unconditional fork/exec of runc to a single stat call that
can be avoided in most states).

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-11-07 11:36:01 -05:00
319a7a7043 Merge pull request #1715 from baude/getusergroup
get user and group information using securejoin and runc's user library
2018-10-30 11:49:15 -07:00
079208cdbc unmount: fix error logic
Only return `ErrCtrStateInvalid` errors when the mount counter is equal
to 1.  Also fix the "can't unmount [...] last mount[..]" error which
hasn't been returned when the error passed to `errors.Errorf()` is nil.

Fixes: #1695
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
2018-10-29 15:46:54 +01:00
1dd7f13dfb get user and group information using securejoin and runc's user library
for the purposes of performance and security, we use securejoin to contstruct
the root fs's path so that symlinks are what they appear to be and no pointing
to something naughty.

then instead of chrooting to parse /etc/passwd|/etc/group, we now use the runc user/group
methods which saves us quite a bit of performance.

Signed-off-by: baude <bbaude@redhat.com>
2018-10-29 08:59:46 -05:00
ee8f19e7be Make podman ps fast
Like Ricky Bobby, we want to go fast.

Signed-off-by: baude <bbaude@redhat.com>
2018-10-23 08:26:21 -05:00
f7c8fd8a3d Add support to checkpoint/restore containers
runc uses CRIU to support checkpoint and restore of containers. This
brings an initial checkpoint/restore implementation to podman.

None of the additional runc flags are yet supported and container
migration optimization (pre-copy/post-copy) is also left for the future.

The current status is that it is possible to checkpoint and restore a
container. I am testing on RHEL-7.x and as the combination of RHEL-7 and
CRIU has seccomp troubles I have to create the container without
seccomp.

With the following steps I am able to checkpoint and restore a
container:

 # podman run --security-opt="seccomp=unconfined" -d registry.fedoraproject.org/f27/httpd
 # curl -I 10.22.0.78:8080
 HTTP/1.1 403 Forbidden # <-- this is actually a good answer
 # podman container checkpoint <container>
 # curl -I 10.22.0.78:8080
 curl: (7) Failed connect to 10.22.0.78:8080; No route to host
 # podman container restore <container>
 # curl -I 10.22.0.78:8080
 HTTP/1.1 403 Forbidden

I am using CRIU, runc and conmon from git. All required changes for
checkpoint/restore support in podman have been merged in the
corresponding projects.

To have the same IP address in the restored container as before
checkpointing, CNI is told which IP address to use.

If the saved network configuration cannot be found during restore, the
container is restored with a new IP address.

For CRIU to restore established TCP connections the IP address of the
network namespace used for restore needs to be the same. For TCP
connections in the listening state the IP address can change.

During restore only one network interface with one IP address is handled
correctly. Support to restore containers with more advanced network
configuration will be implemented later.

v2:
 * comment typo
 * print debug messages during cleanup of restore files
 * use createContainer() instead of createOCIContainer()
 * introduce helper CheckpointPath()
 * do not try to restore a container that is paused
 * use existing helper functions for cleanup
 * restructure code flow for better readability
 * do not try to restore if checkpoint/inventory.img is missing
 * git add checkpoint.go restore.go

v3:
 * move checkpoint/restore under 'podman container'

v4:
 * incorporated changes from latest reviews

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-10-03 21:41:39 +02:00
2c7f97d5a7 Add ContainerStateExited and OCI delete() in cleanup()
To work better with Kata containers, we need to delete() from the
OCI runtime as a part of cleanup, to ensure resources aren't
retained longer than they need to be.

To enable this, we need to add a new state to containers,
ContainerStateExited. Containers transition from
ContainerStateStopped to ContainerStateExited via cleanupRuntime
which is invoked as part of cleanup(). A container in the Exited
state is identical to Stopped, except it has been removed from
the OCI runtime and thus will be handled differently when
initializing the container.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 12:05:22 -04:00
9e81f9daa4 Refactor Wait() to not require a timeout
We added a timeout for convenience, but most invocations don't
care about it. Refactor it into WaitWithTimeout() and add a
Wait() that doesn't require a timeout and uses the default.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #1527
Approved by: mheon
2018-09-21 20:07:51 +00:00
0b2cfa7fcf Increase pidWaitTimeout to 1000ms
When managing the containers with systemd, it takes a bit more than
250ms to have podman creating the pidfile.
Increasing the value to 1 second will avoid timeout issues when running
a lot of containers managed by systemd.

This patch was tested in a VM with 56 services (OpenStack) deployed by
TripleO and managed by systemd.

Fixes #1495

Signed-off-by: Emilien Macchi <emilien@redhat.com>

Closes: #1497
Approved by: rhatdan
2018-09-18 12:24:39 +00:00
9ec82caa31 Add --interval flag to podman wait
Waiting uses a lot of CPU, so drop back to checking once/second
and allow user to pass in the interval.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-09-13 10:11:00 -04:00
4291a43a54 Up time between checks for podman wait
Prior to this patch, we were polling continuously to check if a
container had died. This patch changes this to poll 10 times a
second, which should be more than sufficient and drastically
reduce CPU utilization.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-08-31 13:18:34 -04:00
c5753f57c1 rootless: exec handle processes that create an user namespace
Manage the case where the main process of the container creates and
joins a new user namespace.

In this case we want to join only the first child in the new
hierarchy, which is the user namespace that was used to create the
container.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

Closes: #1331
Approved by: rhatdan
2018-08-26 07:22:42 +00:00
c276a13880 Properly translate users into runc format for exec
Runc exec expects the --user flag to be formatted as UID:GID.
Use chrootuser code to translate whatever user is passed to exec
into this format.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #1315
Approved by: vrothberg
2018-08-23 12:07:59 +00:00
d20f3a5146 switch projectatomic to containers
Need to get some small changes into libpod to pull back into buildah
to complete buildah transition.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #1270
Approved by: mheon
2018-08-16 17:12:36 +00:00
92e9d7891e We need to sort mounts so that one mount does not over mount another.
Currently we add mounts from images, volumes and internal.
We can accidently over mount an existing mount.  This patch sorts the mounts
to make sure a parent directory is always mounted before its content.

Had to change the default propagation on image volume mounts from shared
to private to stop mount points from leaking out of the container.

Also switched from using some docker/docker/pkg to container/storage/pkg
to remove some dependencies on Docker.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #1243
Approved by: mheon
2018-08-10 21:18:19 +00:00
8e1ef558eb Add --force to podman umount to force the unmounting of the rootfs
podman umount will currently only unmount file system if not other
process is using it, otherwise the umount decrements the container
storage to indicate that the caller is no longer using the mount
point, once the count gets to 0, the file system is actually unmounted.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #1184
Approved by: TomSweeneyRedHat
2018-08-01 17:53:30 +00:00
7789284cbe Added pod.Restart() functionality to libpod.
Moved contents of RestartWithTimeout to restartWithTimeout in container_internal to be able to call restart without locking in function.
Refactored startNode to be able to either start or restart a node.
Built pod Restart() with new startNode with refresh true.

Signed-off-by: haircommander <pehunt@redhat.com>

Closes: #1152
Approved by: rhatdan
2018-07-25 17:54:27 +00:00
85db3f09bf Let containers/storage keep track of mounts
Currently we unmount storage that is still in use.
We should not be unmounting storeage that we mounted
via a different command or by podman mount. This
change relies on containers/storage to umount keep track of
how many times the storage was mounted before really unmounting
it from the system.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-07-19 17:01:07 -04:00
4f9b1ae625 Allow Init() on stopped containers
Signed-off-by: Matthew Heon <mheon@redhat.com>

Closes: #1068
Approved by: baude
2018-07-09 20:33:09 +00:00
c3602075ec Make CGroups cleanup optional on whether they exist
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #981
Approved by: baude
2018-06-22 19:26:46 +00:00