for the purposes of performance and security, we use securejoin to contstruct
the root fs's path so that symlinks are what they appear to be and no pointing
to something naughty.
then instead of chrooting to parse /etc/passwd|/etc/group, we now use the runc user/group
methods which saves us quite a bit of performance.
Signed-off-by: baude <bbaude@redhat.com>
runc uses CRIU to support checkpoint and restore of containers. This
brings an initial checkpoint/restore implementation to podman.
None of the additional runc flags are yet supported and container
migration optimization (pre-copy/post-copy) is also left for the future.
The current status is that it is possible to checkpoint and restore a
container. I am testing on RHEL-7.x and as the combination of RHEL-7 and
CRIU has seccomp troubles I have to create the container without
seccomp.
With the following steps I am able to checkpoint and restore a
container:
# podman run --security-opt="seccomp=unconfined" -d registry.fedoraproject.org/f27/httpd
# curl -I 10.22.0.78:8080
HTTP/1.1 403 Forbidden # <-- this is actually a good answer
# podman container checkpoint <container>
# curl -I 10.22.0.78:8080
curl: (7) Failed connect to 10.22.0.78:8080; No route to host
# podman container restore <container>
# curl -I 10.22.0.78:8080
HTTP/1.1 403 Forbidden
I am using CRIU, runc and conmon from git. All required changes for
checkpoint/restore support in podman have been merged in the
corresponding projects.
To have the same IP address in the restored container as before
checkpointing, CNI is told which IP address to use.
If the saved network configuration cannot be found during restore, the
container is restored with a new IP address.
For CRIU to restore established TCP connections the IP address of the
network namespace used for restore needs to be the same. For TCP
connections in the listening state the IP address can change.
During restore only one network interface with one IP address is handled
correctly. Support to restore containers with more advanced network
configuration will be implemented later.
v2:
* comment typo
* print debug messages during cleanup of restore files
* use createContainer() instead of createOCIContainer()
* introduce helper CheckpointPath()
* do not try to restore a container that is paused
* use existing helper functions for cleanup
* restructure code flow for better readability
* do not try to restore if checkpoint/inventory.img is missing
* git add checkpoint.go restore.go
v3:
* move checkpoint/restore under 'podman container'
v4:
* incorporated changes from latest reviews
Signed-off-by: Adrian Reber <areber@redhat.com>
To work better with Kata containers, we need to delete() from the
OCI runtime as a part of cleanup, to ensure resources aren't
retained longer than they need to be.
To enable this, we need to add a new state to containers,
ContainerStateExited. Containers transition from
ContainerStateStopped to ContainerStateExited via cleanupRuntime
which is invoked as part of cleanup(). A container in the Exited
state is identical to Stopped, except it has been removed from
the OCI runtime and thus will be handled differently when
initializing the container.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
We added a timeout for convenience, but most invocations don't
care about it. Refactor it into WaitWithTimeout() and add a
Wait() that doesn't require a timeout and uses the default.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #1527
Approved by: mheon
When managing the containers with systemd, it takes a bit more than
250ms to have podman creating the pidfile.
Increasing the value to 1 second will avoid timeout issues when running
a lot of containers managed by systemd.
This patch was tested in a VM with 56 services (OpenStack) deployed by
TripleO and managed by systemd.
Fixes#1495
Signed-off-by: Emilien Macchi <emilien@redhat.com>
Closes: #1497
Approved by: rhatdan
Waiting uses a lot of CPU, so drop back to checking once/second
and allow user to pass in the interval.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Prior to this patch, we were polling continuously to check if a
container had died. This patch changes this to poll 10 times a
second, which should be more than sufficient and drastically
reduce CPU utilization.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Manage the case where the main process of the container creates and
joins a new user namespace.
In this case we want to join only the first child in the new
hierarchy, which is the user namespace that was used to create the
container.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Closes: #1331
Approved by: rhatdan
Runc exec expects the --user flag to be formatted as UID:GID.
Use chrootuser code to translate whatever user is passed to exec
into this format.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #1315
Approved by: vrothberg
Need to get some small changes into libpod to pull back into buildah
to complete buildah transition.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #1270
Approved by: mheon
Currently we add mounts from images, volumes and internal.
We can accidently over mount an existing mount. This patch sorts the mounts
to make sure a parent directory is always mounted before its content.
Had to change the default propagation on image volume mounts from shared
to private to stop mount points from leaking out of the container.
Also switched from using some docker/docker/pkg to container/storage/pkg
to remove some dependencies on Docker.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #1243
Approved by: mheon
podman umount will currently only unmount file system if not other
process is using it, otherwise the umount decrements the container
storage to indicate that the caller is no longer using the mount
point, once the count gets to 0, the file system is actually unmounted.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #1184
Approved by: TomSweeneyRedHat
Moved contents of RestartWithTimeout to restartWithTimeout in container_internal to be able to call restart without locking in function.
Refactored startNode to be able to either start or restart a node.
Built pod Restart() with new startNode with refresh true.
Signed-off-by: haircommander <pehunt@redhat.com>
Closes: #1152
Approved by: rhatdan
Currently we unmount storage that is still in use.
We should not be unmounting storeage that we mounted
via a different command or by podman mount. This
change relies on containers/storage to umount keep track of
how many times the storage was mounted before really unmounting
it from the system.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The Refresh() function is used to reset a container's state after
a database format change to state is made that requires migration
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #981
Approved by: baude
Since we are checking if err is non nil in defer function we need
to define it, so that the check will work correctly.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #985
Approved by: mheon
Move the StartContainer call after the attach to the UNIX socket. It
solves a race where the StartContainer could be done earlier and a
short-lived container could already exit by the time we tried to
attach to the socket.
Closes: https://github.com/projectatomic/libpod/issues/835
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This allows us to restart containers that have never been started
without error. This makes RestartWithTimeout work with running,
stopped, and created containers.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #719
Approved by: rhatdan
first pass at adding in the container related endpoints/methods for the libpod
backend. Couple of important notes:
* endpoints that can use a console are not going to be done until we have "remote" console
* several of the container methods should probably be able to stream as opposed to a one-off return
Signed-off-by: baude <bbaude@redhat.com>
Closes: #708
Approved by: baude
Made necessary changes to functions to include contex.Context wherever needed
Signed-off-by: umohnani8 <umohnani@redhat.com>
Closes: #640
Approved by: baude
Comparing Go interfaces, like io.Reader, to nil does not work. As
such, we need to include a bool with each stream telling whether
to attach to it.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #608
Approved by: baude
This allows us to attach to attach to just stdout or stderr or
stdin, or any combination of these.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #608
Approved by: baude
Instead of checking during init(), which could result in major
locking issues when used with pods, make our dependency checks in
the public API instead. This avoids doing them when we start pods
(where, because of the dependency graph, we can reasonably say
all dependencies are up before we start a container).
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #577
Approved by: rhatdan
Cull funcs from runtime_img.go which are no longer needed. Also, fix any remaining
spots that use the old image technique.
Signed-off-by: baude <bbaude@redhat.com>
Closes: #532
Approved by: mheon
Migrate the podman create and commit subcommandis to leverage the images library. I also had
to migrate the cmd/ portions of run and rmi.
Signed-off-by: baude <bbaude@redhat.com>
Closes: #498
Approved by: mheon
This solves our prior problems with attach races by ensuring the
order is correct.
Also contains substantial cleanups to the attach code.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #482
Approved by: baude
Separate Init() and Start() does not make sense on the pod side,
where we may have to start containers in order to initialize
others due to dependency orders.
Also adjusts internal containers API for more code sharing.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #478
Approved by: rhatdan
Refactors creation of bind mounts into a separate function that
can be called from elsewhere (e.g. pod start or container
restart). This function stores the mounts in the DB using the
field established last commit.
Spec generation now relies upon this field in the DB instead of
manually enumerating files to be bind mounted in.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #462
Approved by: baude
The progress should not be show for import, load, and commit. It makes machine
parsing of the output much more difficult. Also, each command should output an
image ID or name for the user.
Added a --verbose flag for users that still want to see progress.
Resolves issue #450
Signed-off-by: baude <bbaude@redhat.com>
Closes: #456
Approved by: rhatdan
Notify conmon when the terminal size changes. Use the same notification
to set the correct initial size.
Closes: https://github.com/projectatomic/libpod/issues/351
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Closes: #448
Approved by: baude
This will behave better if we need to add anything to it at a
later date - we can add fields to the struct without breaking
existing BoltDB databases.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #412
Approved by: baude
This allows containers to be used by `ps` and other commands
while they have ongoing exec sessions. Concurrent exec should
also work but is not tested.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #412
Approved by: baude
This ensures that containers with active exec sessions will not
have storage unmounted under them or network namespaces destroyed
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #412
Approved by: baude
Exec sessions now have an ID generated and assigned to their PID
and stored in the database state. This allows us to track what
exec sessions are currently active.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #412
Approved by: baude