155 Commits

Author SHA1 Message Date
c657dc4fdb Switch all referencs to image.ContainerConfig to image.Config
This will more closely match what Docker is doing.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-12-21 15:59:34 -05:00
bc57ecec42 Prevent a second lookup of user for image volumes
Instead of forcing another user lookup when mounting image
volumes, just use the information we looked up when we started
generating the spec.

This may resolve #1817

Signed-off-by: Matthew Heon <mheon@redhat.com>
2018-12-11 13:36:50 -05:00
176f76d794 Fix errors where OCI hooks directory does not exist
Signed-off-by: Matthew Heon <mheon@redhat.com>
2018-12-07 11:35:43 -05:00
39a036e24d bind mount /etc/resolv.conf|hosts in pods
containers inside pods need to make sure they get /etc/resolv.conf
and /etc/hosts bind mounted when network is expected

Signed-off-by: baude <bbaude@redhat.com>
2018-12-06 13:56:57 -06:00
a4b483c848 libpod/container_internal: Deprecate implicit hook directories
Part of the motivation for 800eb863 (Hooks supports two directories,
process default and override, 2018-09-17, #1487) was [1]:

> We only use this for override. The reason this was caught is people
> are trying to get hooks to work with CoreOS. You are not allowed to
> write to /usr/share... on CoreOS, so they wanted podman to also look
> at /etc, where users and third parties can write.

But we'd also been disabling hooks completely for rootless users.  And
even for root users, the override logic was tricky when folks actually
had content in both directories.  For example, if you wanted to
disable a hook from the default directory, you'd have to add a no-op
hook to the override directory.

Also, the previous implementation failed to handle the case where
there hooks defined in the override directory but the default
directory did not exist:

  $ podman version
  Version:       0.11.2-dev
  Go Version:    go1.10.3
  Git Commit:    "6df7409cb5a41c710164c42ed35e33b28f3f7214"
  Built:         Sun Dec  2 21:30:06 2018
  OS/Arch:       linux/amd64
  $ ls -l /etc/containers/oci/hooks.d/test.json
  -rw-r--r--. 1 root root 184 Dec  2 16:27 /etc/containers/oci/hooks.d/test.json
  $ podman --log-level=debug run --rm docker.io/library/alpine echo 'successful container' 2>&1 | grep -i hook
  time="2018-12-02T21:31:19-08:00" level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d"
  time="2018-12-02T21:31:19-08:00" level=warning msg="failed to load hooks: {}%!(EXTRA *os.PathError=open /usr/share/containers/oci/hooks.d: no such file or directory)"

With this commit:

  $ podman --log-level=debug run --rm docker.io/library/alpine echo 'successful container' 2>&1 | grep -i hook
  time="2018-12-02T21:33:07-08:00" level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d"
  time="2018-12-02T21:33:07-08:00" level=debug msg="reading hooks from /etc/containers/oci/hooks.d"
  time="2018-12-02T21:33:07-08:00" level=debug msg="added hook /etc/containers/oci/hooks.d/test.json"
  time="2018-12-02T21:33:07-08:00" level=debug msg="hook test.json matched; adding to stages [prestart]"
  time="2018-12-02T21:33:07-08:00" level=warning msg="implicit hook directories are deprecated; set --hooks-dir="/etc/containers/oci/hooks.d" explicitly to continue to load hooks from this directory"
  time="2018-12-02T21:33:07-08:00" level=error msg="container create failed: container_linux.go:336: starting container process caused "process_linux.go:399: container init caused \"process_linux.go:382: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: oh, noes!\\\\n\\\"\""

(I'd setup the hook to error out).  You can see that it's silenly
ignoring the ENOENT for /usr/share/containers/oci/hooks.d and
continuing on to load hooks from /etc/containers/oci/hooks.d.

When it loads the hook, it also logs a warning-level message
suggesting that callers explicitly configure their hook directories.
That will help consumers migrate, so we can drop the implicit hook
directories in some future release.  When folks *do* explicitly
configure hook directories (via the newly-public --hooks-dir and
hooks_dir options), we error out if they're missing:

  $ podman --hooks-dir /does/not/exist run --rm docker.io/library/alpine echo 'successful container'
  error setting up OCI Hooks: open /does/not/exist: no such file or directory

I've dropped the trailing "path" from the old, hidden --hooks-dir-path
and hooks_dir_path because I think "dir(ectory)" is already enough
context for "we expect a path argument".  I consider this name change
non-breaking because the old forms were undocumented.

Coming back to rootless users, I've enabled hooks now.  I expect they
were previously disabled because users had no way to avoid
/usr/share/containers/oci/hooks.d which might contain hooks that
required root permissions.  But now rootless users will have to
explicitly configure hook directories, and since their default config
is from ~/.config/containers/libpod.conf, it's a misconfiguration if
it contains hooks_dir entries which point at directories with hooks
that require root access.  We error out so they can fix their
libpod.conf.

[1]: https://github.com/containers/libpod/pull/1487#discussion_r218149355

Signed-off-by: W. Trevor King <wking@tremily.us>
2018-12-03 12:54:30 -08:00
b504623a11 Merge pull request #1317 from rhatdan/privileged
Disable mount options when running --privileged
2018-11-30 11:09:51 -08:00
a5be3ffa4d /dev/shm should be mounted even in rootless mode.
Currently we are mounting /dev/shm from disk, it should be from a tmpfs.
User Namespace supports tmpfs mounts for nonroot users, so this section of
code should work fine in bother root and rootless mode.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-11-28 15:48:25 -05:00
61d4db4806 Fix golang formatting issues
Whe running unittests on newer golang versions, we observe failures with some
formatting types when no declared correctly.

Signed-off-by: baude <bbaude@redhat.com>
2018-11-28 09:26:24 -06:00
effd63d6d5 Merge pull request #1848 from adrianreber/master
Add tcp-established to checkpoint/restore
2018-11-28 07:00:24 -08:00
3beacb73bc Disable mount options when running --privileged
We now default to setting storage options to "nodev", when running
privileged containers, we need to turn this off so the processes can
manipulate the image.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-11-28 07:53:28 -05:00
95f22a2ca0 network: allow slirp4netns mode also for root containers
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-11-28 09:21:59 +01:00
0592558289 Use also a struct to pass options to Restore()
This is basically the same change as

 ff47a4c2d5485fc49f937f3ce0c4e2fd6bdb1956 (Use a struct to pass options to Checkpoint())

just for the Restore() function. It is used to pass multiple restore
options to the API and down to conmon which is used to restore
containers. This is for the upcoming changes to support checkpointing
and restoring containers with '--tcp-established'.

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-11-28 08:00:37 +01:00
bb6c1cf8d1 libpod should know if the network is disabled
/etc/resolv.conf and /etc/hosts should not be created and mounted when the
network is disabled.

We should not be calling the network setup and cleanup functions when it is
disabled either.

In doing this patch, I found that all of the bind mounts were particular to
Linux along with the generate functions, so I moved them to
container_internal_linux.go

Since we are checking if we are using a network namespace, we need to check
after the network namespaces has been created in the spec.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-11-13 06:33:10 -05:00
e9f8aed407 Merge pull request #1764 from rhatdan/nopasswd
Don't fail if /etc/passwd or /etc/group does not exists
2018-11-07 11:24:57 -08:00
1370c311f5 Merge pull request #1771 from baude/prepare
move defer'd function declaration ahead of prepare error return
2018-11-07 10:55:51 -08:00
ae03137861 Merge pull request #1689 from mheon/add_runc_timeout
Do not call out to runc for sync
2018-11-07 09:36:03 -08:00
e022efa0f8 move defer'd function declaration ahead of prepare error return
Signed-off-by: baude <bbaude@redhat.com>
2018-11-07 10:44:33 -06:00
ae68bec75c Don't fail if /etc/passwd or /etc/group does not exists
Container images can be created without passwd or group file, currently
if one of these containers gets run with a --user flag the container blows
up complaining about t a missing /etc/passwd file.

We just need to check if the error on read is ENOEXIST then allow the
read to return, not fail.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-11-07 11:41:51 -05:00
536af1f689 Print error status code if we fail to parse it
When we read the conmon error status file, if Atoi fails to parse
the string we read from the file as an int, print the string as
part of the error message so we know what might have gone wrong.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2018-11-07 11:36:01 -05:00
c9e9ca5671 Properly set Running state when starting containers
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-11-07 11:36:01 -05:00
3286b0185d Retrieve container PID from conmon
Instead of running a full sync after starting a container to pick
up its PID, grab it from Conmon instead.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-11-07 11:36:01 -05:00
140f87c474 EXPERIMENTAL: Do not call out to runc for sync
When syncing container state, we normally call out to runc to see
the container's status. This does have significant performance
implications, though, and we've seen issues with large amounts of
runc processes being spawned.

This patch attempts to use stat calls on the container exit file
created by Conmon instead to sync state. This massively decreases
the cost of calling updateContainer (it has gone from an
almost-unconditional fork/exec of runc to a single stat call that
can be avoided in most states).

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-11-07 11:36:01 -05:00
f714ee4fb1 Actually save changes from post-stop sync
After stopping containers, we run updateContainerStatus to sync
our state with runc (pick up exit code, for example). Then we
proceed to not save this to the database, requiring us to grab it
again on the next sync. This should remove the need to read the
exit file more than once.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-11-07 11:36:01 -05:00
879f9116de Add hostname to /etc/hosts
Signed-off-by: Qi Wang <qiwan@redhat.com>
2018-11-07 09:55:59 -05:00
1dd7f13dfb get user and group information using securejoin and runc's user library
for the purposes of performance and security, we use securejoin to contstruct
the root fs's path so that symlinks are what they appear to be and no pointing
to something naughty.

then instead of chrooting to parse /etc/passwd|/etc/group, we now use the runc user/group
methods which saves us quite a bit of performance.

Signed-off-by: baude <bbaude@redhat.com>
2018-10-29 08:59:46 -05:00
e2aef6341d run prepare in parallel
run prepare() -- which consists of creating a network namespace and
mounting the container image is now run in parallel.   This saves 25-40ms.

Signed-off-by: baude <bbaude@redhat.com>
2018-10-25 06:34:23 -05:00
a95d71f113 Allow containers/storage to handle on SELinux labeling
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-10-23 10:57:23 -04:00
81e63ac309 Merge pull request #1609 from giuseppe/fix-volume-rootless
volume: resolve symlink paths in volumes
2018-10-16 13:25:27 -04:00
d8d4c0f0e1 Touchup fileo typo
Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
2018-10-15 08:13:42 -04:00
6dd6ce1ebc volume: resolve symlinks in paths
ensure the volume paths are resolved in the mountpoint scope.

Otherwise we might end up using host paths.

Closes: https://github.com/containers/libpod/issues/1608

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-10-14 16:57:30 +02:00
2ad6012ea1 volume: write the correct ID of the container in error messages
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-10-14 16:57:29 +02:00
04a537756d Generate a passwd file for users not in container
If someone runs podman as a user (uid) that is not defined in the container
we want generate a passwd file so that getpwuid() will work inside of container.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-10-12 07:08:13 -04:00
e9ab8583d0 Ensure resolv.conf has the right label and path
Adds a few missing things from writeStringToRundir() to the new
resolv.conf function, specifically relabelling and returning a
path compatible with rootless podman

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-04 17:38:09 -04:00
52de75501c Drop libnetwork vendor and move the code into pkg/
The vendoring issues with libnetwork were significant (it was
dragging in massive amounts of code) and were just not worth
spending the time to work through. Highly unlikely we'll ever end
up needing to update this code, so move it directly into pkg/ so
we don't need to vendor libnetwork. Make a few small changes to
remove the need for the remainder of libnetwork.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-04 17:34:59 -04:00
e4ded6ce7f Switch to using libnetwork's resolvconf package
Libnetwork provides a well-tested package for generating
resolv.conf from the host's that has some features our current
implementation does not. Swap to using their code and remove our
built-in implementation.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-04 17:34:59 -04:00
f7c8fd8a3d Add support to checkpoint/restore containers
runc uses CRIU to support checkpoint and restore of containers. This
brings an initial checkpoint/restore implementation to podman.

None of the additional runc flags are yet supported and container
migration optimization (pre-copy/post-copy) is also left for the future.

The current status is that it is possible to checkpoint and restore a
container. I am testing on RHEL-7.x and as the combination of RHEL-7 and
CRIU has seccomp troubles I have to create the container without
seccomp.

With the following steps I am able to checkpoint and restore a
container:

 # podman run --security-opt="seccomp=unconfined" -d registry.fedoraproject.org/f27/httpd
 # curl -I 10.22.0.78:8080
 HTTP/1.1 403 Forbidden # <-- this is actually a good answer
 # podman container checkpoint <container>
 # curl -I 10.22.0.78:8080
 curl: (7) Failed connect to 10.22.0.78:8080; No route to host
 # podman container restore <container>
 # curl -I 10.22.0.78:8080
 HTTP/1.1 403 Forbidden

I am using CRIU, runc and conmon from git. All required changes for
checkpoint/restore support in podman have been merged in the
corresponding projects.

To have the same IP address in the restored container as before
checkpointing, CNI is told which IP address to use.

If the saved network configuration cannot be found during restore, the
container is restored with a new IP address.

For CRIU to restore established TCP connections the IP address of the
network namespace used for restore needs to be the same. For TCP
connections in the listening state the IP address can change.

During restore only one network interface with one IP address is handled
correctly. Support to restore containers with more advanced network
configuration will be implemented later.

v2:
 * comment typo
 * print debug messages during cleanup of restore files
 * use createContainer() instead of createOCIContainer()
 * introduce helper CheckpointPath()
 * do not try to restore a container that is paused
 * use existing helper functions for cleanup
 * restructure code flow for better readability
 * do not try to restore if checkpoint/inventory.img is missing
 * git add checkpoint.go restore.go

v3:
 * move checkpoint/restore under 'podman container'

v4:
 * incorporated changes from latest reviews

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-10-03 21:41:39 +02:00
3750b35ae2 Merge pull request #1578 from baude/addubuntuci
Add Ubuntu-18.04 to CI testing
2018-10-03 11:35:13 -07:00
14473270d7 Add ability for ubuntu to be tested
unfortunately the papr CI system cannot test ubuntu as a VM; therefore,
this PR still keeps travis.  but it does include fixes that will be required
for running on modern versions of ubuntu.

Signed-off-by: baude <bbaude@redhat.com>
2018-10-03 12:45:37 -05:00
7abf46d15e selinux: drop superflous relabel
The same relabel is already done in writeStringToRundir so we don't
need to do it twice.  The version in writeStringToRundir takes into
account the correct file path when using user namespaces.

Closes: https://github.com/containers/libpod/pull/1584

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-10-03 16:54:28 +02:00
a3c4ce6717 Merge pull request #1531 from mheon/add_exited_state
Add ContainerStateExited and OCI delete() in cleanup()
2018-10-03 06:06:14 -07:00
b7c5fa70ab Fix Wait() to allow Exited state as well as Stopped
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 14:26:19 -04:00
7e23fb6c5d Fix cleanupRuntime to only save if container is valid
We call cleanup() (which calls cleanupRuntime()) as part of
removing containers, after the container has already been removed
from the database. cleanupRuntime() tries to update and save the
state, which obviously fails if the container no longer exists.
Make the save() conditional on the container not being in the
process of being removed.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 13:47:53 -04:00
b63b1f9cb6 Merge pull request #1562 from mheon/update_install_instructions
Update docs to build a runc that works with systemd
2018-10-02 10:34:32 -07:00
2c7f97d5a7 Add ContainerStateExited and OCI delete() in cleanup()
To work better with Kata containers, we need to delete() from the
OCI runtime as a part of cleanup, to ensure resources aren't
retained longer than they need to be.

To enable this, we need to add a new state to containers,
ContainerStateExited. Containers transition from
ContainerStateStopped to ContainerStateExited via cleanupRuntime
which is invoked as part of cleanup(). A container in the Exited
state is identical to Stopped, except it has been removed from
the OCI runtime and thus will be handled differently when
initializing the container.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 12:05:22 -04:00
4fe1979b9c Need to allocate memory for hook struct
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-10-02 05:47:29 -04:00
dd73525fd5 Update docs to build a runc that works with systemd
Runc disables systemd cgroup support when build statically, so
don't tell people to do that now that we're defaulting to systemd
for cgroup management.

Also, fix some error messages to use the proper ID() call for
containers.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-01 10:23:06 -04:00
52c1365f32 Add --mount option for create & run command
Signed-off-by: Kunal Kushwaha <kushwaha_kunal_v7@lab.ntt.co.jp>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #1524
Approved by: mheon
2018-09-21 21:33:41 +00:00
800eb86338 Hooks supports two directories, process default and override
ALso cleanup files section or podman man page

Add description of policy.json
Sort alphabetically.
Add more info on  oci hooks

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #1487
Approved by: umohnani8
2018-09-17 16:28:28 +00:00
294c3f4cab container: resolve rootfs symlinks
Prevent a runc error that doesn't like symlinks as part
of the rootfs.

Closes: https://github.com/containers/libpod/issues/1389

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

Closes: #1390
Approved by: rhatdan
2018-08-31 17:37:20 +00:00
822c327997 Resolve /etc/resolv.conf before reading
In some cases, /etc/resolv.conf can be a symlink to something like
/run/systemd/resolve/resolv.conf.  We currently check for that file
and if it exists, use it instead of /etc/resolv.conf. However, we are
no seeing cases where the systemd resolv.conf exists but /etc/resolv.conf
is NOT a symlink.

Therefore, we now obtain the endpoint for /etc/resolv.conf whether it is a
symlink or not.  That endpoint is now what is read to generate a container's
resolv.conf.

Signed-off-by: baude <bbaude@redhat.com>

Closes: #1368
Approved by: rhatdan
2018-08-28 17:03:19 +00:00