1297 Commits

Author SHA1 Message Date
e01ca4912e Spoof json-file logging support
For docker scripting compatibility, allow for json-file logging when creating args for conmon. That way, when json-file is supported, that case can be easily removed.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-11-11 15:58:17 -05:00
b49ff958d5 performance fix for podman events with large journalds
in the case where the host has a large journald, iterating the journal
without using a Match is very poor performance.  this might be a
temporary fix while we figure out why the systemd library does not seem to
behave properly.

Signed-off-by: baude <bbaude@redhat.com>
2019-08-27 15:46:51 -04:00
bbb1ba78d4 Small optimization - only store exit code when nonzero
JSON optimizes it out in that case anyways, so don't waste cycles
doing an Itoa (and Atoi on the decode side).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-27 15:45:51 -04:00
3849d8cbfb Fix container exit code with Journald backend
We weren't actually storing this, so we'd lose the exit code for
containers run with --rm or force-removed while running if the
journald backend for events was in use.

Fixes #3795

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-27 15:45:43 -04:00
5255a1bb11 Avoid a read-write transaction on DB init
Instead, use a less expensive read-only transaction to see if the
DB is ready for use (it probably is), and only fire the expensive
RW transaction if absolutely necessary.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-08-05 14:40:28 -04:00
a682cd3814 Make configuration validation not require a DB commit
If there are missing fields, we still require a commit, but that
should not happen often.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-05 14:40:28 -04:00
9e5175b06c Remove exec PID files after use to prevent memory leaks
We have another patch running to do the same for exit files, with
a much more in-depth explanation of why it's necessary. Suffice
to say that persistent files in tmpfs tied to container CGroups
lead to significant memory allocations that last for the lifetime
of the file.

Based on a patch by Andrea Arcangeli (aarcange@redhat.com).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-08-05 14:40:28 -04:00
c528d7a622 Use file-based eventer for integration tests
This adds several top-level Podman flags for specifying different
events backend types, which are then used in CI. It resolves a
number of serious issues with events-based testing.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-05 14:40:28 -04:00
c115f4eca0 get last container event
an internal change in libpod will soon required the ability to lookup
the last container event using the continer name or id and the type of
event.  this pr is in preperation for that need.

Signed-off-by: baude <bbaude@redhat.com>
2019-08-05 14:40:28 -04:00
7e0a6a57ef podman: fix memleak caused by renaming and not deleting
the exit file

If the container exit code needs to be retained, it cannot be retained
in tmpfs, because libpod runs in a memcg itself so it can't leave
traces with a daemon-less design.

This wasn't a memleak detectable by kmemleak for example. The kernel
never lost track of the memory and there was no erroneous refcounting
either. The reference count dependencies however are not easy to track
because when a refcount is increased, there's no way to tell who's
still holding the reference. In this case it was a single page of
tmpfs pagecache holding a refcount that kept pinned a whole hierarchy
of dying memcg, slab kmem, cgropups, unrechable kernfs nodes and the
respective dentries and inodes. Such a problem wouldn't happen if the
exit file was stored in a regular filesystem because the pagecache
could be reclaimed in such case under memory pressure. The tmpfs page
can be swapped out, but that's not enough to release the memcg with
CONFIG_MEMCG_SWAP_ENABLED=y.

No amount of more aggressive kernel slab shrinking could have solved
this. Not even assigning slab kmem of dying cgroups to alive cgroup
would fully solve this. The only way to free the memory of a dying
cgroup when a struct page still references it, would be to loop over
all "struct page" in the kernel to find which one is associated with
the dying cgroup which is a O(N) operation (where N is the number of
pages and can reach billions). Linking all the tmpfs pages to the
memcg would cost less during memcg offlining, but it would waste lots
of memory and CPU globally. So this can't be optimized in the kernel.

A cronjob running this command can act as workaround and will allow
all slab cache to be released, not just the single tmpfs pages.

    rm -f /run/libpod/exits/*

This patch solved the memleak with a reproducer, booting with
cgroup.memory=nokmem and with selinux disabled. The reason memcg kmem
and selinux were disabled for testing of this fix, is because kmem
greatly decreases the kernel effectiveness in reusing partial slab
objects. cgroup.memory=nokmem is strongly recommended at least for
workstation usage. selinux needs to be further analyzed because it
causes further slab allocations.

The upstream podman commit used for testing is
1fe2965e4f672674f7b66648e9973a0ed5434bb4 (v1.4.4).

The upstream kernel commit used for testing is
f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).

Reported-by: Michele Baldessari <michele@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

<Applied with small tweaks to comments>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>

<Further tweaks to cherry pick into 1.4.2>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-05 14:40:28 -04:00
9dd9705c2f Merge pull request #3358 from mheon/use_disk_spec
Swap to using the on-disk spec for inspect mounts
2019-06-18 23:10:06 +02:00
3cabd81045 Merge pull request #3352 from mheon/inspect_config_to_libpod
Move the Config portion of Inspect into libpod
2019-06-18 20:34:30 +02:00
dc4d20b573 Swap to using the on-disk spec for inspect mounts
When available, using the on-disk spec will show full mount
options in use when the container is running, which can differ
from mount options provided in the original spec - on generating
the final spec, for example, we ensure that some form of root
propagation is set.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-18 09:38:01 -04:00
6ee0f3e99f Merge pull request #3257 from weirdwiz/load
Add warning while untagging an image podman-load
2019-06-17 22:14:26 +02:00
33b71944c0 Move the Config portion of Inspect into libpod
While we're at it, rewrite how we populate it. There were several
potential segfaults in the optional spec.Process block, and a few
fields not being populated correctly versus 'docker inspect'.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-06-17 15:36:55 -04:00
bce4a93575 Merge pull request #3297 from rhatdan/systemd
Accidently removed /run/lock from systemd mounts
2019-06-17 21:26:33 +02:00
29be1764b4 Merge pull request #3348 from vrothberg/kill-error
kill: print ID and state for non-running containers
2019-06-17 15:31:51 +02:00
04858a218f stop/kill: inproper state errors: s/in state/is in state/
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-06-17 14:31:55 +02:00
0f75410e1c kill: print ID and state for non-running containers
Extend kill's error message to include the container's ID and state.
This address cases where error messages caused by other containers
may confuse users.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-06-17 10:55:54 +02:00
7baa6b6266 Remove unnecessary var type to fix lint warning
Signed-off-by: Lawrence Chan <element103@gmail.com>
2019-06-14 17:42:05 -05:00
373048aaca Move installPrefix and etcDir into runtime.go
Signed-off-by: Lawrence Chan <element103@gmail.com>
2019-06-14 17:42:05 -05:00
6ea12e3028 Improve DESTDIR/PREFIX/ETCDIR handling
- PREFIX is now passed saved in the binary at build-time so that default
  paths match installation paths.
- ETCDIR is also overridable in a similar way.
- DESTDIR is now applied on top of PREFIX for install/uninstall steps.
  Previously, a DESTDIR=/foo PREFIX=/bar make would install into /bar,
  rather than /foo/bar.

Signed-off-by: Lawrence Chan <element103@gmail.com>
2019-06-14 17:42:05 -05:00
49e696642d Add --storage flag to 'podman rm' (local only)
This flag switches to removing containers directly from c/storage
and is mostly used to remove orphan containers.

It's a superior solution to our former one, which attempted
removal from storage under certain circumstances and could, under
some conditions, not trigger.

Also contains the beginning of support for storage in `ps` but
wiring that in is going to be a much bigger pain.

Fixes #3329.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-13 17:02:20 -04:00
2784cf3ca3 Merge pull request #3312 from mheon/podman_inspect_fixes_cont
Further fixes for podman inspect
2019-06-13 18:28:33 +02:00
031280cfe4 Merge pull request #3319 from mheon/purge_easyjson
Purge all use of easyjson and ffjson in libpod
2019-06-13 18:12:40 +02:00
7b7853d8c7 Purge all use of easyjson and ffjson in libpod
We're no longer using either of these JSON libraries, dropped
them in favor of jsoniter. We can't completely remove ffjson as
c/storage uses it and can't easily migrate, but we can make sure
that libpod itself isn't doing anything with them anymore.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-13 11:03:20 -04:00
bcd95f9ddc Split mount options in inspect further
Docker only uses Mode for :z/:Z, so move other options out into a
new field.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-13 09:34:56 -04:00
13e1afdb02 oci: allow to specify what runtimes support JSON
add a new configuration `runtime_supports_json` to list what OCI
runtimes support the --log-format=json option.  If the runtime is not
listed here, libpod will redirect stdout/stderr from the runtime
process.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-13 14:21:13 +02:00
6e4ce54d33 oci: use json formatted errors from the runtime
request json formatted error messages from the OCI runtime so that we
can nicely print them.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-13 10:27:06 +02:00
4e7e5f5cbd Make Inspect's mounts struct accurate to Docker
We were formerly dumping spec.Mount structs, with no care as to
whether it was user-generated or not - a relic of the very early
days when we didn't know whether a user made a mount or not.

Now that we do, match our output to Docker's dedicated mount
struct.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-12 17:14:21 -04:00
0084b04aca Provide OCI spec path in podman inspect output
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-12 15:53:41 -04:00
77d1cf0a32 Merge pull request #3305 from giuseppe/slirp-dns-first
rootless: use the slirp4netns builtin DNS first
2019-06-12 16:30:34 +02:00
3bbb692d80 If container is not in correct state podman exec should exit with 126
This way a tool can determine if the container exists or not, but is in the
wrong state.

Since 126 is documeted as:
**_126_** if the **_contained command_** cannot be invoked

It makes sense that the container would exit with this state.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-06-12 05:15:58 -04:00
0e34d9093e rootless: use the slirp4netns builtin DNS first
When using slirp4netns, be sure the built-in DNS server is the first
one to be used.

Closes: https://github.com/containers/libpod/issues/3277

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-12 10:29:57 +02:00
805d1d96fa Accidently removed /run/lock from systemd mounts
This is blowing up systemd containers on Ubuntu.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-06-11 07:54:55 -04:00
c93b8d6b02 Merge pull request #3240 from rhatdan/storageopts
When you change the storage driver we ignore the storage-options
2019-06-10 20:33:46 +02:00
39f5ea4c04 Merge pull request #3180 from mheon/inspect_volumes
Begin to break up pkg/inspect
2019-06-08 14:45:24 +02:00
629017bb19 When you change the storage driver we ignore the storage-options
The storage driver and the storage options in storage.conf should
match, but if you change the storage driver via the command line
then we need to nil out the default storage options from storage.conf.

If the user wants to change the storage driver and use storage options,
they need to specify them on the command line.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-06-08 06:20:31 -04:00
346128792c Merge pull request #2272 from adrianreber/migration
Add support to migrate containers
2019-06-07 14:33:20 +02:00
ef1a025356 Add warning while untagging an image podman-load
Signed-off-by: Divyansh Kamboj <kambojdivyansh2000@gmail.com>
2019-06-04 17:54:07 +05:30
bef83c42ea migration: add possibility to restore a container with a new name
The option to restore a container from an external checkpoint archive
(podman container restore -i /tmp/checkpoint.tar.gz) restores a
container with the same name and same ID as id had before checkpointing.

This commit adds the option '--name,-n' to 'podman container restore'.
With this option the restored container gets the name specified after
'--name,-n' and a new ID. This way it is possible to restore one
container multiple times.

If a container is restored with a new name Podman will not try to
request the same IP address for the container as it had during
checkpointing. This implicitly assumes that if a container is restored
from a checkpoint archive with a different name, that it will be
restored multiple times and restoring a container multiple times with
the same IP address will fail as each IP address can only be used once.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-04 14:02:51 +02:00
8fe22d48fb Inherit rootless init_path from system libpod.conf
Signed-off-by: Lawrence Chan <element103@gmail.com>
2019-06-03 18:44:36 -05:00
0028578b43 Added support to migrate containers
This commit adds an option to the checkpoint command to export a
checkpoint into a tar.gz file as well as importing a checkpoint tar.gz
file during restore. With all checkpoint artifacts in one file it is
possible to easily transfer a checkpoint and thus enabling container
migration in Podman. With the following steps it is possible to migrate
a running container from one system (source) to another (destination).

 Source system:
  * podman container checkpoint -l -e /tmp/checkpoint.tar.gz
  * scp /tmp/checkpoint.tar.gz destination:/tmp

 Destination system:
  * podman pull 'container-image-as-on-source-system'
  * podman container restore -i /tmp/checkpoint.tar.gz

The exported tar.gz file contains the checkpoint image as created by
CRIU and a few additional JSON files describing the state of the
checkpointed container.

Now the container is running on the destination system with the same
state just as during checkpointing. If the container is kept running
on the source system with the checkpoint flag '-R', the result will be
that the same container is running on two different hosts.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-03 22:05:12 +02:00
a05cfd24bb Added helper functions for container migration
This adds a couple of function in structure members needed in the next
commit to make container migration actually work. This just splits of
the function which are not modifying existing code.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-03 22:05:12 +02:00
1be345bd9d Begin to break up pkg/inspect
Let's put inspect structs where they're actually being used. We
originally made pkg/inspect to solve circular import issues.
There are no more circular import issues.

Image structs remain for now, I'm focusing on container inspect.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-03 15:54:53 -04:00
294448c2ea Merge pull request #2709 from haircommander/journald
Add libpod journald logging
2019-05-29 17:51:27 +02:00
aed91ce3bf Merge pull request #3188 from giuseppe/fix-join-existing-containers
rootless: new function to join existing conmon processes
2019-05-29 17:12:40 +02:00
bc7afd6d71 Merge pull request #3208 from vrothberg/fix-3207
runtime: unlock the alive lock only once
2019-05-28 17:19:56 +02:00
88429242dd Add --follow to journald ctr logging
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-05-28 11:14:08 -04:00
51bdf29f04 Address comments
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-05-28 11:10:57 -04:00