4789 Commits

Author SHA1 Message Date
7dd1df4323 Retrieve exit codes for containers via events
As we previously removed our exit code retrieval code to stop a
memory leak, we need a new way of doing this. Fortunately, events
is able to do the job for us.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
ebacfbd091 podman: fix memleak caused by renaming and not deleting
the exit file

If the container exit code needs to be retained, it cannot be retained
in tmpfs, because libpod runs in a memcg itself so it can't leave
traces with a daemon-less design.

This wasn't a memleak detectable by kmemleak for example. The kernel
never lost track of the memory and there was no erroneous refcounting
either. The reference count dependencies however are not easy to track
because when a refcount is increased, there's no way to tell who's
still holding the reference. In this case it was a single page of
tmpfs pagecache holding a refcount that kept pinned a whole hierarchy
of dying memcg, slab kmem, cgroups, unreachable kernfs nodes and the
respective dentries and inodes. Such a problem wouldn't happen if the
exit file was stored in a regular filesystem because the pagecache
could be reclaimed in such case under memory pressure. The tmpfs page
can be swapped out, but that's not enough to release the memcg with
CONFIG_MEMCG_SWAP_ENABLED=y.

No amount of more aggressive kernel slab shrinking could have solved
this. Not even assigning slab kmem of dying cgroups to alive cgroup
would fully solve this. The only way to free the memory of a dying
cgroup when a struct page still references it, would be to loop over
all "struct page" in the kernel to find which one is associated with
the dying cgroup, which is an O(N) operation (where N is the number of
pages and can reach billions). Linking all the tmpfs pages to the
memcg would cost less during memcg offlining, but it would waste lots
of memory and CPU globally. So this can't be optimized in the kernel.

A cronjob running this command can act as a workaround and will allow
all slab cache to be released, not just the single tmpfs pages.

    rm -f /run/libpod/exits/*

This patch solved the memleak with a reproducer, booting with
cgroup.memory=nokmem and with selinux disabled. Memcg kmem and selinux
were disabled for testing this fix because kmem greatly decreases the
kernel's effectiveness in reusing partial slab objects.
cgroup.memory=nokmem is strongly recommended at least for
workstation usage. selinux needs to be further analyzed because it
causes further slab allocations.

The upstream podman commit used for testing is
1fe2965e4f672674f7b66648e9973a0ed5434bb4 (v1.4.4).

The upstream kernel commit used for testing is
f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).

Reported-by: Michele Baldessari <michele@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

<Applied with small tweaks to comments>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
a622f8d345 Merge pull request #3682 from cevich/fix_release_rerun
Cirrus: Fix re-run of release task into no-op.
2019-07-31 20:10:03 +02:00
3e3afb942a Cirrus: Fix release dependencies
The release-task ***must*** always execute last, in order to guarantee a
consistent cache of release archives from dependent tasks.  It
accomplishes this by verifying its task number matches one less than
the total number of tasks.  Previous to this commit, a YAML anchor/alias
was used to avoid duplication of the dependency list between 'success'
and 'release'.

However, it's been observed that this opens the possibility for
'release' and 'success' tasks to race when running on a PR.  Because
YAML anchor/aliases cannot be used to modify lists, duplication is
required to make 'release' actually depend upon 'success'.

This duplication will introduce an additional maintenance burden,
though when adding a new task, it's already very easy to forget to
update the 'depends_on' list.  Assist both cases by the addition of
unit tests to verify ``.cirrus.yml`` dependency contents and structure.
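
The kind of dependency check described above could look roughly like the
following sketch. It is not the unit test added by this commit: the function
name `check_dependencies`, the task names, and the plain-dict input (task name
mapped to its 'depends_on' list) are assumptions made for illustration.

```python
def check_dependencies(tasks):
    """Return a list of problems found in a mapping of task name -> depends_on list.

    Hypothetical helper: the real tests parse .cirrus.yml, which is not shown here.
    """
    problems = []
    success = set(tasks.get("success", []))
    release = set(tasks.get("release", []))

    # 'release' must wait for 'success' plus everything 'success' waits for.
    missing = (success | {"success"}) - release
    if missing:
        problems.append("release is missing: " + ", ".join(sorted(missing)))

    # Every other task must be listed by 'success', so new tasks are not forgotten.
    others = set(tasks) - {"success", "release"}
    forgotten = others - success
    if forgotten:
        problems.append("success is missing: " + ", ".join(sorted(forgotten)))

    return problems
```

An empty result means both lists are complete, so a forgotten entry shows up
as a test failure rather than as a race on a PR.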

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-31 11:49:53 -04:00
cb2ea1a27b Cirrus: Fix re-run of release task into no-op.
This task depends upon other tasks caching their binaries.  If for
whatever reason the `release` task is re-run and/or is out-of-order
with its dependents, the state of cache will be undefined. Previously
this would result in an error, and failing of the release task.
This commit alters this behavior to issue a warning instead.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-31 09:42:52 -04:00
680a383874 Merge pull request #3672 from petejohanson/32bit-build-fixes
Build fix for 32-bit systems.
2019-07-30 22:07:32 +02:00
e84ed3c1bc Merge pull request #3665 from QiWang19/env
Set -env variables as appropriate
2019-07-30 21:20:34 +02:00
32aaf8da56 Build fix for 32-bit systems.
* Fixes #3664.

Signed-off-by: Pete Johanson <peter@peterjohanson.com>
2019-07-30 12:25:36 -04:00
2da86bdc3a Set -env variables as appropriate
close #3648

podman create and podman run do not set an --env variable if it is not present in the environment with a value.
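
As a rough illustration of this behavior (not podman's actual code), the
sketch below resolves a list of --env style arguments. The function name
`resolve_env` is hypothetical, and treating "not present with a value" as
"unset on the host" is an assumption.

```python
import os


def resolve_env(args, host_env=None):
    """Resolve --env style arguments into a name -> value mapping.

    "NAME=value" is taken as given; a bare "NAME" is copied from the host
    environment, and skipped entirely when the host does not define it.
    """
    host_env = os.environ if host_env is None else host_env
    result = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if sep:
            # Explicit value on the command line.
            result[name] = value
        elif name in host_env:
            # Bare name: inherit the host's value.
            result[name] = host_env[name]
        # Otherwise the variable is not set at all (assumed behavior).
    return result
```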

Signed-off-by: Qi Wang <qiwan@redhat.com>
2019-07-30 12:02:18 -04:00
1a008958d4 Merge pull request #3661 from openSUSE/nixos-friendly-config
Update libpod.conf to be more friendly to NixOS
2019-07-30 16:33:48 +02:00
4196a59452 Merge pull request #3668 from TomSweeneyRedHat/dev/tsweeney/adderror
Touch up input argument error on create
2019-07-30 15:59:59 +02:00
0b14e53590 Touch up input argument error on create
Add an error when there are not enough input arguments for remote
create.  Addresses comments in #3656

Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
2019-07-30 09:05:48 -04:00
52ae51c79f Update libpod.conf to be NixOS friendly
NixOS links the current system state to `/run/current-system`, so we
have to add these paths to the configuration files as well to work out
of the box.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-07-30 12:59:11 +02:00
040355d450 Merge pull request #3667 from major/test-with-username-has-dash
Allow info test to work with usernames w/dash
2019-07-30 02:19:19 +02:00
9822f54ac3 Allow info test to work with usernames w/dash
The regular expression used in the `info` test does not allow for
usernames that have a dash, such as `test-user`. This patch adjusts
the regex to allow for a dash.

Fixes #3666.
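
The effect of such an adjustment can be sketched as follows. Both patterns
are hypothetical stand-ins, since the actual regular expression in the `info`
test is not shown in this log.

```python
import re

# Hypothetical patterns for illustration only.
WITHOUT_DASH = re.compile(r"^[a-z_][a-z0-9_]*$")
WITH_DASH = re.compile(r"^[a-z_][a-z0-9_-]*$")  # '-' added to the character class


def matches(pattern, username):
    """Return True when the whole username matches the pattern."""
    return pattern.match(username) is not None
```

With these patterns, `test-user` is rejected by the first and accepted by the
second, while names without a dash match both.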

Signed-off-by: Major Hayden <major@redhat.com>
2019-07-29 16:08:51 -05:00
7d635ac1c5 Merge pull request #3656 from jwhonce/wip/env
Fix commit --changes env=X=Y
2019-07-29 21:57:08 +02:00
71bb2889ff Merge pull request #3660 from LaszloGombos/master
Fix the syntax in the podman export documentation example
2019-07-29 21:43:56 +02:00
5343d79e6c Merge pull request #3663 from adrianreber/random-test-ip
Move random IP code for tests from checkpoint to common
2019-07-29 21:31:13 +02:00
c3c45f3ba5 Merge pull request #3646 from vrothberg/hi-scott
fix `podman -v` regression
2019-07-29 19:54:49 +02:00
e46d9b2e45 Fix the syntax in the podman export documentation example
Signed-off-by: Laszlo Gombos <laszlo.gombos@gmail.com>
2019-07-29 11:14:31 -04:00
6665269ab8 Merge pull request #3233 from wking/fatal-requested-hook-directory-does-not-exist
libpod/container_internal: Make all errors loading explicitly configured hook dirs fatal
2019-07-29 16:39:08 +02:00
2ca7861b8e Merge pull request #3650 from cevich/fix_clone_depth
Cirrus: Remove fixed clone depth
2019-07-29 16:24:58 +02:00
6065070bae fix podman -v regression
Re-add the short flag for --version and add e2e tests to avoid regressing
in the future.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-07-29 14:47:21 +02:00
90ffba92e9 Move random IP code for tests from checkpoint to common
The function to generate random IP addresses during ginkgo tests in
the checkpoint test code is moved to common and all tests using
hardcoded IP addresses have been changed to use random IP addresses to
reduce test errors when running the tests in parallel.
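
A minimal sketch of the idea, not the helper moved by this commit (which
lives in the Go test code): the function name `random_ip` and the default
subnet `10.88.0.0/16` are assumptions.

```python
import ipaddress
import random


def random_ip(subnet="10.88.0.0/16", rng=random):
    """Pick a random host address from the subnet (assumed default, see above)."""
    net = ipaddress.ip_network(subnet)
    # Skip the network address, the first host (often the gateway),
    # and the broadcast address at the end of the range.
    offset = rng.randint(2, net.num_addresses - 2)
    return str(net[offset])
```

Drawing from a large range makes it unlikely, though not impossible, that two
tests running in parallel pick the same address.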

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-07-29 14:24:08 +02:00
2c98bd5398 Merge pull request #3654 from TomSweeneyRedHat/dev/tsweeney/commandpause
Update pause/unpause video links and demo
2019-07-28 17:06:17 +02:00
40bf0649af Fix commit --changes env=X=Y
Signed-off-by: Jhon Honce <jhonce@redhat.com>
2019-07-26 16:04:17 -07:00
1bedf04416 Update pause/unpause video links and demo
Update the links for the asciinema casts and the demo for the
`podman pause` and `podman unpause` commands on the commands.md
page.

Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
2019-07-26 17:04:23 -04:00
6146a3ad0f Cirrus: Remove fixed clone depth
It's been observed on several occasions that some tests fail in git clones
with a "cannot find ref" type error, especially in the depth=1 cases.
Since there's really only one place where limiting the depth makes sense
(build-each-commit), simply remove all the other limits.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-26 11:50:44 -04:00
0c4dfcfe57 Merge pull request #3639 from giuseppe/user-ns-container
podman: support --userns=ns|container
2019-07-26 15:06:06 +02:00
b212daa92f Merge pull request #3632 from cevich/small_cirrus_fixes
Small cirrus and image-build fixes
2019-07-26 14:55:12 +02:00
eca157fb54 Merge pull request #3627 from ashley-cui/rmdocs
Documentation & make tar.gz for remote
2019-07-26 11:37:10 +02:00
1910d68e4d Merge pull request #3645 from mheon/systemd_ubuntu
Use systemd cgroups for Ubuntu
2019-07-26 10:41:04 +02:00
4674d00f46 Merge pull request #3580 from samc24/hook
Improved hooks monitoring
2019-07-26 10:25:05 +02:00
1d72f651e4 podman: support --userns=ns|container
Allow joining the user namespace of another container.

Closes: https://github.com/containers/libpod/issues/3629

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-25 23:04:55 +02:00
ba5741e398 pods: do not join a userns if there is not any
do not attempt to join the user namespace if the pod is running in the
host user namespace.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-25 23:04:54 +02:00
ce0132a45e Documentation & build automation for remote darwin
Created shell script to automatically compile remote-only docs & rename
Added make brew-pkg to automatically package files needed for homebrew
Add missing docs

Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
2019-07-25 15:36:39 -04:00
ac5ad9acbf Cirrus: Bypass release during image-building
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 15:20:57 -04:00
0934949220 Use systemd cgroups for Ubuntu
It seems like our VM images now support systemd CGroups with the
Ubuntu LTS images. No reason to keep testing CGroupfs as such,
systemd is much less racy (and CGroupfs on systemd-enabled
systems can be iffy).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-25 14:57:58 -04:00
07b1e331c2 Cirrus: Ubuntu: Set + Test for $RUNC_BINARY
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 14:02:12 -04:00
f55288c96f Cirrus: Simplify evil-unit check in image
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
ceb3d76298 Cirrus: Silence systemd-banish noise
It's somewhat hard to predict which units are certainly present
for any given base-image.  Therefore, at image-build time, it's
distracting and unhelpful to see all the errors about units that
don't exist, on every platform.  Simply ignore them and rely on
the `check_image.sh` test to confirm none are enabled.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
e3082762fe Cirrus: Fix image build metadata update
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
6942d3275d Cirrus: Fix missing -n on CentOS
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
23722e644e Cirrus: Remove disused COMMIT variables
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
dff82d940e Merge pull request #3643 from openSUSE/history-panic
Fix possible runtime panic if image history len is zero
2019-07-25 19:26:47 +02:00
5763618ce5 Merge pull request #3631 from TristanCacqueray/master
Document SELinux label requirements for the rootfs argument
2019-07-25 18:03:35 +02:00
1b95ed9a71 Merge pull request #3622 from QiWang19/checkurl
fix import not ignoring url path
2019-07-25 17:49:08 +02:00
d6ea4b4139 Improved hooks monitoring
...to work for specific edge cases with a simpler solution.
Re-reads hooks directories after any changes are detected by the watchers.
Added monitoring test for adding a different invalid hook to primary directory.
Some issues with prior code:
- ReadDir would stop when it encounters an invalid hook, rather than registering an error but continuing to read the valid hook.
- Wouldn’t account for Rename and Chmod events.
- After doing a mv of the hooks file instead of rm, it would still think the hooks file is in the directory, but it has been moved to another location.
- If a hook file was renamed, it would register the renamed file as a separate hook and not delete the original, so it would then execute the hook twice - once for the renamed file, and once for the original name which it did not delete.
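
The re-read approach can be sketched as below. This is an illustration, not
the Go code from this commit: the function name `rebuild_hooks` is
hypothetical, and the directory is modeled as a plain mapping of file name to
raw JSON text.

```python
import json


def rebuild_hooks(snapshot):
    """Build the hook table from a fresh directory snapshot.

    `snapshot` maps file name -> raw JSON text. Invalid files are recorded
    in `errors` but do not stop the scan of the remaining files.
    """
    hooks, errors = {}, {}
    for name in sorted(snapshot):
        if not name.endswith(".json"):
            continue
        try:
            hooks[name] = json.loads(snapshot[name])
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            errors[name] = str(err)
    return hooks, errors
```

Because the table is rebuilt from scratch after every watcher event, a file
that was renamed or moved away simply no longer appears, so no stale entry is
left behind to run twice.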

Signed-off-by: samc24 <sam.chaturvedi24@gmail.com>
2019-07-25 09:52:45 -04:00
7630f1b52e Fix possible runtime panic if image history len is zero
We now return an empty string for the `Comment` field if an OCI v1 image
contains no history.
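
The guard amounts to checking the length before indexing. A minimal sketch,
assuming history entries are dicts with a `comment` key and that the first
entry is the one consulted (both are assumptions, and `image_comment` is a
hypothetical name):

```python
def image_comment(history):
    """Return the comment of the first history entry, or "" when there is none."""
    if not history:
        # Indexing an empty history is what would otherwise fail at runtime.
        return ""
    return history[0].get("comment", "")
```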

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-07-25 12:45:08 +02:00
7c9095ea1d Merge pull request #3641 from mheon/no_fuzzy_volume_lookup
When retrieving volumes, only use exact names
2019-07-25 11:24:29 +02:00