52 Commits

Author SHA1 Message Date
e19e0de5fa Introduce graph-based pod container removal
Originally, during pod removal, we locked every container in the
pod at once, did a number of validity checks to ensure everything
was safe, and then removed all the containers in the pod.

A deadlock was recently discovered with this approach. In brief,
we cannot lock the entire pod (or much more than a single
container at a time) without causing a deadlock. As such, we
converted to an approach where we just looped over each container
in the pod, removing them individually. Unfortunately, this
removed a lot of the validity checking of the earlier approach,
allowing for a lot of unintended bad things. Infra containers
could be removed while containers in the pod still depended on
them, for example.

There's no easy way to do validity checks while in a simple loop,
so I implemented a version of our graph-traversal logic that
currently handles pod start. This version acts in the reverse
order of startup: startup starts from containers which depend on
nothing and moves outwards, while removal acts on containers which
have nothing depend on them and moves inwards. By doing graph
traversal, we can guarantee that nothing is removed while
something that depends on it still exists - so the infra
container should be the last thing in a pod that is removed, for
example.

In the (unlikely) case that a graph of the pod's containers
cannot be built (most likely impossible without database editing)
the old method of pod removal has been retained to ensure that
even misbehaving pods can be forcibly evicted from the state.

I'm fairly confident that this resolves the problem, but there
are a lot of assumptions around dependency structure built into
the original pod removal code and I am not 100% sure I have
captured all of them.

Fixes #15526

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-14 13:44:48 -04:00
75740be395 all: stop using deprecated GenerateNonCryptoID
In view of https://github.com/containers/storage/pull/1337, do this:

	for f in $(git grep -l stringid.GenerateNonCryptoID | grep -v '^vendor/'); do
		sed -i 's/stringid.GenerateNonCryptoID/stringid.GenerateRandomID/g' $f;
	done

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-09-13 16:26:26 -07:00
0f73935563 Add support for containers.conf volume timeouts
Also, do a general cleanup of all the timeout code. Changes
include:
- Convert from int to *uint where possible. Timeouts cannot be
  negative, hence the uint change; and a timeout of 0 is valid,
  so we need a new way to detect that the user set a timeout
  (hence, pointer).
- Change name in the database to avoid conflicts between new data
  type and old one. This will cause timeouts set with 4.2.0 to be
  lost, but considering nobody is using the feature at present
  (and the lack of validation means we could have invalid,
  negative timeouts in the DB) this feels safe.
- Ensure volume plugin timeouts can only be used with volumes
  created using a plugin. Timeouts on the local driver are
  nonsensical.
- Remove the existing test, as it did not use a volume plugin.
  Write a new test that does.

The actual plumbing of the containers.conf timeout in is one line
in volume_api.go; the remainder are the above-described cleanups.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-08-23 15:42:00 -04:00
bb8ff86bf2 Use SafeChown rather then chown for volumes on NFS
NFS Servers will thrown ENOTSUPP error if you attempt to
chown a directory to the same UID and GID as the directory
already has. If volumes are stored on NFS directories this
throws an ugly error and then works on the next try.

Bottom line don't chown directories that already have the correct
UID and GID.

Fixes: https://github.com/containers/podman/issues/14766

[NO NEW TESTS NEEDED] Difficult to setup an NFS Server in testing.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-07-12 15:41:13 -04:00
377057b400 [CI:DOCS] Improve language. Fix spelling and typos.
* Correct spelling and typos.

* Improve language.

Co-authored-by: Ed Santiago <santiago@redhat.com>
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2022-07-11 21:59:32 +02:00
773eead54e Merge pull request #14789 from saschagrunert/libpod-errors
libpod/runtime: switch to golang native error wrapping
2022-07-05 07:23:22 +00:00
597de7a083 libpod/runtime: switch to golang native error wrapping
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.

[NO NEW TESTS NEEDED]

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-07-04 15:39:00 +02:00
7131c84723 fix build
PR containers/podman/pull/14449 had an outdated base.  Merging it broke
builds.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-07-01 14:20:31 +02:00
96e72d90b8 Merge pull request #14449 from cdoern/podVolumes
podman volume create --opt=o=timeout...
2022-07-01 08:46:11 +00:00
8267cd3c51 Merge pull request #14734 from giuseppe/copyup-switch-order
volume: add two new options copy and nocopy
2022-06-28 13:57:16 +00:00
aada13f244 volume: new options [no]copy
add two new options to the volume create command: copy and nocopy.

When nocopy is specified, the files from the container image are not
copied up to the volume.

Closes: https://github.com/containers/podman/issues/14722

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-06-27 20:22:20 +02:00
2fab7d169b add podman volume reload to sync volume plugins
Libpod requires that all volumes are stored in the libpod db. Because
volume plugins can be created outside of podman, it will not show all
available plugins. This podman volume reload command allows users to
sync the libpod db with their external volume plugins. All new volumes
from the plugin are also created in the libpod db and when a volume from
the db no longer exists it will be removed if possible.

There are some problems:
- naming conflicts, in this case we only use the first volume we found.
  This is not deterministic.
- race conditions, we have no control over the volume plugins. It is
  possible that the volumes changed while we run this command.

Fixes #14207

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-06-23 18:36:30 +02:00
7b3e43c1f6 podman volume create --opt=o=timeout...
add an option to configure the driver timeout when creating a volume.
The default is 5 seconds but this value is too small for some custom drivers.

Signed-off-by: cdoern <cdoern@redhat.com>
2022-06-09 16:44:21 -04:00
91ead15283 volume: add new option -o o=noquota
add a new option to completely disable xfs quota usage for a volume.

xfs quota set on a volume, even just for tracking disk usage, can
cause weird errors if the volume is later re-used by a container with
a different quota projid.  More specifically, link(2) and rename(2)
might fail with EXDEV if the source file has a projid that is
different from the parent directory.

To prevent such kind of issues, the volume should be created
beforehand with `podman volume create -o o=noquota $ID`

Closes: https://github.com/containers/podman/issues/14049

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-04-28 13:29:01 +02:00
c7b16645af enable unparam linter
The unparam linter is useful to detect unused function parameters and
return values.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-25 13:23:20 +02:00
ea08765f40 go fmt: use go 1.18 conditional-build syntax
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-03-18 09:11:53 +01:00
4a60319ecb Remove the runtime lock
This primarily served to protect us against shutting down the
Libpod runtime while operations (like creating a container) were
happening. However, it was very inconsistently implemented (a lot
of our longer-lived functions, like pulling images, just didn't
implement it at all...) and I'm not sure how much we really care
about this very-specific error case?

Removing it also removes a lot of potential deadlocks, which is
nice.

[NO NEW TESTS NEEDED]

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-02-22 11:05:26 -05:00
bd09b7aa79 bump go module to version 4
Automated for .go files via gomove [1]:
`gomove github.com/containers/podman/v3 github.com/containers/podman/v4`

Remaining files via vgrep [2]:
`vgrep github.com/containers/podman/v3`

[1] https://github.com/KSubedi/gomove
[2] https://github.com/vrothberg/vgrep

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2022-01-18 12:47:07 +01:00
2e0d3e9ea4 Support all volume mounts for rootless containers
Fix handling of "bind" and "tmpfs" olumes to actually work.
Allow bind, tmpfs local volumes to work in rootless mode.

Also removed the string "error" from all error messages that begine with it.
All Podman commands are printed with Error:, so this causes an ugly
stutter.

Fixes: https://github.com/containers/podman/issues/12013

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-01-04 13:48:03 -05:00
7580c22734 Remove a volume with --force if container is running
Currently we are not passing the force flag down to the removal of
the running container. If the container is running, and we set
--force when removing the volume, the container should be stopped.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-10-11 15:02:04 -04:00
21c9dc3c40 Add --time out for podman * rm -f commands
Add --time flag to podman container rm
Add --time flag to podman pod rm
Add --time flag to podman volume rm
Add --time flag to podman network rm

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-10-04 07:07:56 -04:00
1c4e6d8624 standardize logrus messages to upper case
Remove ERROR: Error stutter from logrus messages also.

[ NO TESTS NEEDED] This is just code cleanup.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-09-22 15:29:34 -04:00
592fae4225 Volumes: Only remove from DB if plugin removal succeeds
Originally, Podman would unconditionally remove volumes from the
DB, even if they failed to be removed from the volume plugin;
this was a safety measure to ensure that `volume rm` can always
remove a volume from the database, even if the plugin is
misbehaving.

However, this is a significant deivation from Docker, which
refuses to remove if the plugin errors. These errors can be
legitimate configuration issues which the user should address
before the volume is removed, so Podman should also use this
behaviour.

Fixes #11214

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-08-18 14:19:11 -04:00
c0952c7334 Support size and inode options on builtin volumes
[NO TESTS NEEDED] Since it is difficult to setup xfs quota

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1982164

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-08-02 10:32:45 -04:00
5dded6fae7 bump go module to v3
We missed bumping the go module, so let's do it now :)

* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-02-22 09:03:51 +01:00
b53cb57680 Initial implementation of volume plugins
This implements support for mounting and unmounting volumes
backed by volume plugins. Support for actually retrieving
plugins requires a pull request to land in containers.conf and
then that to be vendored, and as such is not yet ready. Given
this, this code is only compile tested. However, the code for
everything past retrieving the plugin has been written - there is
support for creating, removing, mounting, and unmounting volumes,
which should allow full functionality once the c/common PR is
merged.

A major change is the signature of the MountPoint function for
volumes, which now, by necessity, returns an error. Named volumes
managed by a plugin do not have a mountpoint we control; instead,
it is managed entirely by the plugin. As such, we need to cache
the path in the DB, and calls to retrieve it now need to access
the DB (and may fail as such).

Notably absent is support for SELinux relabelling and chowning
these volumes. Given that we don't manage the mountpoint for
these volumes, I am extremely reluctant to try and modify it - we
could easily break the plugin trying to chown or relabel it.

Also, we had no less than *5* separate implementations of
inspecting a volume floating around in pkg/infra/abi and
pkg/api/handlers/libpod. And none of them used volume.Inspect(),
the only correct way of inspecting volumes. Remove them all and
consolidate to using the correct way. Compat API is likely still
doing things the wrong way, but that is an issue for another day.

Fixes #4304

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2021-01-14 15:35:33 -05:00
28138dafcc Fix missing options in volumes display while setting uid and gid
```
$ podman volume create testvol --opt o=uid=1001,gid=1001
$ ./bin/podman volume create testvol2 --opt o=uid=1001,gid=1001
$ podman volume inspect testvol
        "Options": {},
$ podman volume inspect testvol2
        "Options": {
            "GID": "1001",
            "UID": "1001",
            "o": "uid=1001,gid=1001"
        },
```

Signed-off-by: zhangguanzhang <zhangguanzhang@qq.com>
2020-12-23 09:13:20 +08:00
ca7dcff5a8 fix: allow volume creation when the _data directory already exists
This restores pre f7e72bc86aff2ff986290f190309deceb7f22099 behavior

Signed-off-by: Yan Minari <yangm97@gmail.com>
2020-11-05 17:09:12 -03:00
a5e37ad280 Switch all references to github.com/containers/libpod -> podman
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-07-28 08:23:45 -04:00
8489dc4345 move go module to v2
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules.  While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.

Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`.  The renaming of the imports
was done via `gomove` [1].

[1] https://github.com/KSubedi/gomove

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-07-06 15:50:12 +02:00
200cfa41a4 Turn on More linters
- misspell
    - prealloc
    - unparam
    - nakedret

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-06-15 07:05:56 -04:00
4352d58549 Add support for containers.conf
vendor in c/common config pkg for containers.conf

Signed-off-by: Qi Wang qiwan@redhat.com
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-27 14:36:03 -04:00
4004f646cd Add basic deadlock detection for container start/remove
We can easily tell if we're going to deadlock by comparing lock
IDs before actually taking the lock. Add a few checks for this in
common places where deadlocks might occur.

This does not yet cover pod operations, where detection is more
difficult (and costly) due to the number of locks being involved
being higher than 2.

Also, add some error wrapping on the Podman side, so we can tell
people to use `system renumber` when it occurs.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-24 09:29:34 -05:00
b4fa6f4f08 Fix SELinux labels of volumes
If we attempt to label a volume and the file system
does not support labeling, then just warn.  SELinux
may or may not work, on the volume.

There is no way to setup a private label on a newly
created volume without using the container mountlabel.

If we don't have a mount label at the time of creation of
the volume, the only option we have is to create a shared
label.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-02-13 21:42:57 -05:00
67165b7675 make lint: enable gocritic
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-13 14:27:02 +01:00
84eea2b2c0 Return a better error for volume name conflicts
When you try and create a new volume with the name of a volume
that already exists, you presently get a thoroughly unhelpful
error from `mkdir` as the volume attempts to create the
directory it will be mounted at. An EEXIST out of mkdir is not
particularly helpful to Podman users - it doesn't explain that
the name is already taken by another volume.

The solution here is potentially racy as the runtime is not
locked, so someone else could take the name while we're still
getting things set up, but that's a narrow timing window, and we
will still return an error - just an error that's not as good as
this one.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-23 16:34:32 -04:00
0f6b0e8c9c Ensure volumes can be removed when they fail to unmount
Also, ensure that we don't try to mount them without root - it
appears that it can somehow not error and report that mount was
successful when it clearly did not succeed, which can induce this
case.

We reuse the `--force` flag to indicate that a volume should be
removed even after unmount errors. It seems fairly natural to
expect that --force will remove a volume that is otherwise
presenting problems.

Finally, ignore EINVAL on unmount - if the mount point no longer
exists our job is done.

Fixes: #4247
Fixes: #4248

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-14 10:32:15 -04:00
de9a394fcf Correctly report errors on unmounting SHM
When we fail to remove a container's SHM, that's an error, and we
need to report it as such. This may be part of our lingering
storage woes.

Also, remove MNT_DETACH. It may be another cause of the storage
removal failures.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-05 17:12:27 -04:00
a760e325f3 Add ability for volumes with options to mount/umount
When volume options and the local volume driver are specified,
the volume is intended to be mounted using the 'mount' command.
Supported options will be used to volume the volume before the
first container using it starts, and unmount the volume after the
last container using it dies.

This should work for any local filesystem, though at present I've
only tested with tmpfs and btrfs.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-05 17:12:27 -04:00
e563f41116 Re-add locks to volumes.
This will require a 'podman system renumber' after being applied
to get lock numbers for existing volumes.

Add the DB backend code for rewriting volume configs and use it
for updating lock numbers as part of 'system renumber'.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 11:35:00 -04:00
8b72a72ca2 Implement backend for 'volume inspect'
Begin to separate the internal structures and frontend for
inspect on volumes. We can't rely on keeping internal data
structures for external presentation - separating presentation
and internal data format is good practice.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-02 15:08:30 -04:00
dd81a44ccf remove libpod from main
the compilation demands of having libpod in main is a burden for the
remote client compilations.  to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.

this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).

Signed-off-by: baude <bbaude@redhat.com>
2019-06-25 13:51:24 -05:00
5cbb3e7e9d Use standard remove functions for removing pod ctrs
Instead of rewriting the logic, reuse the standard logic we use
for removing containers, which is much better tested.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-10 14:14:29 -04:00
02c6110093 Fix E2E tests
The Commit test is blatantly wrong and testing buggy behavior. We
should be commiting the destination, if anything - and more
likely nothing at all.

When force-removing volumes, don't remove the volumes of
containers we need to remove. This can lead to a chicken and the
egg problem where the container removes the volume before we can.
When we re-add volume locks this could lead to deadlocks. I don't
really want to deal with this, and this doesn't seem a
particularly harmful quirk, so we'll let this slide until we get
a bug report.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:27:20 -04:00
3e066e2920 Volume force-remove now removed dependent containers
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:26:29 -04:00
d245c6df29 Switch Libpod over to new explicit named volumes
This swaps the previous handling (parse all volume mounts on the
container and look for ones that might refer to named volumes)
for the new, explicit named volume lists stored per-container.

It also deprecates force-removing volumes that are in use. I
don't know how we want to handle this yet, but leaving containers
that depend on a volume that no longer exists is definitely not
correct.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:26:29 -04:00
f7e72bc86a volumes: push the chown logic to runtime_volume_linux.go
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-29 14:04:44 +01:00
0d0ad59641 Default to SELinux private label for play kube mounts
Before, there were SELinux denials when a volume was bind-mounted by podman play kube.
Partially fix this by setting the default private label for mounts created by play kube (with DirectoryOrCreate)
For volumes mounted as Directory, the user will have to set their own SELinux permissions on the mount point

also remove left over debugging print statement

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-03-28 09:54:31 -04:00
ca1e76ff63 Add event logging to libpod, even display to podman
In lipod, we now log major events that occurr.  These events
can be displayed using the `podman events` command. Each
event contains:

* Type (container, image, volume, pod...)
* Status (create, rm, stop, kill, ....)
* Timestamp in RFC3339Nano format
* Name (if applicable)
* Image (if applicable)

The format of the event and the varlink endpoint are to not
be considered stable until cockpit has done its enablement.

Signed-off-by: baude <bbaude@redhat.com>
2019-03-11 15:08:59 -05:00
ca8ae877c1 Remove locks from volumes
I was looking into why we have locks in volumes, and I'm fairly
convinced they're unnecessary.

We don't have a state whose accesses we need to guard with locks
and syncs. The only real purpose for the lock was to prevent
concurrent removal of the same volume.

Looking at the code, concurrent removal ought to be fine with a
bit of reordering - one or the other might fail, but we will
successfully evict the volume from the state.

Also, remove the 'prune' bool from RemoveVolume. None of our
other API functions accept it, and it only served to toggle off
more verbose error messages.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-02-21 10:51:42 -05:00