When the path does not exist, filepath.EvalSymlinks returns an
empty string - so we can't just ignore ENOENT, we have to discard
the result if an ENOENT is returned.
Should fix Jira issue RHEL-37948
Signed-off-by: Matt Heon <mheon@redhat.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
Signed-off-by: Matt Heon <mheon@redhat.com>
When Podman starts, it checks a number of critical runtime paths
against stored values in the database to make sure that existing
containers are not broken by a configuration change. We recently
made some changes to this logic to make our handling of the some
options more sane (StaticDir in particular was set based on other
passed options in a way that was not particularly sane) which has
made the logic more sensitive to paths with symlinks. As a simple
fix, handle symlinks properly in our DB vs runtime comparisons.
The BoltDB bits are uglier because very, very old Podman versions
sometimes did not stuff a proper value in the database and
instead used the empty string. SQLite is new enough that we don't
have to worry about such things.
Fixes#20872
Signed-off-by: Matt Heon <mheon@redhat.com>
Since we have sqlite there is no point in duplicating this acroos two db
backends. Just set earlier when we validate the networks anyway.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This has been broken since we added Volumes - so, Podman v0.12.1
(so, around 5 years). I have no evidence anyone is using it in
the wild. It doesn't really function as expected. And it's a lot
of extraneous code and tests for the database.
Rip it out entirely, we can re-add once BoltDB is gone if there
is a requirement to do so.
Signed-off-by: Matt Heon <mheon@redhat.com>
Loading container states speed things up when listing all containers but
it comes with a price tag for many other call paths. Hence, make
loading the state conditional to allow for keeping `podman ps` fast
without other commands regressing in performance.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Do not sync containers with the runtime and the database when listing
containers. It turns out to be extremely expensive and unnecessary.
The sync was needed since listing all containers from the database did
not populate their state. Doing that, however, is much faster since we
already have a connection to the database.
This change makes listing 200 containers 2 times faster than before.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This should simplify the db logic. We no longer need a extra db bucket
for the netns, it is still supported in read only mode for backwards
compat. The old version required us to always open the netns before we
could attach it to the container state struct which caused problem in
some cases were the netns was no longer valid.
Now we use the netns as string throughout the code, this allow us to
only open it when needed reducing possible errors.
[NO NEW TESTS NEEDED] Existing tests should cover it and it is only a
flake so hard to reproduce the error.
Fixes#16140
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We added the concept of image volumes in 2.2.0, to support
inspecting an image from within a container. However, this is a
strictly read-only mount, with no modification allowed.
By contrast, the new `image` volume driver creates a c/storage
container as its underlying storage, so we have a read/write
layer. This, in and of itself, is not especially interesting, but
what it will enable in the future is. If we add a new command to
allow these image volumes to be committed, we can now distribute
volumes - and changes to them - via a standard OCI image registry
(which is rather new and quite exciting).
Future work in this area:
- Add support for `podman volume push` (commit volume changes and
push resulting image to OCI registry).
- Add support for `podman volume pull` (currently, we require
that the image a volume is created from be already pulled; it
would be simpler if we had a dedicated command that did the
pull and made a volume from it)
- Add support for scratch images (make an empty image on demand
to use as the base of the volume)
- Add UOR support to `podman volume push` and
`podman volume pull` to enable both with non-image volume
drivers
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Podman adds an Error: to every error message. So starting an error
message with "error" ends up being reported to the user as
Error: error ...
This patch removes the stutter.
Also ioutil.ReadFile errors report the Path, so wrapping the err message
with the path causes a stutter.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.
[NO NEW TESTS NEEDED]
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
This commit addresses three intertwined bugs to fix an issue when using
Gitlab runner on Podman. The three bug fixes are not split into
separate commits as tests won't pass otherwise; avoidable noise when
bisecting future issues.
1) Podman conflated states: even when asking to wait for the `exited`
state, Podman returned as soon as a container transitioned to
`stopped`. The issues surfaced in Gitlab tests to fail [1] as
`conmon`'s buffers have not (yet) been emptied when attaching to a
container right after a wait. The race window was extremely narrow,
and I only managed to reproduce with the Gitlab runner [1] unit
tests.
2) The clearer separation between `exited` and `stopped` revealed a race
condition predating the changes. If a container is configured for
autoremoval (e.g., via `run --rm`), the "run" process competes with
the "cleanup" process running in the background. The window of the
race condition was sufficiently large that the "cleanup" process has
already removed the container and storage before the "run" process
could read the exit code and hence waited indefinitely.
Address the exit-code race condition by recording exit codes in the
main libpod database. Exit codes can now be read from a database.
When waiting for a container to exit, Podman first waits for the
container to transition to `exited` and will then query the database
for its exit code. Outdated exit codes are pruned during cleanup
(i.e., non-performance critical) and when refreshing the database
after a reboot. An exit code is considered outdated when it is older
than 5 minutes.
While the race condition predates this change, the waiting process
has apparently always been fast enough in catching the exit code due
to issue 1): `exited` and `stopped` were conflated. The waiting
process hence caught the exit code after the container transitioned
to `stopped` but before it `exited` and got removed.
3) With 1) and 2), Podman is now waiting for a container to properly
transition to the `exited` state. Some tests did not pass after 1)
and 2) which revealed the third bug: `conmon` was executed with its
working directory pointing to the OCI runtime bundle of the
container. The changed working directory broke resolving relative
paths in the "cleanup" process. The "cleanup" process error'ed
before actually cleaning up the container and waiting "main" process
ran indefinitely - or until hitting a timeout. Fix the issue by
executing `conmon` with the same working directory as Podman.
Note that fixing 3) *may* address a number of issues we have seen in the
past where for *some* reason cleanup processes did not fire.
[1] https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27119#note_970712864
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
[MH: Minor reword of commit message]
Signed-off-by: Matthew Heon <mheon@redhat.com>
add an option to configure the driver timeout when creating a volume.
The default is 5 seconds but this value is too small for some custom drivers.
Signed-off-by: cdoern <cdoern@redhat.com>
Since networks must always be read from the db bucket directly we should
unset them in config to avoid caller from accidentally using them.
I already tried this but it didn't work because the networks were unset
after the config was marshalled.
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Make sure we create new containers in the db with the correct structure.
Also remove some unneeded code for alias handling. We no longer need this
functions.
The specgen format has not been changed for now.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.
Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.
This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.
The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.
To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.
Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12 480821532 2.230 ns/op 0 B/op 0 allocs/op
BenchmarkParsePortMapping1-12 38972 30183 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMapping100-12 18752 60688 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMapping1k-12 3104 331719 ns/op 223840 B/op 3018 allocs/op
BenchmarkParsePortMapping10k-12 376 3122930 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMapping1m-12 3 390869926 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingReverse100-12 18940 63414 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMappingReverse1k-12 3015 362500 ns/op 223841 B/op 3018 allocs/op
BenchmarkParsePortMappingReverse10k-12 343 3318135 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMappingReverse1m-12 3 403392469 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingRange1-12 37635 28756 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange100-12 39604 28935 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1k-12 38384 29921 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange10k-12 29479 40381 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1m-12 927 1279369 ns/op 143022 B/op 164 allocs/op
PASS
ok github.com/containers/podman/v3/pkg/specgen/generate 25.492s
```
Benchmark convert old port format to new one:
```
go test -bench=. -benchmem ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12 663526126 1.663 ns/op 0 B/op 0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12 7858082 141.9 ns/op 72 B/op 2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12 2065347 571.0 ns/op 536 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12 138478 8641 ns/op 4216 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12 9414 120964 ns/op 41080 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12 781 1490526 ns/op 401528 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12 4 250579010 ns/op 40001656 B/op 4 allocs/op
PASS
ok github.com/containers/podman/v3/libpod 11.727s
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman has, for a long time, had an internal concept of
dependency management, used mainly to ensure that pod infra
containers are started before any other container in the pod. We
also have the ability to recursively start these dependencies,
which we use to ensure that `podman start` on a container in a
pod will not fail because the infra container is stopped. We have
not, however, exposed these via the command line until now.
Add a `--requires` flag to `podman run` and `podman create` to
allow users to manually specify dependency containers. These
containers must be running before the container will start. Also,
make recursive starting with `podman start` default so we can
start these containers and their dependencies easily.
Fixes#9250
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Currently we were overwrapping error returned from removal
of a non existing container.
$ podman rm bogus -f
Error: failed to evict container: "": failed to find container "bogus" in state: no container with name or ID bogus found: no such container
Removal of wraps gets us to.
./bin/podman rm bogus -f
Error: no container with name or ID "bogus" found: no such container
Finally also added quotes around container name to help make it standout
when you get an error, currently it gets lost in the error.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We missed bumping the go module, so let's do it now :)
* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
This implements support for mounting and unmounting volumes
backed by volume plugins. Support for actually retrieving
plugins requires a pull request to land in containers.conf and
then that to be vendored, and as such is not yet ready. Given
this, this code is only compile tested. However, the code for
everything past retrieving the plugin has been written - there is
support for creating, removing, mounting, and unmounting volumes,
which should allow full functionality once the c/common PR is
merged.
A major change is the signature of the MountPoint function for
volumes, which now, by necessity, returns an error. Named volumes
managed by a plugin do not have a mountpoint we control; instead,
it is managed entirely by the plugin. As such, we need to cache
the path in the DB, and calls to retrieve it now need to access
the DB (and may fail as such).
Notably absent is support for SELinux relabelling and chowning
these volumes. Given that we don't manage the mountpoint for
these volumes, I am extremely reluctant to try and modify it - we
could easily break the plugin trying to chown or relabel it.
Also, we had no less than *5* separate implementations of
inspecting a volume floating around in pkg/infra/abi and
pkg/api/handlers/libpod. And none of them used volume.Inspect(),
the only correct way of inspecting volumes. Remove them all and
consolidate to using the correct way. Compat API is likely still
doing things the wrong way, but that is an issue for another day.
Fixes#4304
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Convert the existing network aliases set/remove code to network
connect and disconnect. We can no longer modify aliases for an
existing network, but we can add and remove entire networks. As
part of this, we need to add a new function to retrieve current
aliases the container is connected to (we had a table for this
as of the first aliases PR, but it was not externally exposed).
At the same time, remove all deconflicting logic for aliases.
Docker does absolutely no checks of this nature, and allows two
containers to have the same aliases, aliases that conflict with
container names, etc - it's just left to DNS to return all the
IP addresses, and presumably we round-robin from there? Most
tests for the existing code had to be removed because of this.
Convert all uses of the old container config.Networks field,
which previously included all networks in the container, to use
the new DB table. This ensures we actually get an up-to-date list
of in-use networks. Also, add network aliases to the output of
`podman inspect`.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
As part of this, we need two new functions, for retrieving all
aliases for a network and removing all aliases for a network,
both required to test.
Also, rework handling for some things the tests discovered were
broken (notably conflicts between container name and existing
aliases).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
clean the paths before checking whether its value is different than
what is stored in the db.
Closes: https://github.com/containers/podman/issues/8160
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This adds the database backend for network aliases. Aliases are
additional names for a container that are used with the CNI
dnsname plugin - the container will be accessible by these names
in addition to its name. Aliases are allowed to change over time
as the container connects to and disconnects from networks.
Aliases are implemented as another bucket in the database to
register all aliases, plus two buckets for each container (one to
hold connected CNI networks, a second to hold its aliases). The
aliases are only unique per-network, to the global and
per-container aliases buckets have a sub-bucket for each CNI
network that has aliases, and the aliases are stored within that
sub-bucket. Aliases are formatted as alias (key) to container ID
(value) in both cases.
Three DB functions are defined for aliases: retrieving current
aliases for a given network, setting aliases for a given network,
and removing all aliases for a given network.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When an OCI runtime is given by full path, we need to ensure we
use the same runtime on subsequent use. Unfortunately, users are
often not considerate enough to use the same `--runtime` flag
every time they invoke runtime - and if the runtime was not in
containers.conf, that means we don't have it stored inn the
libpod Runtime.
Fortunately, since we have the full path, we can initialize the
OCI runtime for use at the point where we pull the container from
the database.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Check if there is an pod or container an return
the appropriate error message instead of blindly
return 'container exists' with `podman create` and
'pod exists' with `podman pod create`.
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules. While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.
Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`. The renaming of the imports
was done via `gomove` [1].
[1] https://github.com/KSubedi/gomove
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Theoretically these should never happen, but it never hurts to be
sure and check. Add a check to one, make the other one a
create-if-not-exist (it was just adding, not checking the
contents).
Signed-off-by: Matthew Heon <mheon@redhat.com>
As part of the rework of exec sessions, we need to address them
independently of containers. In the new API, we need to be able
to fetch them by their ID, regardless of what container they are
associated with. Unfortunately, our existing exec sessions are
tied to individual containers; there's no way to tell what
container a session belongs to and retrieve it without getting
every exec session for every container.
This adds a pointer to the container an exec session is
associated with to the database. The sessions themselves are
still stored in the container.
Exec-related APIs have been restructured to work with the new
database representation. The originally monolithic API has been
split into a number of smaller calls to allow more fine-grained
control of lifecycle. Support for legacy exec sessions has been
retained, but in a deprecated fashion; we should remove this in
a few releases.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).
Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add ability to evict a container when it becomes unusable. This may
happen when the host setup changes after a container creation, making it
impossible for that container to be used or removed.
Evicting a container is done using the `rm --force` command.
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
When volume options and the local volume driver are specified,
the volume is intended to be mounted using the 'mount' command.
Supported options will be used to volume the volume before the
first container using it starts, and unmount the volume after the
last container using it dies.
This should work for any local filesystem, though at present I've
only tested with tmpfs and btrfs.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We need to be able to track the number of times a volume has been
mounted for tmpfs/nfs/etc volumes. As such, we need a mutable
state for volumes. Add one, with the expected update/save methods
in both states.
There is backwards compat here, in that older volumes without a
state will still be accepted.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This will require a 'podman system renumber' after being applied
to get lock numbers for existing volumes.
Add the DB backend code for rewriting volume configs and use it
for updating lock numbers as part of 'system renumber'.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
the compilation demands of having libpod in main is a burden for the
remote client compilations. to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.
this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).
Signed-off-by: baude <bbaude@redhat.com>