- fix issues found by recvcheck
- skip k8s files from recvcheck
- drop the gomnd and execinquery linters, which have been removed upstream
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
If volume ls was called while a volume was being removed at just the
right time, it could fail with "no such volume", as we did not ignore
that error during listing. When something we are listing no longer
exists, the correct thing is to ignore the error and continue, like we
already do for containers, pods, etc...
This was pretty easy to reproduce with these two commands running in
different terminals:
while :; do bin/podman volume create test && bin/podman volume rm test || break; done
while :; do bin/podman volume ls || break ; done
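A minimal sketch of the skip-on-removal pattern the fix follows;
inspectVolume, VolumeInfo, and errNoSuchVolume are illustrative
stand-ins, not podman's actual API:

    var errNoSuchVolume = errors.New("no such volume") // stand-in sentinel

    func listVolumes(names []string) ([]*VolumeInfo, error) {
        var reports []*VolumeInfo
        for _, name := range names {
            info, err := inspectVolume(name) // stand-in for the real lookup
            if errors.Is(err, errNoSuchVolume) {
                // Removed while we were listing; not a user error, skip it.
                continue
            }
            if err != nil {
                return nil, err
            }
            reports = append(reports, info)
        }
        return reports, nil
    }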
I have a slight feeling that this might solve #23913 but I am not too
sure, so I am not adding a Fixes tag here.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This commit resolves an issue where network creation and removal events were not being logged in `podman events`. A new function has been introduced in the `events` package to ensure consistent logging of network lifecycle events. This update will allow users to track network operations more effectively through the event log, improving visibility and aiding in debugging network-related issues.
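A rough sketch of what such a helper can look like; the identifiers
below are illustrative assumptions, not the actual podman events API:

    // newNetworkEvent records a network lifecycle event (e.g. create/remove).
    func (r *Runtime) newNetworkEvent(status Status, netName string) {
        e := Event{Type: "network", Status: status, Network: netName}
        if err := r.eventer.Write(e); err != nil {
            logrus.Errorf("Unable to write network event: %v", err)
        }
    }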
Fixes: #24032
Signed-off-by: Sainath Sativar <Sativar.sainath@gmail.com>
One of the problems with the Events() API was that you had to call it
in a new goroutine. This meant that the error returned by it had to be
read back via a second channel. This caused other bugs in the past, but
here the biggest problem is that basic errors such as invalid
since/until options were not directly returned to the caller.
It meant that in the API we were not able to write HTTP code 200
quickly, because we always waited for the first event or error from the
channels. This in turn made some clients unhappy, as they assumed the
server was hanging or timing out when no such events were generated.
To fix this we restructure the entire event flow. First, we spawn the
goroutine inside the eventer's Read() function, so not all the callers
have to. Then we can return basic errors quickly, without the
goroutine. The caller then checks the error like with any normal
function, and the API can use it to decide which status code to return.
Second, we now return errors/events in one channel, and the callers can
then decide to ignore or log them, which makes things a bit clearer.
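A rough sketch of the new shape; ReadResult, validateOptions, and
stream are illustrative assumptions, not the real code. Option
validation happens synchronously, while events and stream errors share
one channel:

    // ReadResult carries either an event or an error on the same channel.
    type ReadResult struct {
        Event *Event
        Error error
    }

    func (e *Eventer) Read(ctx context.Context, opts ReadOptions) (<-chan ReadResult, error) {
        if err := validateOptions(opts.Since, opts.Until); err != nil {
            return nil, err // e.g. bad since/until; the API can answer 400 right away
        }
        ch := make(chan ReadResult)
        go func() {
            defer close(ch)
            e.stream(ctx, opts, ch) // delivers both events and errors on ch
        }()
        return ch, nil
    }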
Fixes c46884aa93 ("podman events: check for an error after we finish reading events")
Fixes #23712
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Fixes #24267
This commit replaces a potentially unsafe slice-assignment with a call to `slices.Clone`.
This could prevent a bug where `saveCommand` and `loadCommand` could end up sharing an underlying array if `parentFlags` has a cap greater than its len.
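A self-contained demonstration of the aliasing hazard and the fix; the
variable names mirror the ones above, the values are made up:

    package main

    import (
        "fmt"
        "slices"
    )

    func main() {
        parentFlags := make([]string, 2, 4) // len 2, cap 4: spare capacity
        parentFlags[0], parentFlags[1] = "--log-level", "debug"

        // Unsafe: both appends fit in the spare capacity, so they write
        // to the same backing array and the second overwrites the first.
        saveCommand := append(parentFlags, "save")
        loadCommand := append(parentFlags, "load")
        fmt.Println(saveCommand[2], loadCommand[2]) // load load

        // Safe: each clone owns its own backing array.
        saveCommand = append(slices.Clone(parentFlags), "save")
        loadCommand = append(slices.Clone(parentFlags), "load")
        fmt.Println(saveCommand[2], loadCommand[2]) // save load
    }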
Signed-off-by: Zachary Hanham <z.hanham00@gmail.com>
Prior to this commit, many scp functions existed without option structs, which would make extending functionality (adding new options) impossible without breaking changes, or without adding redundant wrapper functions.
This commit adds in new option types for various scp related functions, and changes those functions' signatures to use the new options.
This commit also modifies the `ImageEngine.Scp()` function's interface to use the new opts.
The commit also renames the existing `ImageScpOptions` entity type to `ScpTransferImageOptions`. This is because the previous `ImageScpOptions` was inaccurate, as it is not the actual options for `ImageEngine.Scp()`. `ImageEngine.Scp()` should instead receive `ImageScpOptions`.
This commit should not change any behavior; however, it does break the existing functions' signatures.
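The general shape of the pattern, as a sketch with abridged,
illustrative types rather than the exact podman ones:

    // ImageScpOptions can grow new fields without breaking callers.
    type ImageScpOptions struct {
        Quiet bool
        // new options are added here instead of as new parameters
    }

    func Scp(ctx context.Context, src, dst string, opts ImageScpOptions) error {
        // ... perform the transfer using opts ...
        return nil
    }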
Signed-off-by: Zachary Hanham <z.hanham00@gmail.com>
mapMutex is initialized in the ContainerRm function and cannot be released from outside,
thus we must unlock the mutex before returning from the function.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
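A sketch of the corrected shape; containerInUse and removeContainer
are stand-ins. Using defer guarantees the unlock on every return path:

    var mapMutex sync.Mutex

    func containerRm(name string) error {
        mapMutex.Lock()
        defer mapMutex.Unlock() // released on every return path
        if containerInUse(name) { // stand-in check
            return fmt.Errorf("container %s is in use", name)
        }
        return removeContainer(name) // stand-in removal
    }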
Signed-off-by: Егор Макрушин <emakrushin@astralinux.ru>
Call the wait endpoint right away when a container is started, and not
only when attach is done; this allows wait to work when the container
has otherwise been removed (i.e. podman-remote run --rm). In that case
it was possible that wait failed and we then fell back to reading
events. However, based on some reports, there seems to be the chance
that event reading is not working for them either and returns the bad
error "Cannot get exit code: <nil>", which does not help anybody.
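A sketch of the ordering, with stand-in client calls (waitContainer
and attachContainer are assumptions, not the real bindings): the wait
request is issued immediately after start, so removal cannot outrun it:

    exitCh := make(chan int, 1)
    go func() {
        code, err := waitContainer(ctx, id) // issued right after start
        if err != nil {
            logrus.Errorf("Waiting for container: %v", err)
        }
        exitCh <- code
    }()
    attachContainer(ctx, id) // may return after the ctr is already gone
    exitCode := <-exitCh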
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When we are activated by systemd, the code assumed that we had a valid
URL, which was not the case, so it failed to parse the URL, which caused
the info call to fail every time.
This fixes two problems: first, add the scheme to the systemd-activated
listener URL so it can be parsed correctly; second, simply do not parse
it as a URL, as all we care about in the info call is whether it is unix
and whether the file path exists.
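A sketch of the simpler check, assuming only that the listener address
carries a unix:// prefix:

    func unixSocketExists(uri string) bool {
        path, ok := strings.CutPrefix(uri, "unix://")
        if !ok {
            return false // not a unix socket address at all
        }
        _, err := os.Stat(path)
        return err == nil
    }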
Fixes #24152
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As shown in #23671, these functions can return the raw error without
any useful context to the user, which makes it hard to understand where
things went wrong. Simply add some context to some error paths here.
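The change amounts to wrapping with %w, which adds context while
keeping the original error visible to errors.Is/errors.As;
storageMount is an illustrative stand-in:

    func mountContainer(id string) (string, error) {
        path, err := storageMount(id)
        if err != nil {
            return "", fmt.Errorf("mounting storage container %s: %w", id, err)
        }
        return path, nil
    }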
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When we check for a storage container mount we normally expect
ErrContainerUnknown when it does not exist. However, while we check if
it is actually mounted we can also get ErrLayerUnknown when the
container was removed between the Container and Mount checks, as they do
not happen under the same lock.
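A sketch of the handling, using the two error names from above;
isMounted stands in for the actual mount check:

    func storageContainerMounted(id string) (bool, error) {
        mounted, err := isMounted(id)
        if err != nil {
            if errors.Is(err, storage.ErrContainerUnknown) || errors.Is(err, storage.ErrLayerUnknown) {
                // Removed between the two checks: treat as not mounted.
                return false, nil
            }
            return false, err
        }
        return mounted, nil
    }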
Fixes #23671
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
These flags can affect the output of the HealthCheck log. Currently, when a container is configured with HealthCheck, the output from the HealthCheck command is only logged to the container status file, which is accessible via `podman inspect`.
It is also limited to the last five executions and the first 500 characters per execution.
This makes debugging past problems very difficult, since the only information available about the failure of the HealthCheck command is the generic `healthcheck service failed` record.
- The `--health-log-destination` flag sets the destination of the HealthCheck log.
- `none`: (default behavior) `HealthCheckResults` are stored in overlay containers. (For example: `$runroot/healthcheck.log`)
- `directory`: creates a log file named `<container-ID>-healthcheck.log` with JSON `HealthCheckResults` in the specified directory.
- `events_logger`: The log will be written with the logging mechanism set by `events_logger`. It also saves the log to a default directory, for performance on a system with a large number of logs.
- The `--health-max-log-count` flag sets the maximum number of attempts in the HealthCheck log file.
- A value of `0` indicates an infinite number of attempts in the log file.
- The default value is `5` attempts in the log file.
- The `--health-max-log-size` flag sets the maximum length of the log stored.
- A value of `0` indicates an infinite log length.
- The default value is `500` log characters.
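A possible invocation combining the three flags (image name and values
chosen only for illustration):

podman run -d --health-cmd "curl -f http://localhost" --health-log-destination /tmp/healthcheck-logs --health-max-log-count 0 --health-max-log-size 1000 my-image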
Add --health-max-log-count flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Add --health-max-log-size flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Add --health-log-destination flag
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
The kube generate command can now generate a YAML for
the Job kind, and the kube play command can create a pod
and containers with podman when passed a Job YAML.
Add relevant tests and docs for this.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Adds a note in the `podman machine info` manpage that clarifies
that `defaultmachine` in the `podman machine info` output does
not suggest that a user can set a default podman machine via
system connections.
Additionally adds a Podman 6.0 TODO comment to change the name of the
field to `ActiveMachineConnection` to better describe its purpose.
[NO NEW TESTS NEEDED]
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Many dependencies started using go 1.22, which means we have to follow
in order to update.
Disable the now deprecated exportloopref linter, as it was replaced by
copyloopvar since go fixed the loop variable copy problem in 1.22[1].
Another new change in go 1.22 is the for loop syntax over ints; the
intrange linter checks for this, but there are a lot of loops that would
have to be converted, so I didn't do it here and disabled the linter for
now, the old syntax is still fine (see the example after the link
below).
[1] https://go.dev/blog/loopvar-preview
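Both forms below are equivalent; the second is what the intrange
linter suggests:

    package main

    import "fmt"

    func main() {
        for i := 0; i < 3; i++ { // pre-1.22 style, still valid
            fmt.Println(i)
        }
        for i := range 3 { // new in go 1.22
            fmt.Println(i)
        }
    }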
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This commit vendors a pre-release version of `c/common:8483ef6022b4`.
It also adapts the code to the new `c/common/libimage` API, which
fixes an image listing race that was producing false warnings.
Fixes: #23331
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
The podman container cleanup process runs asynchronously, and by the
time it gets the lock it is possible another podman process already did
the cleanup and then did a new init() to start the container again. If
the cleanup process gets the lock at that point it will cause very weird
things.
This can be observed in the remote start API as CI flakes.
Fixes #23754
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This started off as an attempt to make `podman stop` on a
container started with `--rm` actually remove the container,
instead of just cleaning it up and waiting for the cleanup
process to finish the removal.
In the process, I realized that `podman run --rmi` was rather
broken. It was only done as part of the Podman CLI, not the
cleanup process (meaning it only worked with attached containers)
and the way it was wired meant that I was fairly confident that
it wouldn't work if I did a `podman stop` on an attached
container run with `--rmi`. I rewired it to use the same
mechanism that `podman run --rm` uses, so it should be a lot more
durable now, and I also wired it into `podman inspect` so you can
tell that a container will remove its image.
Tests have been added for the changes to `podman run --rmi`. No
tests for `stop` on a `run --rm` container as that would be racy.
Fixes #22852
Fixes RHEL-39513
Signed-off-by: Matt Heon <mheon@redhat.com>
The new golangci-lint version 1.60.1 has problems with typecheck when
linting remote files. We have certain packages that should never be
included in remote builds, but typecheck tries to compile all of them;
this never works, and it seems to ignore the exclude files we gave it.
The proper way to fix this is to mark all packages we only use locally
with !remote tags. This is a bit ugly but more correct. I also moved the
DecodeChanges() code around, as it is called from the client, so the
handlers package, which should only be remote, doesn't really fit
anyway.
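The constraint sits at the top of each local-only file (the package
name here is just an example):

    //go:build !remote

    package libpod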
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When we create a container we first create it in the storage and then
in the libpod db, so there is a tiny window where it is seen as a
storage ctr, but by the time we mount it we see it was a libpod
container.
Fixes #23637
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When the cidfile does not exist and ignore is set, the CLI parser skips
the file without error and we call into the backend code without any
names at all. This should logically be a NOP, but on remote it caused
all containers to be returned, which caused podman stop to stop
everything in this case.
Fixes #23554
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The podman container cleanup command is not really intended for human
use. Instead, each conmon will spawn this command after the container
exits to make sure we can clean up resources asynchronously. However,
this command will always race against other foreground processes such as
podman rm -fa. Therefore it is possible that the ctr was already removed
and we should not log errors in this case.
While these errors are normally not seen, as the command runs in the
background, you can see them if you enable syslog logging, and then they
just spam the log with useless errors, so just ignore them.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Currently there is no way to specify the mappings, so at least treat a
private user namespace as "auto".
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The pod spec HostUsers boolean only specifies whether a user namespace
is used or not. Hence, the podman-specific annotation must have a
higher precedence, since it defines how the user namespace is created.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Like commit 55749af0c7 but for podman *pod* stats, not the normal
podman stats. We must ignore ErrCtrStopped here as well, as this will
happen when the container process has exited.
While at it, remove a useless argument from the function, as it was
always nil, and restructure the logic flow to make it easier to read.
Fixes #23334
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Stats are read from the cgroup, and in order to know the cgroup we
check the pid's cgroup. However, there is a window where the pid has
exited and podman has not yet updated its internal state. In this case
the code returns ErrCtrStopped, so we should ignore this error as well.
Fixes #23334
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
If a pod is removed while calling podman pod stats, there is a race
where the command might fail with "no such pod". This is not a user
error; like the ps/ls commands, skip it and move on to the next one.
Fixes #23327
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When a container or volume is removed during the loop, this is not a
problem and we should just skip it, as it is not a user bug, just a
normal race.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The current code did something like this:
    lock()
    state := getState()
    unlock()
    // another process can modify the state here
    if state != running {
        lock()
        if getState() == running {
            return error // spurious "already running" failure
        }
        unlock()
    }
This of course is wrong, because between the first unlock() and the
second lock() call another process could have modified the state. This
meant that sometimes you would get a weird error on start, because the
internal setup errored as the container was already running.
In general, any state check without holding the lock is incorrect and
will result in race conditions. As such, refactor the code to combine
both StartAndAttach() and Attach() into one function that can handle
both. With that we can move the running check into the locked code.
Also use a typed error for this specific error case, so that the
callers can check for and ignore that specific error when needed. This
also allows us to fix races in the compat API that did a similar racy
state check.
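A sketch of the typed-error pattern (the names are illustrative, not
the actual libpod ones): the state is only checked under the lock, and
callers single out the "already running" case:

    var errCtrStateRunning = errors.New("container is already running")

    func (c *Container) startAndAttach() error {
        c.lock.Lock()
        defer c.lock.Unlock()
        if c.running { // checked only while holding the lock
            return errCtrStateRunning
        }
        // ... actual start/attach logic ...
        return nil
    }

    // Callers decide whether "already running" is worth reporting:
    if err := ctr.startAndAttach(); err != nil && !errors.Is(err, errCtrStateRunning) {
        return err
    }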
This commit slightly changes how we output the result: previously, a
start on an already running container would never print the id/name of
the container, which is confusing and sort of breaks idempotence. Now it
will include the output, except when --all is used; then it only reports
the ids that were actually started.
Fixes #23246
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We should never try to reexec when we are already root with
CAP_SYS_ADMIN. The code contained a bug when --cgroups=disabled is used,
as it tried to perform a reexec even when it was not needed.
Fixes: 900e29549a ("libpod: do not move podman with --cgroups=disabled")
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The code currently tried to avoid joining the userns from conmon
directly, and instead joined it only to read the pid file and then send
this back to us so we could join the userns. From the comment, this was
done because we could not read the pid file. However, this is no longer
true as of commit 49eb5af301, and the file is now always owned by the
real user.
This means we can just remove this special logic and join the namespace
directly there. A test has been added to check the rejoin logic with a
custom uidmapping.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When a user specifies an invalid connection in CONTAINER_CONNECTION,
then podman should return a proper error saying so. Currently it ignored
the error, and in rootFlags() just exited early without defining any
flags. This then caused a panic when trying to use the flags later.
In order to address this, first store the connection error in the
PodmanConfig struct and do not abort right away during flag setup. This
is important, as the user might have specified a flag with a valid
remote connection. As such we check all flags, and only when none were
given do we return the connection error.
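A sketch of that flow; the field and helper names are assumptions, not
the real code:

    type PodmanConfig struct {
        ConnectionError error // remembered instead of aborting during setup
    }

    func resolveConnectionError(cfg *PodmanConfig, anyConnFlagGiven bool) error {
        if anyConnFlagGiven {
            return nil // a valid flag overrides the broken CONTAINER_CONNECTION
        }
        return cfg.ConnectionError
    }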
Also, while at it, I noticed that the default connection reported via
podman --help was wrong, as it only used the old containers.conf field
and did not consider the podman-connections.json default.
New regression tests have been added to make sure it behaves correctly.
This fixes the problem reported in PR #22997.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add a new flag that allows overriding the pull options configured in
the storage.conf file.
e.g.: --pull-option="enable_partial_images=false" can be specified to
Podman to disable partial pulls even if they are enabled in the
configuration.
Leave it as a hidden configuration flag for now, since the API itself
is marked as experimental in c/storage.
Currently c/storage doesn't honor the overrides; this is being fixed
with https://github.com/containers/storage/pull/1966
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add a `podman system check` that performs consistency checks on local
storage, optionally removing damaged items so that they can be
recreated.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>