To debug a deadlock, we really want to know what lock is actually
locked, so we can figure out what is using that lock. This PR
adds support for this, using trylock to check if every lock on
the system is free or in use. Will really need to be run a few
times in quick succession to verify that it's not a transient
lock and it's actually stuck, but that's not really a big deal.
Signed-off-by: Matt Heon <mheon@redhat.com>
This is a nice quality-of-life change that should help to debug
situations where someone runs out of locks (usually when a bunch
of unused volumes accumulate).
Signed-off-by: Matt Heon <mheon@redhat.com>
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.
[NO NEW TESTS NEEDED]
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
golint, scopelint and interfacer are deprecated. golint is replaced by
revive. This linter is better because it will also check for our error
style: `error strings should not be capitalized or end with punctuation or a newline`
scopelint is replaced by exportloopref (already endabled)
interfacer has no replacement but I do not think this linter is
important.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
After a reboot, when we refresh Podman's state, we retrieved the
lock from the fresh SHM instance, but we did not mark it as
allocated to prevent it being handed out to other containers and
pods.
Provide a method for marking locks as in-use, and use it when we
refresh Podman state after a reboot.
Fixes#2900
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The original intent behind the requirement was to ensure that, if
two SHM lock structs were open at the same time, we should not
make such a runtime available to the user, and should clean it up
instead.
It turns out that we don't even need to open a second SHM lock
struct - if we get an error mapping the first one due to a lock
count mismatch, we can just delete it, and it cleans itself up
when it errors. So there's no reason not to return a valid
runtime.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Renumber is a way of renumbering container locks after the number
of locks available has changed.
For now, renumber only works with containers.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Remove runtime's lockDir as it is no longer needed after the lock
rework.
Add a trivial in-memory lock manager for unit testing
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>