syncContainer: transition from stopping to exited

Allow the cleanup process (and others) to transition the container from
`stopping` to `exited`.  This fixes a race condition detected in #14859
where the cleanup process kicks in _before_ the stopping process can
read the exit file.  Prior to this fix, the cleanup process left the
container in the `stopping` state and removed the conmon files, such
that the stopping process also left the container in this state as it
could not read the exit file.  Hence, `podman wait` timed out (see the
23 seconds execution time of the test [1]) due to the unexpected/invalid
state and the test failed.
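
The effect of the state gate can be sketched with a simplified model.
This is an illustration only, not the libpod code: the `container`
type, its `exitFileSet` field, and the `sync` helper are hypothetical
stand-ins, and only the name `ensureState` mirrors the real method.

```go
package main

import "fmt"

// State is a simplified stand-in for libpod's container state type.
type State int

const (
	Created State = iota
	Running
	Stopped
	Stopping
	Paused
	Exited
)

// container is a hypothetical, stripped-down model of a container.
type container struct {
	state       State
	exitFileSet bool // stands in for the conmon exit file being present
}

// ensureState reports whether the container is in one of the given states.
func (c *container) ensureState(states ...State) bool {
	for _, s := range states {
		if c.state == s {
			return true
		}
	}
	return false
}

// sync moves the container to Exited when an exit file exists, but only
// if its current state is in the allowed set.
func (c *container) sync(allowed ...State) {
	if c.ensureState(allowed...) && c.exitFileSet {
		c.state = Exited
	}
}

func main() {
	// Without Stopping in the allowed set, the container is stuck.
	before := &container{state: Stopping, exitFileSet: true}
	before.sync(Created, Running, Stopped, Paused)
	fmt.Println("stuck in stopping:", before.state == Stopping)

	// With Stopping allowed, any process can complete the transition.
	after := &container{state: Stopping, exitFileSet: true}
	after.sync(Created, Running, Stopped, Stopping, Paused)
	fmt.Println("moved to exited:", after.state == Exited)
}
```

In this model, whichever process syncs first while the exit file still
exists performs the transition, so the order in which the cleanup and
stopping processes run no longer matters.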

Further, turn the warning during stop into a debug message since it is
a natural race due to the daemonless/concurrent architecture and
nothing to worry about.

[NO NEW TESTS NEEDED] since we can only monitor if #14859 continues
flaking or not.

[1] https://storage.googleapis.com/cirrus-ci-6707778565701632-fcae48/artifacts/containers/podman/6210434704343040/html/sys-remote-fedora-36-rootless-host.log.html#t--00205

Fixes: #14859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Valentin Rothberg
2022-07-27 15:24:02 +02:00
parent 0bf6ee61dd
commit 389a4a6cc6

@@ -347,7 +347,7 @@ func (c *Container) syncContainer() error {
 	}
 	// If runtime knows about the container, update its status in runtime
 	// And then save back to disk
-	if c.ensureState(define.ContainerStateCreated, define.ContainerStateRunning, define.ContainerStateStopped, define.ContainerStatePaused) {
+	if c.ensureState(define.ContainerStateCreated, define.ContainerStateRunning, define.ContainerStateStopped, define.ContainerStateStopping, define.ContainerStatePaused) {
 		oldState := c.state.State
 		if err := c.checkExitFile(); err != nil {
@@ -1316,10 +1316,10 @@ func (c *Container) stop(timeout uint) error {
 	// Since we're now subject to a race condition with other processes who
 	// may have altered the state (and other data), let's check if the
-	// state has changed. If so, we should return immediately and log a
-	// warning.
+	// state has changed. If so, we should return immediately and leave
+	// breadcrumbs for debugging if needed.
 	if c.state.State != define.ContainerStateStopping {
-		logrus.Warnf(
+		logrus.Debugf(
 			"Container %q state changed from %q to %q while waiting for it to be stopped: discontinuing stop procedure as another process interfered",
 			c.ID(), define.ContainerStateStopping, c.state.State,
 		)