podman rmi: handle corrupted storage better

The storage can easily be corrupted when a build or pull process (or any
process *writing* to the storage) has been killed. The corruption
surfaces in Podman reporting that a given layer could not be found in
the layer tree. Those errors must not be fatal but only logged, such
that the image removal may continue. Otherwise, a user may be unable to
remove an image.

[NO TESTS NEEDED] as I do not yet have a reliable way to cause such a
storage corruption.

Reported-in: https://github.com/containers/podman/issues/8148#issuecomment-787598940
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2025-12-03 03:39:44 +08:00 · 2021-03-01 09:39:03 +01:00
parent b154c519ac
commit 3752016338
1 changed file with 6 additions and 2 deletions
--- a/libpod/image/layer_tree.go
+++ b/libpod/image/layer_tree.go
@@ -4,7 +4,6 @@ import (
 	"context"

 	ociv1 "github.com/opencontainers/image-spec/specs-go/v1"
-	"github.com/pkg/errors"
 	"github.com/sirupsen/logrus"
 )

@@ -188,7 +187,12 @@ func (t *layerTree) parent(ctx context.Context, child *Image) (*Image, error) {

 	node, exists := t.nodes[child.TopLayer()]
 	if !exists {
-		return nil, errors.Errorf("layer not found in layer tree: %q", child.TopLayer())
+		// Note: returning an error in this case turned out to be a
+		// mistake. Users may not be able to recover, so we now log a
+		// warning that guides them toward resolving the issue and
+		// make the error non-fatal.
+		logrus.Warnf("Layer %s not found in layer tree. The storage may be corrupted, consider running `podman system reset`.", child.TopLayer())
+		return nil, nil
 	}

 	childOCI, err := t.toOCI(ctx, child)
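
The behavior change in this hunk can be illustrated with a minimal, standalone sketch. It is not the libpod implementation: the `image` and `layerTree` types, the `remove` helper, and the standard-library `log` call (in place of logrus) are all hypothetical simplifications. It only shows the pattern of logging a missing layer and returning `nil, nil`, so that the caller can carry on with the removal.

```go
package main

import (
	"fmt"
	"log"
)

// image is a hypothetical stand-in for libpod's Image type.
type image struct {
	id       string
	topLayer string
}

// layerTree is a simplified, hypothetical model of the layer tree.
type layerTree struct {
	owner       map[string]*image // top-layer ID -> image using it
	parentLayer map[string]string // layer ID -> parent layer ID
}

// parent returns the image that owns the parent layer of child, or nil if
// there is none. A layer missing from the tree is treated as possible
// storage corruption: it is logged and reported as "no parent" rather than
// as an error. The error return is kept only to mirror the shape of the
// original method; this sketch never returns a non-nil error.
func (t *layerTree) parent(child *image) (*image, error) {
	parentID, exists := t.parentLayer[child.topLayer]
	if !exists {
		log.Printf("warning: layer %s not found in layer tree; the storage may be corrupted", child.topLayer)
		return nil, nil
	}
	return t.owner[parentID], nil
}

// remove is a hypothetical caller. It aborts only on a real error from
// parent, so a missing layer no longer blocks the removal.
func (t *layerTree) remove(img *image) error {
	if _, err := t.parent(img); err != nil {
		return fmt.Errorf("looking up parent of %s: %w", img.id, err)
	}
	delete(t.owner, img.topLayer)
	delete(t.parentLayer, img.topLayer)
	return nil
}

func main() {
	tree := &layerTree{
		owner:       map[string]*image{"base": {id: "img-base", topLayer: "base"}},
		parentLayer: map[string]string{"base": ""},
	}

	// This image references a layer the tree does not know about.
	orphan := &image{id: "img-orphan", topLayer: "missing"}
	if err := tree.remove(orphan); err != nil {
		fmt.Println("removal failed:", err)
		return
	}
	fmt.Println("removal continued despite the missing layer")
}
```

One consequence of this design is that callers can no longer distinguish "no parent" from "layer missing" by the return values alone; the warning in the log is the only signal of possible corruption.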