Instrument cleanup tracer to log weird volume removal flake

Debug for #23913. I thought that if we have no idea which process is
nuking the volume, then we need to figure that out. As there is no
reproducer, we can (ab)use the cleanup tracer: simply trace all unlink
syscalls to see which process deletes our specially named volume. Since
the volume name is used as a path on the fs and that path is deleted on
volume rm, we should hopefully know exactly which process deleted it the
next time the flake happens.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Paul Holzinger
2024-10-30 18:45:39 +01:00
parent f139bc17b3
commit d633824a95
2 changed files with 16 additions and 1 deletions

@@ -149,3 +149,17 @@ tracepoint:syscalls:sys_enter_write
$offset += $len
}
}
// HACK: debug for https://github.com/containers/podman/issues/23913
// The test uses "ebpf-debug-23913" as the volume name, and because volume rm
// deletes that path we can trap the process here to find out who actually
// deletes it.
tracepoint:syscalls:sys_enter_unlink*
/ strcontains(str(args.pathname), "ebpf-debug-23913") /
{
    printf("Special issue 23913 volume deleted by pid %d: ", pid);
    // This can fail to open the file because it is done in user space and
    // is thus racy if the process exits quickly.
    cat("/proc/%d/cmdline", pid);
    print("");
}
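
The race mentioned in the comment above can be illustrated outside of bpftrace. The following is a minimal Python sketch, not part of this commit, of what a user-space read of /proc/<pid>/cmdline involves. The helper name read_cmdline is hypothetical; the sketch assumes a Linux /proc filesystem and treats a vanished process as "unknown" rather than as an error.

```python
import os
from typing import List, Optional


def read_cmdline(pid: int) -> Optional[List[str]]:
    """Return the argv of `pid`, or None if it can no longer be read.

    Hypothetical helper for illustration only; assumes Linux procfs.
    """
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            raw = f.read()
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        # The process exited (or is not accessible) between the unlink
        # syscall and this read: this is the race window.
        return None
    # Arguments are NUL-separated; skip the empty trailing field.
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


if __name__ == "__main__":
    # Reading our own cmdline always works because we are still alive.
    print(read_cmdline(os.getpid()))
```

A short-lived process can be gone before the read happens, in which case the sketch returns None. The same window is presumably why the tracer may print the pid without a command line.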