6 Commits

Author SHA1 Message Date
b59648a2d6 hack/podman_cleanup_tracer.bt: check map before deleting keys
It seems that bpftrace versions since 0.22 log a warning if we try to
delete a key that does not exist.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2025-07-09 14:02:11 +02:00
1f8bc9d736 hack/podman_cleanup_tracer.bt: clamp str size for strcontains()
On bpftrace 0.22 this fails to compile and load so the script currently
does not show us anything in CI there.

We need to clamp the string size a bit; 128 chars seems more than enough
for the podman/conmon binary path length.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2025-07-09 14:02:10 +02:00
5e5bfadf93 hack/podman_cleanup_tracer.bt: use new max str length
The default has been set to 1024 which should be good enough and better
than having to unroll this loop like that.
This is supported since bpftrace 0.22 which is in fedora 42.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2025-07-09 14:02:10 +02:00
0fb78905c1 Revert "Instrument cleanup tracer to log weird volume removal flake"
This reverts commit d633824a9527b9ec937cdfc8aacc890ec3249127.

The issue has been fixed in commit 9a0c0b2eef and I have not seen it
since, so remove this special case.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2025-04-11 15:12:33 +02:00
d633824a95 Instrument cleanup tracer to log weird volume removal flake
Debug for #23913. I thought that if we have no idea which process is nuking
the volume then we need to figure this out. As there is no reproducer
we can (ab)use the cleanup tracer. Simply trace all unlink syscalls to
see which process deletes our special named volume. Given the volume
name is used as path on the fs and is deleted on volume rm we should
know exactly which process deleted it the next time hopefully.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-30 18:50:07 +01:00
0b59f67c3a add eBPF program to trace podman cleanup errors
Add a new program based on bpftrace[1] to trace all podman processes
with arguments and exit code/signals. Additionally this captures stderr
from all podman container cleanup processes spawned by conmon which
otherwise go to /dev/null and are never seen in any CI logs.
Hopefully this allows us to debug the strange network cleanup errors seen in
CI. My plan is to add this to the cirrus setup and upload the logs so we
can check them when the flakes happen.

[1] https://github.com/bpftrace/bpftrace

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-24 12:47:03 +02:00