binutils-gdb

mirror of https://github.com/espressif/binutils-gdb.git synced 2025-06-09 17:33:24 +08:00

Author	SHA1	Message	Date
Tom de Vries	1bcaeecb7f	[gdb/testsuite] Add xfail case in gdb.python/py-record-btrace.exp I came across: ... (gdb) PASS: gdb.python/py-record-btrace.exp: prepare record: stepi 100 python insn = r.instruction_history^M warning: Non-contiguous trace at instruction 1 (offset = 0x3e10).^M (gdb) FAIL: gdb.python/py-record-btrace.exp: prepare record: python insn = r.i\ nstruction_history ... I'm assuming it's the same root cause as for the already present XFAIL. Fix this by recognizing the above warning in the xfail regexp. Tested on x86_64-linux, although so far I was not able to trigger the warning again. Approved-By: Markus T. Metzger <markus.t.metzger@intel.com>	2023-02-20 11:16:02 +01:00
Tom de Vries	13d4a4bd5a	[gdb/testsuite] Fix gdb.threads/schedlock.exp for gcc 4.8.5 Since commit 9af467b8240 ("[gdb/testsuite] Fix gdb.threads/schedlock.exp on fast cpu"), the test-case fails for gcc 4.8.5. The problem is that for gcc 4.8.5, the commit turned a two-line loop: ... (gdb) next 78 while (*myp > 0) (gdb) next 81 MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; (gdb) next 78 while (*myp > 0) ... into a three-line loop: ... (gdb) next 83 MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; (gdb) next 84 cnt++; (gdb) next 85 } (gdb) next 83 MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; (gdb) ... and the test-case doesn't expect this. Fix this by reverting back to the original loop shape as much as possible by: - removing the cnt++ line - replacing "while (1)" with "while (one)", where one is a volatile variable set to 1. Tested on x86_64-linux, using compilers: - gcc 4.8.5, 7.5.0, 12.2.1 - clang 4.0.1, 13.0.1	2023-02-20 11:16:02 +01:00
Tom Tromey	47fe57c928	Fix "start" for D, Rust, etc The new DWARF indexer broke "start" for some languages. For D, it is broken because, while the code in cooked_index_shard::add specifically excludes Ada, it fails to exclude D. This means that the C "main" will be detected as "main" here -- whereas what is intended is for the code in find_main_name to use d_main_name to find the name. The Rust compiler, on the other hand, uses DW_AT_main_subprogram. However, the code in dwarf2_build_psymtabs_hard fails to create a fully-qualified name, so the name always ends up as plain "main". For D and Ada, a very simple approach suffices: remove the check against "main" from cooked_index_shard::add. This also has the benefit of slightly speeding up DWARF indexing. I assume this approach will work for Pascal and Modula-2 as well, but I don't have a way to test those at present. For Rust, though, this is not sufficient. And, computing the fully-qualified name in dwarf2_build_psymtabs_hard will crash, because cooked_index_entry::full_name uses the canonical name -- and that is not computed until after canonicalization. However, we don't want to wait for canonicalization to be done before computing the main name. That would remove any benefit from doing canonicalization in the background. This patch solves this dilemma by noticing that languages using DW_AT_main_subprogram are, currently, disjoint from languages requiring canonicalization. Because of this, we can add a parameter to full_name to let us avoid crashes, slowdowns, and races here. This is kind of tricky and ugly, so I've tried to comment it sufficiently. While doing this, I had to change gdb.dwarf2/main-subprogram.exp. A different possibility here would be to ignore the canonicalization needs of C in this situation, because those only affect certain types. However, I chose this approach because the test case is artificial anyhow.
A long time ago, in an earlier threading attempt, I changed the global current_language to be a function (hidden behind a macro) to let us attempt lazily computing the current language. Perhaps this approach could still be made to work. However, that also seemed rather tricky, more so than this patch. Reviewed-By: Andrew Burgess <aburgess@redhat.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30116	2023-02-18 15:41:38 -07:00
Tom Tromey	e8eca7a6b6	Fix crash in go_symbol_package_name go_symbol_package_name asserts that it is only passed a Go symbol, but this is not enforced by one caller. It seems simplest to just check and return early in this case. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17876 Reviewed-By: Andrew Burgess <aburgess@redhat.com>	2023-02-17 19:05:04 -05:00
Andrew Burgess	733da2ced8	gdb: fix regression in gdb.xml/maint_print_struct.exp A regression in gdb.xml/maint_print_struct.exp was introduced with commit: commit 81b86eced24f905545b58aa6c27478104c364976 Date: Fri Jan 6 09:30:40 2023 -0700 Do not record a rejected target description The test relied on an invalid target description being stored within the tdesc_info of the current inferior, the above commit stopped this behaviour. Update the test to check that the invalid architecture is NOT stored, and then check printing the target description directly from the file. Approved-By: Tom Tromey <tromey@adacore.com>	2023-02-17 22:29:09 +00:00
Tom de Vries	ab3fdfe6e4	[gdb/testsuite] Simplify gdb.arch/amd64-disp-step-avx.exp On SLE-11, with glibc 2.11.3, I run into: ... (gdb) PASS: gdb.arch/amd64-disp-step-avx.exp: vex3: \ var128 has expected value after continue^M Continuing.^M ^M Program received signal SIGSEGV, Segmentation fault.^M 0x0000000000400283 in _exit (status=0) at \ ../sysdeps/unix/sysv/linux/_exit.c:33^M 33 ../sysdeps/unix/sysv/linux/_exit.c: No such file or directory.^M (gdb) FAIL: gdb.arch/amd64-disp-step-avx.exp: \ continue until exit at amd64-disp-step-avx ... This is not related to gdb, we get the same result by just running the exec. The problem is that the test-case: - calls glibc's _exit, and - uses -nostartfiles -static, putting the burden for any necessary initialization for calling glibc's _exit on the test-case itself. So, when we get to the second insn in _exit: ... 000000000040acb0 <_exit>: 40acb0: 48 63 d7 movslq %edi,%rdx 40acb3: 64 4c 8b 14 25 00 00 mov %fs:0x0,%r10 ... no glibc-related initialization is done, and we run into the segfault. Adding this (borrowed from __libc_start_main) in _start in the .S file is sufficient to fix it: ... .rept 200 nop + call __pthread_initialize_minimal .endr ... But that already doesn't compile with say glibc 2.31, and regardless I think this sort of fix is too fragile. We could of course fix this by simply not running to exit. But ideally we'd have an exec that doesn't segfault when you just run it. Alternatively, we could hand-code an _exit syscall and bypass glibc altogether. But I'd rather fix this in a way that simplifies the test-case. Taking a step back, the -nostartfiles -static was added to address that the xmm registers were not zero at main (which AFAICT is a valid thing to happen). [ The change itself silently broke the test-case, needing further fixing by commit 40310f30a51 ("gdb: make gdb.arch/amd64-disp-step-avx.exp actually test displaced stepping").
] Instead, simplify things by reverting to the original situation: - no -nostartfiles -static compilation flags, - no _start in the .S file, - use exit instead of _exit in the .S file, and fix the original problem by setting the xmm registers to zero rather than checking that they're zero. Now that we're no longer forcing -static, add nopie to the flags to prevent compilation failure with target board unix/-fPIE/-pie. Tested on x86_64-linux. PR testsuite/30132 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30132	2023-02-17 15:33:18 +01:00
Pedro Alves	141cd15842	Don't throw quit while handling inferior events, part II I noticed that if Ctrl-C was typed just while GDB is evaluating a breakpoint condition in the background, and GDB ends up reaching out to the Python interpreter, then the breakpoint condition would still fail, like: c& Continuing. (gdb) Error in testing breakpoint condition: Quit That happens because while evaluating the breakpoint condition, we enter Python, and end up calling PyErr_SetInterrupt (it's called by gdbpy_set_quit_flag, in frame #0): (top-gdb) bt #0 gdbpy_set_quit_flag (extlang=0x558c68f81900 <extension_language_python>) at ../../src/gdb/python/python.c:288 #1 0x0000558c6845f049 in set_quit_flag () at ../../src/gdb/extension.c:785 #2 0x0000558c6845ef98 in set_active_ext_lang (now_active=0x558c68f81900 <extension_language_python>) at ../../src/gdb/extension.c:743 #3 0x0000558c686d3e56 in gdbpy_enter::gdbpy_enter (this=0x7fff2b70bb90, gdbarch=0x558c6ab9eac0, language=0x0) at ../../src/gdb/python/python.c:212 #4 0x0000558c68695d49 in python_on_memory_change (inferior=0x558c6a830b00, addr=0x555555558014, len=4, data=0x558c6af8a610 "") at ../../src/gdb/python/py-inferior.c:146 #5 0x0000558c6823a071 in std::__invoke_impl<void, void (&)(inferior, unsigned long, long, unsigned char const), inferior, unsigned long, long, unsigned char const> (__f=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior, CORE_ADDR, ssize_t, bfd_byte const)>) at /usr/include/c++/11/bits/invoke.h:61 #6 0x0000558c68237591 in std::__invoke_r<void, void (&)(inferior, unsigned long, long, unsigned char const), inferior, unsigned long, long, unsigned char const> (__fn=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior, CORE_ADDR, ssize_t, bfd_byte const)>) at /usr/include/c++/11/bits/invoke.h:111 #7 0x0000558c68233e64 in std::_Function_handler<void (inferior, unsigned long, long, unsigned char const), void ()(inferior, unsigned long, long, unsigned char 
const)>::_M_invoke(std::_Any_data const&, inferior&&, unsigned long&&, long&&, unsigned char const&&) (__functor=..., __args#0=@0x7fff2b70bd40: 0x558c6a830b00, __args#1=@0x7fff2b70bd38: 93824992247828, __args#2=@0x7fff2b70bd30: 4, __args#3=@0x7fff2b70bd28: 0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:290 #8 0x0000558c6830a96e in std::function<void (inferior, unsigned long, long, unsigned char const)>::operator()(inferior, unsigned long, long, unsigned char const) const (this=0x558c6a8ecd98, __args#0=0x558c6a830b00, __args#1=93824992247828, __args#2=4, __args#3=0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:590 #9 0x0000558c6830a620 in gdb::observers::observable<inferior, unsigned long, long, unsigned char const*>::notify (this=0x558c690828c0 <gdb::observers::memory_changed>, args#0=0x558c6a830b00, args#1=93824992247828, args#2=4, args#3=0x558c6af8a610 "") at ../../src/gdb/../gdbsupport/observable.h:166 #10 0x0000558c68309d95 in write_memory_with_notification (memaddr=0x555555558014, myaddr=0x558c6af8a610 "", len=4) at ../../src/gdb/corefile.c:363 #11 0x0000558c68904224 in value_assign (toval=0x558c6afce910, fromval=0x558c6afba6c0) at ../../src/gdb/valops.c:1190 #12 0x0000558c681e3869 in expr::assign_operation::evaluate (this=0x558c6af8e150, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/expop.h:1902 #13 0x0000558c68450c89 in expr::logical_or_operation::evaluate (this=0x558c6afab060, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:2330 #14 0x0000558c6844a896 in expression::evaluate (this=0x558c6afcfe60, expect_type=0x0, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:110 #15 0x0000558c6844a95e in evaluate_expression (exp=0x558c6afcfe60, expect_type=0x0) at ../../src/gdb/eval.c:124 #16 0x0000558c682061ef in breakpoint_cond_eval (exp=0x558c6afcfe60) at ../../src/gdb/breakpoint.c:4971 ... 
The fix is to disable cooperative SIGINT handling while handling inferior events, so that SIGINT is saved in the global quit flag, and not in the extension language, while handling an event. This commit augments the testcase added by the previous commit to test this scenario as well. Approved-By: Tom Tromey <tom@tromey.com> Change-Id: Idf8ab815774ee6f4b45ca2d0caaf30c9b9f127bb	2023-02-15 20:58:10 +00:00
Pedro Alves	0ace6ace1b	Don't throw quit while handling inferior events This implements what I suggested here: https://inbox.sourceware.org/gdb-patches/ab97c553-f406-b094-cdf3-ba031fdea925@palves.net/ Here is the current default quit_handler, a function that ends up called by the QUIT macro: void default_quit_handler (void) { if (check_quit_flag ()) { if (target_terminal::is_ours ()) quit (); else target_pass_ctrlc (); } } As we can see above, when the inferior is running in the foreground, then a Ctrl-C is translated into a call to target_pass_ctrlc(). The target_terminal::is_ours() case above is there to handle the scenario where GDB has the terminal, meaning it is handling some command the user typed, like "list", or "p a + b" or some such. However, when the inferior is running in the background, say with "c&", GDB also has the terminal. Run control handling is now done in the "background". The CLI is responsive to user commands. If users type Ctrl-C, they're expecting it to interrupt whatever command they next type in the CLI, which again, could be "list", "p a + b", etc. It's as if background run control was handled by a separate thread, and the Ctrl-C is meant to go to the main thread, handling the CLI. However, when handling an event, inside fetch_inferior_event & friends, a Ctrl-C _also_ results in a Quit exception, from the same default_quit_handler function shown above. This quit aborts run control handling, breakpoint condition evaluation, etc., and may even leave run control in an inconsistent state. The testcase added by this patch illustrates this. The test program just loops a number of times calling the "foo" function. The idea is to set a breakpoint in the "foo" function with a condition that sends SIGINT to GDB, and then evaluates to false, which results in the program being re-resumed in the background. The SIGINT-sending emulates pressing Ctrl-C just while GDB was evaluating the breakpoint condition, except, it's more deterministic.
It looks like this: (gdb) p $counter = 0 $1 = 0 (gdb) b foo if $counter++ == 10 || $_shell("kill -SIGINT `pidof gdb`") != 0 Breakpoint 2 at 0x555555555131: file gdb.base/bg-exec-sigint-bp-cond.c, line 21. (gdb) c& Continuing. (gdb) After that background continue, the breakpoint should be hit 10 times, and we should see 10 "Quit" being printed on the screen. As if the user typed Ctrl-C on the prompt a number of times with no inferior running: (gdb) <<< Ctrl-C (gdb) Quit <<< Ctrl-C (gdb) Quit <<< Ctrl-C (gdb) However, here's what you see instead: (gdb) c& Continuing. (gdb) Quit (gdb) Just one Quit, and nothing else. If we look at the thread's state, we see: (gdb) info threads Id Target Id Frame * 1 Thread 0x7ffff7d6f740 (LWP 112192) "bg-exec-sigint-" foo () at gdb.base/bg-exec-sigint-bp-cond.c:21 So the thread stopped, but we didn't report a stop... Issuing another continue shows the same immediate-and-silent-stop: (gdb) c& Continuing. (gdb) Quit (gdb) p $counter $2 = 2 As mentioned, since the run control handling, and breakpoint and watchpoint evaluation, etc. are running in the background from the perspective of the CLI, when users type Ctrl-C in this situation, they're thinking of aborting whatever other command they were typing or running at the prompt, not the run control side, not the previous "c&" command. So I think that we should install a custom quit_handler while inside fetch_inferior_event, where we already disable pagination and other things for a similar reason. This custom quit handler does nothing if GDB has the terminal, and forwards Ctrl-C to the inferior otherwise. With the patch implementing that, and the same testcase, here's what you see instead: (gdb) p $counter = 0 $1 = 0 (gdb) b foo if $counter++ == 10 || $_shell("kill -SIGINT `pidof gdb`") != 0 Breakpoint 2 at 0x555555555131: file gdb.base/bg-exec-sigint-bp-cond.c, line 21. (gdb) c& Continuing.
(gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Quit (gdb) Breakpoint 2, foo () at gdb.base/bg-exec-sigint-bp-cond.c:21 21 return 0; Approved-By: Tom Tromey <tom@tromey.com> Change-Id: I1f10d99496a7d67c94b258e45963e83e439e1778	2023-02-15 20:58:10 +00:00
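The difference between the two quit handlers described in this commit can be modelled in a few lines. This is a minimal Python sketch, not GDB's code: the `Session` class, its attribute names, and the `ctrlc_passed` counter are hypothetical, and leaving the quit flag pending when GDB owns the terminal is an assumption about what "does nothing" means here.

```python
class Quit(Exception):
    """Stand-in for the Quit exception thrown by quit ()."""


class Session:
    """Hypothetical model of the state the quit handlers consult."""

    def __init__(self, terminal_is_ours):
        self.terminal_is_ours = terminal_is_ours
        self.quit_flag = False   # set when SIGINT arrives
        self.ctrlc_passed = 0    # counts Ctrl-C forwarded to the inferior

    def check_quit_flag(self):
        # Return the flag and clear it, as the quoted handler relies on.
        was_set = self.quit_flag
        self.quit_flag = False
        return was_set

    def default_quit_handler(self):
        # Mirrors the default_quit_handler quoted in the commit message.
        if self.check_quit_flag():
            if self.terminal_is_ours:
                raise Quit()
            self.ctrlc_passed += 1

    def event_quit_handler(self):
        # Handler used while handling an inferior event: do nothing if
        # GDB has the terminal (assumption: the flag stays pending for
        # the CLI), otherwise forward Ctrl-C to the inferior.
        if self.terminal_is_ours:
            return
        if self.check_quit_flag():
            self.ctrlc_passed += 1
```

With the event handler installed, a SIGINT that arrives during breakpoint condition evaluation no longer aborts run control; in this model it is either forwarded or left for the prompt to report.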
Pedro Alves	91265a7d7c	Add new "$_shell(CMD)" internal function For testing a following patch, I wanted a way to send a SIGINT to GDB from a breakpoint condition. And I didn't want to do it from a Python breakpoint or Python function, as I wanted to exercise non-Python code paths. So I thought I'd add a new $_shell internal function, that runs a command under the shell, and returns the exit code. With this, I could write: (gdb) b foo if $_shell("kill -SIGINT $gdb_pid") != 0 || <other condition> I think this is generally useful, hence I'm proposing it here. Here's the new function in action: (gdb) p $_shell("true") $1 = 0 (gdb) p $_shell("false") $2 = 1 (gdb) p $_shell("echo hello") hello $3 = 0 (gdb) p $_shell("foobar") bash: line 1: foobar: command not found $4 = 127 (gdb) help function _shell $_shell - execute a shell command and returns the result. Usage: $_shell (command) Returns the command's exit code: zero on success, non-zero otherwise. (gdb) NEWS and manual changes included. Approved-By: Andrew Burgess <aburgess@redhat.com> Approved-By: Tom Tromey <tom@tromey.com> Approved-By: Eli Zaretskii <eliz@gnu.org> Change-Id: I7e36d451ee6b428cbf41fded415ae2d6b4efaa4e	2023-02-15 20:58:00 +00:00
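The exit-code behaviour shown for $_shell can be mimicked outside GDB. The following is a minimal Python sketch, not GDB's implementation: `shell_exit_code` is a hypothetical name, and the example assumes a POSIX shell where `true` and `false` are available.

```python
import subprocess


def shell_exit_code(command: str) -> int:
    """Run COMMAND under the shell and return its exit status."""
    # shell=True hands the string to the shell, so shell syntax such as
    # backticks or pipes works as it would at a prompt.
    completed = subprocess.run(command, shell=True)
    return completed.returncode


if __name__ == "__main__":
    print(shell_exit_code("true"))
    print(shell_exit_code("false"))
```

A condition like `shell_exit_code(...) != 0` then plays the same role as the `$_shell(...) != 0` clause in the breakpoint condition above.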
Pedro Alves	751495be92	Make "ptype INTERNAL_FUNCTION" in Ada print like other languages Currently, printing the type of an internal function in Ada shows double <>s, like: (gdb) with language ada -- ptype $_isvoid type = <<internal function>> while all other languages print it with a single <>, like: (gdb) with language c -- ptype $_isvoid type = <internal function> I don't think there's a reason that Ada needs to be different. We currently print the double <>s because we take this path in ada_print_type: switch (type->code ()) { default: gdb_printf (stream, "<"); c_print_type (type, "", stream, show, level, language_ada, flags); gdb_printf (stream, ">"); break; ... and the type's name already has the <>s. Fix this by simply adding an early check for TYPE_CODE_INTERNAL_FUNCTION. Approved-By: Andrew Burgess <aburgess@redhat.com> Approved-By: Tom Tromey <tom@tromey.com> Change-Id: Ic2b6527b9240a367471431023f6e27e6daed5501 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30105	2023-02-15 20:56:57 +00:00
Pedro Alves	a975d4e6bc	Fix "ptype INTERNAL_FUNC" (PR gdb/30105) Currently, looking at the type of an internal function, like below, hits an odd error: (gdb) ptype $_isvoid type = <internal function>type not handled in c_type_print_varspec_prefix() That is an error thrown from c-typeprint.c:c_type_print_varspec_prefix, where it reads: ... case TYPE_CODE_DECFLOAT: case TYPE_CODE_FIXED_POINT: /* These types need no prefix. They are listed here so that gcc -Wall will reveal any types that haven't been handled. */ break; default: error (_("type not handled in c_type_print_varspec_prefix()")); break; Internal function types have type code TYPE_CODE_INTERNAL_FUNCTION, which is not explicitly handled by that switch. That comment quoted above says that gcc -Wall will reveal any types that haven't been handled, but that's not actually true, at least with modern GCCs. You would need to enable -Wswitch-enum for that, which we don't. If I do enable that warning, then I see that we're missing handling for the following type codes: TYPE_CODE_INTERNAL_FUNCTION, TYPE_CODE_MODULE, TYPE_CODE_NAMELIST, TYPE_CODE_XMETHOD TYPE_CODE_MODULE and TYPE_CODE_NAMELIST are Fortran-specific, so it'd be a little weird to handle them here. I tried to reach this code with TYPE_CODE_XMETHOD, but couldn't figure out how to. ptype on an xmethod isn't treated specially, it just complains that the method doesn't exist. I've extended the gdb.python/py-xmethods.exp testcase to make sure of that. My thinking is that whatever type code we add next, the most likely scenario is that it won't need any special handling, so we'd just be adding another case to that "do nothing" list. If we do need special casing for whatever type code, I think that tests added at the same time as the feature would uncover it anyhow. If we do miss adding the special casing, then it still looks better to me to print the type somewhat incompletely than to error out and make it harder for users to debug whatever they need.
So I think that the best thing to do here is to just remove all those explicit "do nothing" cases, along with the error default case. After doing that, I decided to write a testcase that iterates over all supported languages doing "ptype INTERNAL_FUNC". That revealed that Pascal has a similar problem, except the default case hits a gdb_assert instead of an error: (gdb) with language pascal -- ptype $_isvoid type = ../../src/gdb/p-typeprint.c:268: internal-error: type_print_varspec_prefix: unexpected type A problem internal to GDB has been detected, further debugging may prove unreliable. That is fixed by this patch in the same way. You'll notice that the new testcase special-cases the Ada expected output: } elseif {$lang == "ada"} { gdb_test "ptype \$_isvoid" "<<internal function>>" } else { gdb_test "ptype \$_isvoid" "<internal function>" } That will be subject of the following patch. Approved-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I81aec03523cceb338b5180a0b4c2e4ad26b4c4db Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30105	2023-02-15 20:56:52 +00:00
Tom de Vries	5bed9dc992	[gdb/testsuite] Add xfail in gdb.python/py-record-btrace.exp There's a HW bug affecting Processor Trace on some Intel processors (Ice Lake to Raptor Lake microarchitectures). The bug was exposed by linux kernel commit 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode"), added in version v5.5.0, and was worked around by commit ce0d998be927 ("perf/x86/intel/pt: Fix sampling using single range output") in version 6.1.0. The bug manifests (on a Performance-core of an i7-1250U, an Alder Lake cpu) in a single test-case: ... (gdb) python insn = r.instruction_history^M warning: Decode error (-20) at instruction 33 (offset = 0x3d6a, \ pc = 0x400501): compressed return without call.^M (gdb) FAIL: gdb.python/py-record-btrace.exp: prepare record: \ python insn = r.instruction_history ... Add a corresponding XFAIL. Note that the i7-1250U has both Performance-cores and Efficient-cores, and on an Efficient-Core the test-case runs without any problems, so if the testsuite run is not pinned to a specific cpu, the test may either PASS or XFAIL. Tested on x86_64-linux: - openSUSE Leap 15.4 with linux kernel version 5.14.21 - openSUSE Tumbleweed with linux kernel version 6.1.8 PR testsuite/30075 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30075	2023-02-14 13:15:49 +01:00
Tom de Vries	37d75d4552	[gdb/testsuite] Factor out proc linux_kernel_version Factor out new proc linux_kernel_version from test-case gdb.arch/i386-pkru.exp. Tested on x86_64-linux.	2023-02-14 11:53:54 +01:00
Tom Tromey	382d927ffc	Rename all fields of struct value This renames all the fields of struct value, in preparation for the coming changes. Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-02-13 15:21:06 -07:00
Andrew Burgess	9ae4519da9	gdb/python: deallocate tui window factories at Python shut down The previous commit relied on spotting when a Python defined TUI window factory was deleted. I spotted that the window factories are not deleted when GDB shuts down its Python environment, they are only deleted when one window factory replaces another. Consider this example Python script: class TestWindowFactory: def __init__(self, msg): self.msg = msg print("Entering TestWindowFactory.__init__: %s" % self.msg) def __call__(self, tui_win): print("Entering TestWindowFactory.__call__: %s" % self.msg) return TestWindow(tui_win, self.msg) def __del__(self): print("Entering TestWindowFactory.__del__: %s" % self.msg) gdb.register_window_type("test_window", TestWindowFactory("A")) gdb.register_window_type("test_window", TestWindowFactory("B")) And this GDB session: (gdb) source tui.py Entering TestWindowFactory.__init__: A Entering TestWindowFactory.__init__: B Entering TestWindowFactory.__del__: A (gdb) quit Notice that when the 'B' window replaces the 'A' window we see the 'A' object being deleted. But, when Python is shut down (after the 'quit') the 'B' object is never deleted. Instead, GDB retains a reference to the window factory object, which forces the Python object to remain live even after the Python interpreter itself has been shut down. The references themselves are held in a dynamically allocated std::unordered_map (in tui/tui-layout.c) which is never deallocated, thus the underlying Python references are never decremented to zero, and so GDB never tries to delete these Python objects. This commit is the first half of the work to clean up this edge case. All gdbpy_tui_window_maker objects (the objects that implement the TUI window factory callback for Python defined TUI windows), are now linked together into a global list using the intrusive list mechanism.
When GDB shuts down the Python interpreter we can now walk this global list and release the reference that is held to the underlying Python object. By releasing this reference the Python object will now be deleted. I've added a new assert in gdbpy_tui_window_maker::operator(), this will catch the case where we somehow end up in here after having reset the reference to the underlying Python object. I don't think this should ever happen though as we only clear the references when shutting down the Python interpreter, and the ::operator() function is only called when trying to apply a new TUI layout - something that shouldn't happen while GDB itself is shutting down. This commit does not update the std::unordered_map in tui-layout.c, that will be done in the next commit. Reviewed-By: Tom Tromey <tom@tromey.com>	2023-02-13 14:50:46 +00:00
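The clean-up scheme described in this commit (track every window maker in a global list, release the held reference at shutdown, assert on any later call) can be sketched as follows. This is an illustrative Python model only: `WindowMaker`, `all_makers` and `release_all` are hypothetical names, and a plain list stands in for GDB's intrusive list.

```python
class WindowMaker:
    """Hypothetical stand-in for gdbpy_tui_window_maker."""

    # Global list of every live maker (stand-in for the intrusive list).
    all_makers = []

    def __init__(self, factory):
        self.factory = factory           # reference to the Python factory
        WindowMaker.all_makers.append(self)

    def __call__(self, tui_win):
        # Mirrors the new assert: calling a maker after its reference
        # has been released is a bug.
        assert self.factory is not None
        return self.factory(tui_win)

    @classmethod
    def release_all(cls):
        # Called at interpreter shutdown: drop every held reference so
        # the underlying factory objects can be deleted.
        for maker in cls.all_makers:
            maker.factory = None
```

Once `release_all` has run, nothing in the list keeps a factory alive, which is the property the commit needs so that the `__del__` method of the last registered factory is reached.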
Andrew Burgess	d159d87072	gdb/python: allow Python TUI windows to be replaced The documentation for gdb.register_window_type says: "... It's an error to try to replace one of the built-in windows, but other window types can be replaced. ..." I take this to mean that if I imported a Python script like this: gdb.register_window_type('my_window', FactoryFunction) Then GDB would have a new TUI window 'my_window', which could be created by calling FactoryFunction(). If I then, in the same GDB session imported a script which included: gdb.register_window_type('my_window', UpdatedFactoryFunction) Then GDB would replace the old 'my_window' factory with my new one, GDB would now call UpdatedFactoryFunction(). This is pretty useful in practice, as it allows users to iterate on their window implementation within a single GDB session. However, right now, this is not how GDB operates. The second call to register_window_type is basically ignored and the old window factory is retained. This is because in tui_register_window (tui/tui-layout.c) we use std::unordered_map::emplace to insert the new factory function, and emplace doesn't replace an existing element in an unordered_map. In this commit, before the emplace call, I now search for an already existing element, and delete any matching element from the map, the emplace call will then add the new factory function. Reviewed-By: Tom Tromey <tom@tromey.com>	2023-02-13 14:50:37 +00:00
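The emplace behaviour behind this bug has a close Python analogue. In the sketch below, `dict.setdefault` stands in for `std::unordered_map::emplace` (both leave an existing entry untouched); the function names are hypothetical and this is not the code in tui/tui-layout.c.

```python
def register_without_replace(factories, name, factory):
    # Like emplace: if NAME is already present, the old factory stays.
    factories.setdefault(name, factory)


def register_with_replace(factories, name, factory):
    # The approach described in the commit: delete any matching element
    # first, then insert, so the new factory always wins.
    factories.pop(name, None)
    factories.setdefault(name, factory)


if __name__ == "__main__":
    table = {}
    register_without_replace(table, "my_window", "FactoryFunction")
    register_without_replace(table, "my_window", "UpdatedFactoryFunction")
    print(table["my_window"])   # the old factory is retained

    register_with_replace(table, "my_window", "UpdatedFactoryFunction")
    print(table["my_window"])   # the new factory is now used
```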
Andrew Burgess	97c1951915	gdb/testsuite: handle differences in guile error string output A new guile test added in commit: commit 0a9ccb9dd79384f3ba3f8cd75940e8868f3b526f Date: Mon Feb 6 13:04:16 2023 +0000 gdb: only allow one of thread or task on breakpoints or watchpoints fails for some versions of guile. It turns out that some versions of guile emit an error like this: (gdb) guile (set-breakpoint-thread! bp 1) ERROR: In procedure set-breakpoint-thread!: In procedure gdbscm_set_breakpoint_thread_x: cannot set both task and thread attributes Error while executing Scheme code. while other versions of guile emit the error like this: (gdb) guile (set-breakpoint-thread! bp 1) ERROR: In procedure set-breakpoint-thread!: ERROR: In procedure gdbscm_set_breakpoint_thread_x: cannot set both task and thread attributes Error while executing Scheme code. notice the extra 'ERROR: ' on the second line of output. This commit updates the test regexp to handle this optional 'ERROR: ' string.	2023-02-13 11:19:57 +00:00
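A pattern that accepts both guile outputs only needs the second 'ERROR: ' to be optional. The following Python sketch illustrates the idea; the actual test uses a Tcl regexp, so the pattern text, the variable names, and the assumed line breaks in the sample outputs are assumptions.

```python
import re

# The second line may or may not start with "ERROR: ".
PATTERN = re.compile(
    r"ERROR: In procedure set-breakpoint-thread!:\s+"
    r"(?:ERROR: )?In procedure gdbscm_set_breakpoint_thread_x:\s+"
    r"cannot set both task and thread attributes\s+"
    r"Error while executing Scheme code\."
)

WITHOUT_PREFIX = (
    "ERROR: In procedure set-breakpoint-thread!:\n"
    "In procedure gdbscm_set_breakpoint_thread_x: "
    "cannot set both task and thread attributes\n"
    "Error while executing Scheme code."
)

WITH_PREFIX = (
    "ERROR: In procedure set-breakpoint-thread!:\n"
    "ERROR: In procedure gdbscm_set_breakpoint_thread_x: "
    "cannot set both task and thread attributes\n"
    "Error while executing Scheme code."
)

if __name__ == "__main__":
    for output in (WITHOUT_PREFIX, WITH_PREFIX):
        print(bool(PATTERN.search(output)))
```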
Lancelot SIX	f9767e607d	gdb/testsuite: look for hipcc in env(ROCM_PATH) If the hipcc compiler cannot be found in dejagnu's tool_root_dir, look for it in $::env(ROCM_PATH) (if set). If hipcc is still not found, fallback to "hipcc" so the compiler will be searched in the PATH. This removes the fallback to the hard-coded "/opt/rocm/bin" prefix. This change is done so ROCM tools are searched in a uniform manner. Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-02-13 09:42:14 +00:00
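The lookup order described in this commit (tool_root_dir, then ROCM_PATH, then plain "hipcc" resolved through PATH) can be sketched as below. This is a Python illustration, not the Tcl testsuite code: `find_hipcc` and its `exists` parameter are hypothetical, and the exact file locations probed (`<tool_root_dir>/hipcc` and `<ROCM_PATH>/bin/hipcc`) are assumptions.

```python
import os


def find_hipcc(tool_root_dir, env, exists=os.path.isfile):
    """Return the hipcc to use, following the order in the commit."""
    # 1. Look in dejagnu's tool_root_dir (assumed location).
    if tool_root_dir:
        candidate = os.path.join(tool_root_dir, "hipcc")
        if exists(candidate):
            return candidate

    # 2. Look under ROCM_PATH, but only if it is set (assumed layout).
    rocm_path = env.get("ROCM_PATH")
    if rocm_path:
        candidate = os.path.join(rocm_path, "bin", "hipcc")
        if exists(candidate):
            return candidate

    # 3. Fall back to the bare name so the compiler is searched in PATH.
    return "hipcc"
```

Passing `exists` as a parameter keeps the sketch testable without touching the file system; a real caller would rely on the default.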
Lancelot SIX	39f6d7c6b0	gdb/testsuite: allow_hipcc_tests tests the hipcc compiler Update allow_hipcc_tests so all gdb.rocm tests are skipped if we do not have a working hipcc compiler available. To achieve this, adjust gdb_simple_compile to ensure that the hip program is saved in a ".cpp" file before calling hipcc otherwise compilation will fail. One thing to note is that it is possible to have a hipcc installed with a CUDA backend. Compiling with this back-end will successfully result in an application, but GDB cannot debug it (at least for the offload part). In the context of the gdb.rocm tests, we want to detect such a situation where gdb_simple_compile would give a false positive. To achieve this, this patch checks that there is at least one AMDGPU device available and that hipcc can compile for this or those targets. Detecting the device is done using the rocm_agent_enumerator tool which is installed with all ROCm installations (it is used by hipcc to identify targets if this is not specified on the command line). This patch also makes the allow_hipcc_tests proc a cached proc. Co-Authored-By: Pedro Alves <pedro@palves.net> Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-02-13 09:42:14 +00:00
Lancelot SIX	310943c20c	gdb/testsuite: require amd-dbgapi support to run rocm tests Update allow_hipcc_tests to check that GDB has the amd-dbgapi support built-in. Without this support, all tests using hipcc and the rocm stack will fail. Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-02-13 09:42:13 +00:00
Lancelot SIX	09ad7eb8cc	gdb/testsuite: Rename skip_hipcc_tests to allow_hipcc_tests Rename skip_hipcc_tests to allow_hipcc_tests so it can be used as a "require" predicate in tests. Use require in gdb.rocm/simple.exp. Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-02-13 09:42:13 +00:00
Andrew Burgess	f0bdf68d3f	gdb/c++: fix handling of breakpoints on @plt symbols This commit should fix PR gdb/20091, PR gdb/17201, and PR gdb/17071. Additionally, PR gdb/17199 relates to this area of code, but is more of a request to refactor some parts of GDB, this commit does not address that request, but it is probably worth reading that PR when looking at this commit. When the current language is C++, and the user places a breakpoint on a function in a shared library, GDB will currently find two locations for the breakpoint, one location will be within the function itself as we would expect, but the other location will be within the PLT table for the call to the named function. Consider this session: $ gdb -q /tmp/breakpoint-shlib-func Reading symbols from /tmp/breakpoint-shlib-func... (gdb) start Temporary breakpoint 1 at 0x40112e: file /tmp/breakpoint-shlib-func.cc, line 20. Starting program: /tmp/breakpoint-shlib-func Temporary breakpoint 1, main () at /tmp/breakpoint-shlib-func.cc:20 20 int answer = foo (); (gdb) break foo Breakpoint 2 at 0x401030 (2 locations) (gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y <MULTIPLE> 2.1 y 0x0000000000401030 <foo()@plt> 2.2 y 0x00007ffff7fc50fd in foo() at /tmp/breakpoint-shlib-func-lib.cc:20 This is not the expected behaviour. If we compile the same test using a C compiler then we see this: (gdb) break foo Breakpoint 2 at 0x7ffff7fc50fd: file /tmp/breakpoint-shlib-func-c-lib.c, line 20. (gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y 0x00007ffff7fc50fd in foo at /tmp/breakpoint-shlib-func-c-lib.c:20 Here's what's happening. 
When GDB parses the symbols in the main executable and the shared library we see a number of different symbols for foo, and use these to create entries in GDB's msymbol table: - In the main executable we see a symbol 'foo@plt' that points at the plt entry for foo, from this we add two entries into GDB's msymbol table, one called 'foo@plt' which points at the plt entry and has type mst_text, then we create a second symbol, this time called 'foo' with type mst_solib_trampoline which also points at the plt entry, - Then, when the shared library is loaded we see another symbol called 'foo', this one points at the actual implementation in the shared library. This time GDB creates a msymbol called 'foo' with type mst_text that points at the implementation. This means that GDB creates 3 msymbols to represent the 2 symbols found in the executable and shared library. When the user creates a breakpoint on 'foo' GDB eventually ends up in search_minsyms_for_name (linespec.c), this function then calls iterate_over_minimal_symbols passing in the name we are looking for wrapped in a lookup_name_info object. In iterate_over_minimal_symbols we iterate over two hash tables (using the name we're looking for as the hash key), first we walk the hash table of symbol linkage names, then we walk the hash table of demangled symbol names. When the language is C++ the symbols for 'foo' will all have been mangled, as a result, in this case, the iteration of the linkage name hash table will find no matching results. However, when we walk the demangled hash table we do find some results. In order to match symbol names, GDB obtains a symbol name matching function by calling the get_symbol_name_matcher method on the language_defn class. For C++, in this case, the matching function we use is cp_fq_symbol_name_matches, which delegates the work to strncmp_iw_with_mode with mode strncmp_iw_mode::MATCH_PARAMS and language set to language_cplus. 
The strncmp_iw_mode::MATCH_PARAMS mode means that strncmp_iw_with_mode will skip any parameters in the demangled symbol name when checking for a match, e.g. 'foo' will match the demangled name 'foo()'. The way this is done is that the strings are matched character by character, but, once the string we are looking for ('foo' here) is exhausted, if we are looking at '(' then we consider the match a success. Let's consider the 3 symbols GDB created. If the function declaration is 'void foo ()' then from the main executable we added symbols '_Z3foov@plt' and '_Z3foov', while from the shared library we added another symbol called '_Z3foov'. When these are demangled they become 'foo()@plt', 'foo()', and 'foo()' respectively. Now, the '_Z3foov' symbol from the main executable has the type mst_solib_trampoline, and in search_minsyms_for_name, we search for any symbols of type mst_solib_trampoline and filter these out of the results. However, the '_Z3foov@plt' symbol (from the main executable), and the '_Z3foov' symbol (from the shared library) both have type mst_text. During the demangled name matching, due to the use of MATCH_PARAMS mode, we stop the comparison as soon as we hit a '(' in the demangled name. And so, '_Z3foov@plt', which demangles to 'foo()@plt' matches 'foo', and '_Z3foov', which demangles to 'foo()' also matches 'foo'. By contrast, for C, there are no demangled hash table entries to be iterated over (in iterate_over_minimal_symbols), we only consider the linkage name symbols which are 'foo@plt' and 'foo'. The plain 'foo' symbol obviously matches when we are looking for 'foo', but in this case the 'foo@plt' will not match due to the '@plt' suffix. 
And so, when the user asks for a breakpoint in 'foo', and the language is C, search_minsyms_for_name, returns a single msymbol, the mst_text symbol for foo in the shared library, while, when the language is C++, we get two results, '_Z3foov' for the shared library function, and '_Z3foov@plt' for the plt entry in the main executable. I propose to fix this in strncmp_iw_with_mode. When the mode is MATCH_PARAMS, instead of stopping at a '(' and assuming the match is a success, GDB will instead search forward for the matching, closing, ')', effectively skipping the parameter list, and then resume matching. Thus, when comparing 'foo' to 'foo()@plt' GDB will effectively compare against 'foo@plt' (skipping the parameter list), and the match will fail, just as it does when the language is C. There is one slight complication, which is revealed by the test gdb.linespec/cpcompletion.exp, when searching for the symbol of a const member function, the demangled symbol will have 'const' at the end of its name, e.g.: struct_with_const_overload::const_overload_fn() const Previously, the matching would stop at the '(' character, but after my change the whole '()' is skipped, and the match resumes. As a result, the 'const' modifier results in a failure to match, when previously GDB would have found a match. To work around this issue, in strncmp_iw_with_mode, when mode is MATCH_PARAMS, after skipping the parameter list, if the next character is '@' then we assume we are looking at something like '@plt' and return a value indicating the match failed, otherwise, we return a value indicating the match succeeded, this allows things like 'const' to be skipped. With these changes in place I now see GDB correctly setting a breakpoint only at the implementation of 'foo' in the shared library. 
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20091 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17201 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17071 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17199 Tested-By: Bruno Larsen <blarsen@redhat.com> Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-02-12 06:19:53 +00:00
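The matching rule proposed in the commit above can be illustrated with a minimal sketch. This is not GDB's strncmp_iw_with_mode: the function name_matches_skipping_params is hypothetical, it ignores whitespace handling and the other matching modes, and it assumes the lookup name carries no parameter list of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of GDB): returns true when LOOKUP matches
// DEMANGLED under the MATCH_PARAMS rule described in the commit message.
bool
name_matches_skipping_params (const std::string &lookup,
                              const std::string &demangled)
{
  // The lookup name must be a prefix of the demangled name.
  if (demangled.compare (0, lookup.size (), lookup) != 0)
    return false;

  std::size_t pos = lookup.size ();
  if (pos == demangled.size ())
    return true;                /* Exact match, e.g. a C-style 'foo'.  */
  if (demangled[pos] != '(')
    return false;               /* e.g. 'foo' against 'foo@plt'.  */

  // Skip forward to the ')' that closes the parameter list, tracking
  // nesting so that parameters such as 'void (*)(int)' are handled.
  int depth = 0;
  for (; pos < demangled.size (); ++pos)
    {
      if (demangled[pos] == '(')
        ++depth;
      else if (demangled[pos] == ')' && --depth == 0)
        break;
    }
  if (pos == demangled.size ())
    return false;               /* Unbalanced parentheses.  */

  ++pos;                        /* Step past the closing ')'.  */

  // A trailing '@...' (such as '@plt') means the match fails.  Anything
  // else after the parameter list (such as ' const') is ignored.
  return pos == demangled.size () || demangled[pos] != '@';
}
```

With this sketch, 'foo' matches 'foo()' and 'struct_with_const_overload::const_overload_fn' matches its 'const'-qualified demangled name, while 'foo' no longer matches 'foo()@plt'.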
Andrew Burgess	0a9ccb9dd7	gdb: only allow one of thread or task on breakpoints or watchpoints After this mailing list posting: https://sourceware.org/pipermail/gdb-patches/2023-February/196607.html it seems to me that in practice an Ada task maps 1:1 with a GDB thread, and so it doesn't really make sense to allow users to give both a thread and a task within a single breakpoint or watchpoint condition. This commit updates GDB so that the user will get an error if both are specified. I've added new tests to cover the CLI as well as the Python and Guile APIs. For the Python and Guile testing, as far as I can tell, this was the first testing for this corner of the APIs, so I ended up adding more than just a single test. For documentation I've added a NEWS entry, but I've not added anything to the docs themselves. Currently we document the commands with a thread-id or task-id as distinct commands, e.g.: 'break LOCSPEC task TASKNO' 'break LOCSPEC task TASKNO if ...' 'break LOCSPEC thread THREAD-ID' 'break LOCSPEC thread THREAD-ID if ...' As such, I don't believe there is any indication that combining 'task' and 'thread' would be expected to work; it seems clear to me in the above that those four options are all distinct commands. I think the NEWS entry is enough that if someone is combining these keywords (it's not clear what the expected behaviour would be in this case) then they can figure out that this was a deliberate change in GDB, but for a new user, the manual doesn't suggest combining them is OK, and any future attempt to combine them will give an error. Approved-By: Pedro Alves <pedro@palves.net>	2023-02-12 05:46:44 +00:00
Andrew Burgess	f1f517e810	gdb: show task number in describe_other_breakpoints I noticed that describe_other_breakpoints doesn't show the task number, but does show the thread-id. I can't see any reason why we'd want to not show the task number in this situation, so this commit adds this missing information, and extends gdb.ada/tasks.exp to check this case. Approved-By: Pedro Alves <pedro@palves.net>	2023-02-11 17:36:24 +00:00
Andrew Burgess	ce068c5f45	gdb: don't print global thread-id to CLI in describe_other_breakpoints I noticed that describe_other_breakpoints was printing the global thread-id to the CLI. For CLI output we should be printing the inferior local thread-id (e.g. "2.1"). This can be seen in the following GDB session: (gdb) info threads Id Target Id Frame 1.1 Thread 4065742.4065742 "bp-thread-speci" main () at /tmp/bp-thread-specific.c:27 * 2.1 Thread 4065743.4065743 "bp-thread-speci" main () at /tmp/bp-thread-specific.c:27 (gdb) break foo thread 2.1 Breakpoint 3 at 0x40110a: foo. (2 locations) (gdb) break foo thread 1.1 Note: breakpoint 3 (thread 2) also set at pc 0x40110a. Note: breakpoint 3 (thread 2) also set at pc 0x40110a. Breakpoint 4 at 0x40110a: foo. (2 locations) Notice that GDB says: Note: breakpoint 3 (thread 2) also set at pc 0x40110a. The 'thread 2' in here is using the global thread-id, we should instead say 'thread 2.1' which corresponds to how the user specified the breakpoint. This commit fixes this issue and adds a test. Approved-By: Pedro Alves <pedro@palves.net>	2023-02-11 17:35:14 +00:00
Andrew Burgess	bb146a79c7	gdb: add test for readline handling very long commands The test added in this commit tests for a long-fixed readline issue relating to long command lines. A similar patch has existed in the Fedora GDB tree for several years, but I don't see any reason why this test would not be suitable for inclusion in upstream GDB. I've updated the patch to current testsuite standards. The test is checking for an issue that was fixed by this readline patch: https://lists.gnu.org/archive/html/bug-readline/2006-11/msg00002.html Which was merged into readline 6.0 (released ~2010). The issue was triggered when the user entered a long command line, which wrapped over multiple terminal lines. The crash looks like this: free(): invalid pointer Fatal signal: Aborted ----- Backtrace ----- 0x4fb583 gdb_internal_backtrace_1 ../../src/gdb/bt-utils.c:122 0x4fb583 _Z22gdb_internal_backtracev ../../src/gdb/bt-utils.c:168 0x6047b9 handle_fatal_signal ../../src/gdb/event-top.c:964 0x7f26e0cc56af ??? 0x7f26e0cc5625 ??? 0x7f26e0cae8d8 ??? 0x7f26e0d094be ??? 0x7f26e0d10aab ??? 0x7f26e0d124ab ??? 
0x7f26e1d32e12 rl_free_undo_list ../../readline-5.2/undo.c:119 0x7f26e1d229eb readline_internal_teardown ../../readline-5.2/readline.c:405 0x7f26e1d3425f rl_callback_read_char ../../readline-5.2/callback.c:197 0x604c0d gdb_rl_callback_read_char_wrapper_noexcept ../../src/gdb/event-top.c:192 0x60581d gdb_rl_callback_read_char_wrapper ../../src/gdb/event-top.c:225 0x60492f stdin_event_handler ../../src/gdb/event-top.c:545 0xa60015 gdb_wait_for_event ../../src/gdbsupport/event-loop.cc:694 0xa6078d gdb_wait_for_event ../../src/gdbsupport/event-loop.cc:593 0xa6078d _Z16gdb_do_one_eventi ../../src/gdbsupport/event-loop.cc:264 0x6fc459 start_event_loop ../../src/gdb/main.c:411 0x6fc459 captured_command_loop ../../src/gdb/main.c:471 0x6fdce4 captured_main ../../src/gdb/main.c:1310 0x6fdce4 _Z8gdb_mainP18captured_main_args ../../src/gdb/main.c:1325 0x44f694 main ../../src/gdb/gdb.c:32 --------------------- I recreated the above crash by a little light hacking on GDB, and then linking GDB against readline 5.2. The above stack trace was generated from the test included in this patch, and matches the trace that was included in the original bug report. It is worth acknowledging that without hacking things GDB has a minimum requirement of readline 7.0. This test is not about checking whether GDB has been built against an older version of readline, it is about checking that readline doesn't regress in this area. Reviewed-By: Tom Tromey <tom@tromey.com>	2023-02-11 17:17:56 +00:00
Andrew Burgess	a0c0791577	GDB: Introduce limited array lengths while printing values This commit introduces the idea of loading only part of an array in order to print it, what I call "limited length" arrays. The motivation behind this work is to make it possible to print slices of very large arrays, where very large means bigger than `max-value-size'. Consider this GDB session with the current GDB: (gdb) set max-value-size 100 (gdb) p large_1d_array value requires 400 bytes, which is more than max-value-size (gdb) p -elements 10 -- large_1d_array value requires 400 bytes, which is more than max-value-size notice that the request to print 10 elements still fails, even though 10 elements should be less than the max-value-size. With a patched version of GDB: (gdb) p -elements 10 -- large_1d_array $1 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9...} So now the print has succeeded. It also has loaded `max-value-size' worth of data into value history, so the recorded value can be accessed consistently: (gdb) p -elements 10 -- $1 $2 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9...} (gdb) p $1 $3 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, <unavailable> <repeats 75 times>} (gdb) Accesses with other languages work similarly, although for Ada only C-style [] array element/dimension accesses use history. For both Ada and Fortran () array element/dimension accesses go straight to the inferior, bypassing the value history just as with C pointers. Co-Authored-By: Maciej W. Rozycki <macro@embecosm.com>	2023-02-10 23:49:19 +00:00
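The element counts in the session above follow from simple arithmetic, sketched below. The struct and function names are hypothetical and are not GDB code; the sketch assumes 4-byte array elements (400 bytes for 100 elements, as in the session) and a non-zero element size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical summary of a "limited length" array value: how many
// leading elements were loaded and how many are left <unavailable>.
struct limited_array
{
  std::size_t available;
  std::size_t unavailable;
};

// Split TOTAL_ELEMENTS elements of ELEMENT_SIZE bytes each into the part
// that fits within MAX_VALUE_SIZE bytes and the remainder.
// Assumes ELEMENT_SIZE is non-zero.
limited_array
split_by_max_value_size (std::size_t total_elements,
                         std::size_t element_size,
                         std::size_t max_value_size)
{
  std::size_t fit = max_value_size / element_size;
  std::size_t available = std::min (total_elements, fit);
  return { available, total_elements - available };
}
```

For the session above, split_by_max_value_size (100, 4, 100) yields 25 available elements and 75 unavailable ones, matching the values 0 to 24 followed by '<unavailable> <repeats 75 times>'.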
Maciej W. Rozycki	a2fb245a4b	GDB/testsuite: Add `-nonl' option to `gdb_test' Add a `-nonl' option to `gdb_test' making it possible to match output from commands such as `output' that do not produce a new line sequence at the end, e.g.: (gdb) output 0 0(gdb)	2023-02-10 23:49:19 +00:00
Maciej W. Rozycki	aaab5fce4f	GDB: Only make data actually retrieved into value history available While it makes sense to allow accessing out-of-bounds elements in the debuggee and see whatever there might happen to be there in memory (we are a debugger and not a programming rules enforcement facility and we want to make people's life easier in chasing bugs), e.g.: (gdb) print one_hundred[-1] $1 = 0 (gdb) print one_hundred[100] $2 = 0 (gdb) we shouldn't really pretend that we have any meaningful data around values recorded in history (what these commands really retrieve are current debuggee memory contents outside the original data accessed, really confusing in my opinion). Mark values recorded in history as such then and verify accesses to be in-range for them: (gdb) print one_hundred[-1] $1 = <unavailable> (gdb) print one_hundred[100] $2 = <unavailable> Add a suitable test case, which also covers integer overflows in data location calculation. Approved-By: Tom Tromey <tom@tromey.com>	2023-02-10 23:49:19 +00:00
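An in-range check of the kind described above has to avoid integer overflow when computing the end of the accessed element. The following is a minimal sketch under stated assumptions: element_in_range is a hypothetical name, not a GDB function, and offsets and sizes are expressed in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check (not GDB code): is the LENGTH-byte element at byte
// OFFSET fully contained in a recorded value of TOTAL bytes?
bool
element_in_range (std::int64_t offset, std::uint64_t length,
                  std::uint64_t total)
{
  // A negative offset, as in one_hundred[-1], is always out of range.
  if (offset < 0)
    return false;

  std::uint64_t start = static_cast<std::uint64_t> (offset);
  if (start > total)
    return false;

  // Compare against the remaining space instead of computing
  // START + LENGTH, which could wrap around.
  return length <= total - start;
}
```

Treating one_hundred as a 100-byte value with 1-byte elements, offsets 0 to 99 are in range, while -1 and 100 are rejected, which corresponds to the <unavailable> results shown above.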
Maciej W. Rozycki	bae19789c0	GDB: Ignore `max-value-size' setting with value history accesses We have an inconsistency in value history accesses where array element accesses cause an error for entries exceeding the currently selected `max-value-size' setting even where such accesses successfully complete for elements located in the inferior, e.g.: (gdb) p/d one $1 = 0 (gdb) p/d one_hundred $2 = {0 <repeats 100 times>} (gdb) p/d one_hundred[99] $3 = 0 (gdb) set max-value-size 25 (gdb) p/d one_hundred value requires 100 bytes, which is more than max-value-size (gdb) p/d one_hundred[99] $7 = 0 (gdb) p/d $2 value requires 100 bytes, which is more than max-value-size (gdb) p/d $2[99] value requires 100 bytes, which is more than max-value-size (gdb) According to our documentation the `max-value-size' setting is a safety guard against allocating an overly large amount of memory. Moreover a statement in documentation says, concerning this setting, that: "Setting this variable does not affect values that have already been allocated within GDB, only future allocations." While in the implementer-speak the sentence may be unambiguous I think the outside user may well infer that the setting does not apply to values previously printed. Therefore rather than just fixing this inconsistency it seems reasonable to lift the setting for value history accesses, under an implication that by having been retrieved from the debuggee they have already passed the safety check. Do it then, by suppressing the value size check in `value_copy' -- under an observation that if the original value has been already loaded (i.e. it's not lazy), then it must have previously passed said check -- making the last two commands succeed: (gdb) p/d $2 $8 = {0 <repeats 100 times>} (gdb) p/d $2 [99] $9 = 0 (gdb) Expand the testsuite accordingly, covering both value history handling and the use of `value_copy' by `make_cv_value', used by Python code.	2023-02-10 23:49:19 +00:00
Simon Marchi	71bb560755	gdb/testsuite: fix gdb.gdb/selftest.exp for native-extended-gdbserver Following commit 4e2a80ba606 ("gdb/testsuite: expect SIGSEGV from top GDB spawn id"), the next failure I get in gdb.gdb/selftest.exp, using the native-extended-gdbserver, is: (gdb) PASS: gdb.gdb/selftest.exp: send ^C to child process signal SIGINT Continuing with signal SIGINT. FAIL: gdb.gdb/selftest.exp: send SIGINT signal to child process (timeout) The problem is that in this gdb_test_multiple: set description "send SIGINT signal to child process" gdb_test_multiple "signal SIGINT" "$description" { -re "^signal SIGINT\r\nContinuing with signal SIGINT.\r\nQuit\r\n.* $" { pass "$description" } } The "Continuing with signal SIGINT" portion is printed by the top GDB, while the Quit portion is printed by the bottom GDB. As the gdb_test_multiple is written, it expects both from the top GDB's spawn id. Fix this by splitting the gdb_test_multiple in two. The first one expects the "Continuing with signal SIGINT" from the top GDB. The second one expects "Quit" and the "(xgdb)" prompt from $inferior_spawn_id. When debugging natively, this spawn id will be the same as the top GDB's spawn id, but it's different when debugging with GDBserver. Change-Id: I689bd369a041b48f4dc9858d38bf977d09600da2	2023-02-10 13:55:45 -05:00
Tom de Vries	632652850d	[gdb/testsuite] Fix linespec ambiguity in gdb.base/longjmp.exp PR testsuite/30103 reports the following failure on aarch64-linux (ubuntu 22.04): ... (gdb) PASS: gdb.base/longjmp.exp: with_probes=0: pattern 1: next to longjmp next warning: Breakpoint address adjusted from 0x83dc305fef755015 to \ 0xffdc305fef755015. Warning: Cannot insert breakpoint 0. Cannot access memory at address 0xffdc305fef755015 __libc_siglongjmp (env=0xaaaaaaab1018 <env>, val=1) at ./setjmp/longjmp.c:30 30 } (gdb) KFAIL: gdb.base/longjmp.exp: with_probes=0: pattern 1: gdb/26967 \ (PRMS: next over longjmp) delete breakpoints Delete all breakpoints? (y or n) y (gdb) info breakpoints No breakpoints or watchpoints. (gdb) break 63 No line 63 in the current file. Make breakpoint pending on future shared library load? (y or [n]) n (gdb) FAIL: gdb.base/longjmp.exp: with_probes=0: pattern 2: setup: breakpoint \ at pattern start (got interactive prompt) ... The test-case intends to set the breakpoint on line number 63 in gdb.base/longjmp.c. It tries to do so by specifying "break 63", which specifies a line in the "current source file". Due to the KFAIL PR, gdb stopped in __libc_siglongjmp, and because of presence of debug info, the "current source file" becomes glibc's ./setjmp/longjmp.c. Consequently, setting the breakpoint fails. Fix this by adding a $subdir/$srcfile: prefix to the breakpoint linespecs. I've managed to reproduce the FAIL on x86_64/-m32, by installing the glibc-32bit-debuginfo package. This allowed me to confirm the "current source file" that is used: ... (gdb) KFAIL: gdb.base/longjmp.exp: with_probes=0: pattern 1: gdb/26967 \ (PRMS: next over longjmp) info source^M Current source file is ../setjmp/longjmp.c^M ... Tested on x86_64-linux, target boards unix/{-m64,-m32}. Reported-By: Luis Machado <luis.machado@arm.com> Reviewed-By: Tom Tromey <tom@tromey.com> PR testsuite/30103 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30103	2023-02-10 15:58:00 +01:00
Tom Tromey	8e77fff268	Fix comment in gdb.rust/fnfield.exp gdb.rust/fnfield.exp has a comment that, I assume, I copied from some other test. This patch fixes it.	2023-02-09 12:23:08 -07:00
Christina Schimpe	31cf28c784	gdb, testsuite: Remove unnecessary call of "set print pretty on" The command has no effect for the loading of GDB pretty printers and is removed by this patch to avoid confusion. Documentation for "set print pretty" "Cause GDB to print structures in an indented format with one member per line"	2023-02-09 19:38:52 +01:00
Tom Tromey	1775f8b380	Increase size of main_type::nfields main_type::nfields is a 'short', and has been for many years. PR c++/29985 points out that 'short' is too narrow for an enum that contains more than 2^15 constants. This patch bumps the size of 'nfields'. To verify that the field isn't directly used, it is also renamed. Note that this does not affect the size of main_type on x86-64 Fedora 36. And, if it does have a negative effect somewhere, it's worth considering that types could be shrunk more drastically by using subclasses for the different codes. This is v2 of this patch, which has these changes: * I changed nfields to 'unsigned', per Simon's request. I looked at changing all the uses, but this quickly fans out into a very large patch. (One additional tweak was needed, though.) * I wrote a test case. I discovered that GCC cannot compile a large enough C test case, so I resorted to using the DWARF assembler. This test doesn't reproduce the crash, but it does fail without the patch. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29985	2023-02-09 07:55:34 -07:00
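The width problem described above can be checked with a small sketch. count_fits is a hypothetical helper, not GDB code, and std::int16_t stands in for a 16-bit 'short', which is an assumption about the platform.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: can a field count of COUNT be stored in an
// integer of type T without loss?
template <typename T>
bool
count_fits (std::uint64_t count)
{
  return count <= static_cast<std::uint64_t> (std::numeric_limits<T>::max ());
}
```

Under that assumption, count_fits<std::int16_t> (32767) is true but count_fits<std::int16_t> (32768) is false, whereas an 'unsigned' field, which is at least 16 bits wide, can hold 32768.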
Tom Tromey	cdeb7b7de2	Avoid FAILs in gdb.compile Many gdb.compile C++ tests fail for me on Fedora 36. I think these are largely bugs in the plugin, though I didn't investigate too deeply. Once one failure is seen, this often cascades and sometimes there are many timeouts. For example, this can happen: (gdb) compile code var = a->get_var () warning: Could not find symbol "_ZZ9_gdb_exprP10__gdb_regsE1a" for compiled module "/tmp/gdbobj-0xdI6U/out2.o". 1 symbols were missing, cannot continue. I think this is probably a plugin bug because, IIRC, in theory these symbols should be exempt from a lookup via gdb. This patch arranges to catch any catastrophic failure and then simply exit the entire .exp file.	2023-02-08 10:12:22 -07:00
Tom Tromey	300fa060ab	Don't let .gdb_history file cause failures I had a .gdb_history file in my testsuite directory in the build tree, and this provoked a failure in gdbhistsize-history.exp. It seems simple to prevent this file from causing a failure.	2023-02-08 10:12:22 -07:00
Tom de Vries	4e315cd4af	[gdb/testsuite] Use maint ignore-probes in gdb.base/longjmp.exp Test-case gdb.base/longjmp.exp handles both the case that there is a libc longjmp probe, and the case that there isn't. However, it only tests one of the two cases. Use maint ignore-probes to test both cases, if possible. Tested on x86_64-linux.	2023-02-08 13:46:17 +01:00
Tom de Vries	0ab9328277	[gdb/testsuite] Use maint ignore-probes in gdb.base/solib-corrupted.exp Test-case gdb.base/solib-corrupted.exp only works for a glibc without probes interface, otherwise we run into: ... XFAIL: gdb.base/solib-corrupted.exp: info probes UNTESTED: gdb.base/solib-corrupted.exp: GDB is using probes ... Fix this by using maint ignore-probes to simulate the absence of the relevant probes. Also, it requires glibc debuginfo, and if not present, it produces an XFAIL: ... XFAIL: gdb.base/solib-corrupted.exp: make solibs looping UNTESTED: gdb.base/solib-corrupted.exp: no _r_debug symbol has been found ... This is incorrect, because an XFAIL indicates a known problem in the environment. In this case, there is no problem: the environment is functioning as expected when glibc debuginfo is not installed. Fix this by using UNSUPPORTED instead, and make the message less cryptic: ... UNSUPPORTED: gdb.base/solib-corrupted.exp: make solibs looping \ (glibc debuginfo required) ... Finally, with glibc debuginfo present, we run into: ... (gdb) PASS: gdb.base/solib-corrupted.exp: make solibs looping info sharedlibrary^M warning: Corrupted shared library list: 0x7ffff7ffe750 != 0x0^M From To Syms Read Shared Object Library^M 0x00007ffff7dd4170 0x00007ffff7df4090 Yes /lib64/ld-linux-x86-64.so.2^M (gdb) FAIL: gdb.base/solib-corrupted.exp: corrupted list \ (shared library list corrupted) ... due to commit 44288716537 ("gdb, testsuite: extend gdb_test_multiple checks"). Fix this by rewriting into gdb_test_multiple and using -early. Tested on x86_64-linux, with and without glibc debuginfo installed.	2023-02-08 11:48:53 +01:00
Andrew Burgess	944b1b1817	gdb: fix display of thread condition for multi-location breakpoints This commit addresses the issue in PR gdb/30087. If a breakpoint with multiple locations has a thread condition, then the 'info breakpoints' output is a little messed up, here's an example of the current output: (gdb) break foo thread 1 Breakpoint 2 at 0x401114: foo. (3 locations) (gdb) break bar thread 1 Breakpoint 3 at 0x40110a: file /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c, line 32. (gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y <MULTIPLE> thread 1 stop only in thread 1 2.1 y 0x0000000000401114 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 2.2 y 0x0000000000401146 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 2.3 y 0x0000000000401168 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 3 breakpoint keep y 0x000000000040110a in bar at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:32 thread 1 stop only in thread 1 Notice that, at the end of the location for breakpoint 3, the 'thread 1' condition is printed, but this is then repeated on the next line with 'stop only in thread 1'. In contrast, for breakpoint 2, the 'thread 1' appears randomly, in the "What" column, though slightly offset, none of the separate locations have the 'thread 1' information. Additionally for breakpoint 2 we also get a 'stop only in thread 1' line. There are two things going on here. First the randomly placed 'thread 1' for breakpoint 2 is due to a bug in print_one_breakpoint_location, where we check the variable part_of_multiple instead of header_of_multiple. If I fix this oversight, then the output is now: (gdb) break foo thread 1 Breakpoint 2 at 0x401114: foo. (3 locations) (gdb) break bar thread 1 Breakpoint 3 at 0x40110a: file /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c, line 32. 
(gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y <MULTIPLE> stop only in thread 1 2.1 y 0x0000000000401114 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 thread 1 2.2 y 0x0000000000401146 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 thread 1 2.3 y 0x0000000000401168 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 thread 1 3 breakpoint keep y 0x000000000040110a in bar at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:32 thread 1 stop only in thread 1 The 'thread 1' condition is now displayed at the end of each location, which makes the output the same for single location breakpoints and multi-location breakpoints. However, there's still some duplication here. Both breakpoints 2 and 3 include a 'stop only in thread 1' line, and it feels like the additional 'thread 1' is redundant. In fact, there's a comment to this very effect in the code: /* FIXME: This seems to be redundant and lost here; see the "stop only in" line a little further down. */ So, let's fix this FIXME. The new plan is to remove all the trailing 'thread 1' markers from the CLI output, we now get this: (gdb) break foo thread 1 Breakpoint 2 at 0x401114: foo. (3 locations) (gdb) break bar thread 1 Breakpoint 3 at 0x40110a: file /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c, line 32. 
(gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y <MULTIPLE> stop only in thread 1 2.1 y 0x0000000000401114 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 2.2 y 0x0000000000401146 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 2.3 y 0x0000000000401168 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 3 breakpoint keep y 0x000000000040110a in bar at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:32 stop only in thread 1 All of the above points are also true for the Ada 'task' breakpoint condition, and the changes I've made also update how the task information is printed, though in the case of the Ada task there was no 'stop only in task XXX' line printed, so I've added one of those. Obviously it can't be quite that easy. For MI backwards compatibility I've retained the existing code (but now only for MI like outputs), which ensures we should generate backwards compatible output. I've extended an Ada test to cover the new task related output, and updated all the tests I could find that checked for the old output. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30087 Approved-By: Pedro Alves <pedro@palves.net>	2023-02-07 14:41:40 +00:00
Tom de Vries	ca2f51c696	[gdb/testsuite] Improve untested message in gdb.ada/finish-var-size.exp I came across: ... UNTESTED: gdb.ada/finish-var-size.exp: GCC too told for this test ... The message only tells us that the compiler version is too old, not what compiler version is required. Fix this by rewriting using require: ... UNSUPPORTED: gdb.ada/finish-var-size.exp: require failed: \ expr [gcc_major_version] >= 12 ... Tested on x86_64-linux.	2023-02-07 11:41:44 +01:00
Tom de Vries	9af467b824	[gdb/testsuite] Fix gdb.threads/schedlock.exp on fast cpu Occasionally, I run into: ... (gdb) PASS: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \ set scheduler-locking on continue^M Continuing.^M PASS: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \ continue (with lock) [Thread 0x7ffff746e700 (LWP 1339) exited]^M No unwaited-for children left.^M (gdb) Quit^M (gdb) FAIL: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \ stop all threads (with lock) (timeout) ... What happens is that this loop which is supposed to run "just short of forever": ... /* Don't run forever. Run just short of it :) */ while (*myp > 0) { /* schedlock.exp: main loop. */ MAYBE_CALL_SOME_FUNCTION(); (*myp) ++; } ... finishes after 0x7fffffff iterations (when a signed wrap occurs), which on my system takes only about 1.5 seconds. Fix this by: - changing the pointed-at type of myp from signed to unsigned, which makes the wrap defined behaviour (and which also makes the loop run twice as long, which is already enough to make it impossible for me to reproduce the FAIL. But let's try to solve this more structurally). - changing the pointed-at type of myp from int to long long, making the wrap unlikely. - making sure the loop runs forever, by setting the loop condition to 1. - making sure the loop still contains different lines (as far as debug info is concerned) by incrementing a volatile counter in the loop. - making sure the program doesn't run forever in case of trouble, by adding an "alarm (30)". Tested on x86_64-linux. PR testsuite/30074 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30074	2023-02-06 12:52:50 +01:00
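The iteration counts mentioned above can be derived without running the loop. The sketch below is illustrative only: increments_until_wrap is a hypothetical helper, the counter is assumed to start at 1, and std::int32_t and std::uint32_t stand in for 32-bit 'int' and 'unsigned int'.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: number of times a loop of the form
// 'while (*myp > 0) (*myp)++;' increments a counter of type T that
// starts at START (assumed positive) before the value leaves the
// range of T.  The final increment is the one that wraps.
template <typename T>
std::uint64_t
increments_until_wrap (T start)
{
  return static_cast<std::uint64_t> (std::numeric_limits<T>::max ())
         - static_cast<std::uint64_t> (start) + 1;
}
```

For a 32-bit signed counter starting at 1 this gives 0x7fffffff iterations, and for the unsigned variant 0xffffffff, roughly twice as many. Only the unsigned wrap is defined behaviour; the signed case relies on what the compiler happens to do.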
Andrew Burgess	980dbf3622	gdb: error if 'thread' or 'task' keywords are overused When creating a breakpoint or watchpoint, the 'thread' and 'task' keywords can be used to create a thread or task specific breakpoint or watchpoint. Currently, a thread or task specific breakpoint can only apply for a single thread or task, if multiple threads or tasks are specified when creating the breakpoint (or watchpoint), then the last specified id will be used. The exception to the above is that when the 'thread' keyword is used during the creation of a watchpoint, GDB will give an error if 'thread' is given more than once. In this commit I propose making this behaviour consistent, if the 'thread' or 'task' keywords are used more than once when creating either a breakpoint or watchpoint, then GDB will give an error. I haven't updated the manual, we don't explicitly say that these keywords can be repeated, and (to me), given the keyword takes a single id, I don't think it makes much sense to repeat the keyword. As such, I see this more as adding a missing error to GDB, rather than making some big change. However, I have added an entry to the NEWS file as I guess it is possible that some people might hit this new error with an existing (I claim, badly written) GDB script. I've added some new tests to check for the new error. Just one test needed updating, gdb.linespec/keywords.exp, this test did use the 'thread' keyword twice, and expected the breakpoint to be created. Looking at what this test was for though, it was checking the use of '-force-condition', and I don't think that being able to repeat 'thread' was actually a critical part of this test. As such, I've updated this test to expect the error when 'thread' is repeated.	2023-02-06 11:02:48 +00:00
Andrew Burgess	79436bfc5a	gdb/testsuite: don't try to set non-stop mode on a running target The test gdb.threads/thread-specific-bp.exp tries to set non-stop mode on a running target, something which the manual makes clear is not allowed. This commit restructures the test a little, we now set the non-stop mode as part of the GDBFLAGS, so the mode will be set before GDB connects to the target. As a consequence I'm able to move the with_test_prefix out of the check_thread_specific_breakpoint proc. The check_thread_specific_breakpoint proc is now called within a loop. After this commit the gdb.threads/thread-specific-bp.exp test still has some failures, this is because of an issue GDB currently has printing "Thread ... exited" messages. This problem should be addressed by this patch: https://sourceware.org/pipermail/gdb-patches/2022-December/194694.html when it is merged.	2023-02-04 16:15:38 +00:00
Simon Marchi	18b4d0736b	gdb: initial support for ROCm platform (AMDGPU) debugging This patch adds the foundation for GDB to be able to debug programs offloaded to AMD GPUs using the AMD ROCm platform [1]. The latest public release of ROCm at the time of writing is 5.4, so this is what this patch targets. The ROCm platform allows host programs to schedule bits of code for execution on GPUs or similar accelerators. The programs running on GPUs are typically referred to as `kernels` (not related to operating system kernels). Programs offloaded with the AMD ROCm platform can be written in the HIP language [2], OpenCL and OpenMP, but we're going to focus on HIP here. The HIP language consists of a C++ Runtime API and kernel language. Here's an example of a very simple HIP program:

    #include "hip/hip_runtime.h"
    #include <cassert>

    __global__ void
    do_an_addition (int a, int b, int *out)
    {
      *out = a + b;
    }

    int
    main ()
    {
      int *result_ptr, result;

      /* Allocate memory for the device to write the result to.  */
      hipError_t error = hipMalloc (&result_ptr, sizeof (int));
      assert (error == hipSuccess);

      /* Run `do_an_addition` on one workgroup containing one work item.  */
      do_an_addition<<<dim3(1), dim3(1), 0, 0>>> (1, 2, result_ptr);

      /* Copy result from device to host.  Note that this acts as a
         synchronization point, waiting for the kernel dispatch to complete.  */
      error = hipMemcpyDtoH (&result, result_ptr, sizeof (int));
      assert (error == hipSuccess);

      printf ("result is %d\n", result);
      assert (result == 3);

      return 0;
    }

This program can be compiled with: $ hipcc simple.cpp -g -O0 -o simple ... where `hipcc` is the HIP compiler, shipped with ROCm releases. This generates an ELF binary for the host architecture, containing another ELF binary with the device code. 
The ELF for the device can be inspected with: $ roc-obj-ls simple 1 host-x86_64-unknown-linux file://simple#offset=8192&size=0 1 hipv4-amdgcn-amd-amdhsa--gfx906 file://simple#offset=8192&size=34216 $ roc-obj-extract 'file://simple#offset=8192&size=34216' $ file simple-offset8192-size34216.co simple-offset8192-size34216.co: ELF 64-bit LSB shared object, *unknown arch 0xe0* version 1, dynamically linked, with debug_info, not stripped (the *unknown arch 0xe0* part is the amdgcn architecture, which my `file` doesn't know about) Running the program gives the very unimpressive result: $ ./simple result is 3 While running, this host program has copied the device program into the GPU's memory and spawned an execution thread on it. The goal of this GDB port is to let the user debug host threads and these GPU threads simultaneously. Here's a sample session using a GDB with this patch applied: $ ./gdb -q -nx --data-directory=data-directory ./simple Reading symbols from ./simple... (gdb) break do_an_addition Function "do_an_addition" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 1 (do_an_addition) pending. (gdb) r Starting program: /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 
[New Thread 0x7ffff5db7640 (LWP 1082911)] [New Thread 0x7ffef53ff640 (LWP 1082913)] [Thread 0x7ffef53ff640 (LWP 1082913) exited] [New Thread 0x7ffdecb53640 (LWP 1083185)] [New Thread 0x7ffff54bf640 (LWP 1083186)] [Thread 0x7ffdecb53640 (LWP 1083185) exited] [Switching to AMDGPU Wave 2:2:1:1 (0,0,0)/0] Thread 6 hit Breakpoint 1, do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24 24 out = a + b; (gdb) info inferiors Num Description Connection Executable 1 process 1082907 1 (native) /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple (gdb) info threads Id Target Id Frame 1 Thread 0x7ffff5dc9240 (LWP 1082907) "simple" 0x00007ffff5e9410b in ?? 
() from /opt/rocm-5.4.0/lib/libhsa-runtime64.so.1 2 Thread 0x7ffff5db7640 (LWP 1082911) "simple" __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36 5 Thread 0x7ffff54bf640 (LWP 1083186) "simple" __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36 * 6 AMDGPU Wave 2:2:1:1 (0,0,0)/0 do_an_addition ( a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24 (gdb) bt Python Exception <class 'gdb.error'>: Unhandled dwarf expression opcode 0xe1 #0 do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>, out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24 (gdb) continue Continuing. result is 3 warning: Temporarily disabling breakpoints for unloaded shared library "file:///home/smarchi/build/binutils-gdb-amdgpu/gdb/simple#offset=8192&size=67208" [Thread 0x7ffff54bf640 (LWP 1083186) exited] [Thread 0x7ffff5db7640 (LWP 1082911) exited] [Inferior 1 (process 1082907) exited normally] One thing to notice is the host and GPU threads appearing under the same inferior. 
This is a design goal for us, as programmers tend to think of the threads running on the GPU as part of the same program as the host threads, so showing them in the same inferior in GDB seems natural. Also, the host and GPU threads share a global memory space, which fits the inferior model. Another thing to notice is the error messages when trying to read variables or print a backtrace. This is expected for the moment, since the AMD GPU compiler produces some DWARF that uses some non-standard extensions: https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html There were already some patches posted by Zoran Zaric earlier to make GDB support these extensions: https://inbox.sourceware.org/gdb-patches/20211105113849.118800-1-zoran.zaric@amd.com/ We think it's better to get the basic support for AMD GPU in first, which will then give a better justification for GDB to support these extensions. GPU threads are named `AMDGPU Wave`: a wave is essentially a hardware thread using the SIMT (single-instruction, multiple-threads) [3] execution model. GDB uses the amd-dbgapi library [4], included in the ROCm platform, for a few things related to AMD GPU threads debugging. Different components talk to the library, as shown in the following diagram:

    +---------------------------+     +-------------+     +------------------+
    | GDB   | amd-dbgapi target | <-> |     AMD     |     |   Linux kernel   |
    |       +-------------------+     |   Debugger  |     +--------+         |
    |       | amdgcn gdbarch    | <-> |     API     | <=> | AMDGPU |         |
    |       +-------------------+     |             |     | driver |         |
    |       | solib-rocm        | <-> | (dbgapi.so) |     +--------+---------+
    +---------------------------+     +-------------+

- The amd-dbgapi target is a target_ops implementation used to control execution of GPU threads. While the debugging of host threads works by using the ptrace / wait Linux kernel interface (as usual), control of GPU threads is done through a special interface (dubbed `kfd`) exposed by the `amdgpu` Linux kernel module. 
GDB doesn't interact directly with `kfd`, but instead goes through the amd-dbgapi library (AMD Debugger API on the diagram). Since it provides execution control, the amd-dbgapi target should normally be a process_stratum_target, not just a target_ops. More on that later.
- The amdgcn gdbarch (describing the hardware architecture of the GPU execution units) offloads some requests to the amd-dbgapi library, so that knowledge about the various architectures doesn't need to be duplicated and baked in GDB. This is the case, for example, for the list of registers.
- The solib-rocm component is an solib provider that fetches the list of code objects loaded on the device from the amd-dbgapi library, and makes GDB read their symbols. This is very similar to other solib providers that handle shared libraries, except that here the shared libraries are the pieces of code loaded on the device.
Given that Linux host threads are managed by the linux-nat target, and the GPU threads are managed by the amd-dbgapi target, having all threads appear in the same inferior requires the two targets to be in that inferior's target stack. However, there can only be one process_stratum_target in a given target stack, since there can be only one target per slot. To achieve it, we therefore resort to the hack^W solution of placing the amd-dbgapi target in the arch_stratum slot of the target stack, on top of the linux-nat target. Doing so allows the amd-dbgapi target to intercept target calls and handle them if they concern GPU threads, and offload to beneath otherwise. 
See amd_dbgapi_target::fetch_registers for a simple example:

    void
    amd_dbgapi_target::fetch_registers (struct regcache *regcache, int regno)
    {
      if (!ptid_is_gpu (regcache->ptid ()))
        {
          beneath ()->fetch_registers (regcache, regno);
          return;
        }

      // handle it
    }

ptids of GPU threads are crafted with the following pattern: (pid, 1, wave id) Where pid is the inferior's pid and "wave id" is the wave handle handed to us by the amd-dbgapi library (in practice, a monotonically incrementing integer). The idea is that on Linux systems, the combination (pid != 1, lwp == 1) is not possible. lwp == 1 would always belong to the init process, which would also have pid == 1 (and it's improbable for the init process to offload work to the GPU and much less for the user to debug it). We can therefore differentiate GPU and non-GPU ptids this way. See ptid_is_gpu for more details. Note that we believe that this scheme could break down in the context of containers, where the initial process executed in a container has pid 1 (in its own pid namespace). For instance, if you were to execute a ROCm program in a container, then spawn a GDB in that container and attach to the process, it will likely not work. This is a known limitation. A workaround for this is to have a dummy process (like a shell) fork and execute the program of interest. The amd-dbgapi target watches native inferiors, and "attaches" to them using amd_dbgapi_process_attach, which gives it a notifier fd that is registered in the event loop (see enable_amd_dbgapi). Note that this isn't the same "attach" as in PTRACE_ATTACH, but being ptrace-attached is a precondition for amd_dbgapi_process_attach to work. When the debugged process enables the ROCm runtime, the amd-dbgapi target gets notified through that fd, and pushes itself on the target stack of the inferior. The amd-dbgapi target is then able to intercept target_ops calls. 
If the debugged process disables the ROCm runtime, the amd-dbgapi target unpushes itself from the target stack. This way, the amd-dbgapi target's footprint stays minimal when debugging a process that doesn't use the AMD ROCm platform: it does not intercept target calls. The amd-dbgapi library is found using pkg-config. Since enabling support for the amdgpu architecture (amdgpu-tdep.c) depends on the amd-dbgapi library being present, we have the following logic for the interaction with --target and --enable-targets:
- if the user explicitly asks for amdgcn support with --target=amdgcn-*-* or --enable-targets=amdgcn-*-*, we probe for the amd-dbgapi and fail if not found
- if the user uses --enable-targets=all, we probe for amd-dbgapi, enable amdgcn support if found, disable amdgcn support if not found
- if the user uses --enable-targets=all and --with-amd-dbgapi=yes, we probe for amd-dbgapi, enable amdgcn if found and fail if not found
- if the user uses --enable-targets=all and --with-amd-dbgapi=no, we do not probe for amd-dbgapi, disable amdgcn support
- otherwise, amd-dbgapi is not probed for and support for amdgcn is not enabled
Finally, a simple test is included. It only tests hitting a breakpoint in device code and resuming execution, pretty much like the example shown above. [1] https://docs.amd.com/category/ROCm_v5.4 [2] https://docs.amd.com/bundle/HIP-Programming-Guide-v5.4 [3] https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads [4] https://docs.amd.com/bundle/ROCDebugger-API-Guide-v5.4 Change-Id: I591edca98b8927b1e49e4b0abe4e304765fed9ee Co-Authored-By: Zoran Zaric <zoran.zaric@amd.com> Co-Authored-By: Laurent Morichetti <laurent.morichetti@amd.com> Co-Authored-By: Tony Tye <Tony.Tye@amd.com> Co-Authored-By: Lancelot SIX <lancelot.six@amd.com> Co-Authored-By: Pedro Alves <pedro@palves.net>	2023-02-02 10:02:34 -05:00
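The (pid, 1, wave id) ptid scheme described in the ROCm commit above can be illustrated with a small stand-alone sketch. It does not use GDB's real ptid_t class: `ptid_sketch`, `make_gpu_ptid_sketch`, and `ptid_is_gpu_sketch` are hypothetical stand-ins, and the check only encodes the rule stated in the commit message (pid != 1 and lwp == 1).

```cpp
#include <cassert>

// Hypothetical stand-in for GDB's ptid_t: (pid, lwp, tid).
struct ptid_sketch
{
  int pid;
  long lwp;
  unsigned long tid;
};

// Craft a ptid for a GPU wave using the pattern (pid, 1, wave id).
inline ptid_sketch
make_gpu_ptid_sketch (int inferior_pid, unsigned long wave_id)
{
  return ptid_sketch { inferior_pid, 1, wave_id };
}

// On Linux, lwp == 1 belongs to init, whose pid is also 1, so the
// combination (pid != 1, lwp == 1) cannot be a host thread.
inline bool
ptid_is_gpu_sketch (const ptid_sketch &ptid)
{
  return ptid.pid != 1 && ptid.lwp == 1;
}
```

The container caveat from the commit message shows up directly here: a process that is pid 1 in its own pid namespace would make the first half of the check fail, so its waves would not be recognised as GPU threads.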
Andrew Burgess	cded17bfca	gdb/testsuite: fix fetch_src_and_symbols.exp with native-gdbserver board I noticed that the gdb.debuginfod/fetch_src_and_symbols.exp script doesn't work with the native-gdbserver board, I see this error: ERROR: tcl error sourcing /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.debuginfod/fetch_src_and_symbols.exp. ERROR: gdbserver does not support run without extended-remote while executing "error "gdbserver does not support $command without extended-remote"" (procedure "gdb_test_multiple" line 51) invoked from within This was introduced with this commit: commit 7dd38e31d67c2548b52bea313ab18e40824c05da Date: Fri Jan 6 18:45:27 2023 -0500 gdb/linespec.c: Fix missing source file during breakpoint re-set The problem is that the above commit introduces a direct use of the "run" command, which doesn't work with 'target remote' targets, as exercised by the native-gdbserver board. To avoid this, in this commit I switch to using runto_main. However, calling runto_main will, by default, delete all the currently set breakpoints. As the point of the above commit was to check that a breakpoint set before starting an inferior would be correctly re-set, we need to avoid this breakpoint deleting behaviour. To do this I make use of with_override, and override the delete_breakpoints proc with a dummy proc which does nothing. By reverting the GDB changes in commit 7dd38e31d67c I have confirmed that even after my changes in this commit, the test still fails. But with the fixes in commit 7dd38e31d67c, this test now passes using the unix, native-gdbserver, and native-extended-gdbserver boards.	2023-02-01 17:32:16 +00:00
Alexandra Hájková	6647f05df0	gdb: defer warnings when loading separate debug files Currently, when GDB loads debug information from a separate debug file, there are a couple of warnings that could be produced if things go wrong. In find_separate_debug_file_by_buildid (build-id.c) GDB can give a warning if the separate debug file doesn't include any actual debug information, and in separate_debug_file_exists (symfile.c) we can warn if the CRC checksum in the separate debug file doesn't match the checksum in the original executable. The problem here is that, when looking up debug information, GDB will try several different approaches, lookup by build-id, lookup by debug-link, and then a lookup from debuginfod. GDB can potentially give a warning from an earlier attempt, and then succeed with a later attempt. In the cases I have run into this is primarily a warning about some out of date debug information on my machine, but then GDB finds the correct information using debuginfod. This can be confusing to a user, they will see warnings from GDB when really everything is working just fine. For example: warning: the debug information found in "/usr/lib/debug//lib64/ld-2.32.so.debug" \ does not match "/lib64/ld-linux-x86-64.so.2" (CRC mismatch). This diagnostic was printed on Fedora 33 even when the correct debuginfo was downloaded. In this patch I propose that we defer any warnings related to looking up debug information from a separate debug file. If any of the approaches are successful then GDB will not print any of the warnings. As far as the user is concerned, everything "just worked". Only if GDB completely fails to find any suitable debug information will the warnings be printed. The crc_mismatch test compiles two executables: crc_mismatch and crc_mismatch-2 and then strips them of debuginfo creating separate debug files. The test then replaces crc_mismatch-2.debug with crc_mismatch.debug to trigger "CRC mismatch" warning. 
A local debuginfod server is set up to supply the correct debug file, so that when GDB looks up the debug info no warning is given. The build-id-no-debug-warning.exp test is similar to the previous one. It triggers the "separate debug info file has no debug info" warning by replacing the build-id based .debug file with the stripped binary and then loading it into GDB. It then also sets up a local debuginfod server with the correct debug file to download, to make sure no warnings are emitted.	2023-02-01 11:12:35 +00:00
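The deferral idea in the commit above (collect warnings from each lookup attempt, and print them only if every attempt fails) can be sketched like this. It is a simplified illustration, not the class added to GDB: `deferred_warnings_sketch` and its member names are hypothetical, and the sketch returns the messages instead of printing them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical collector for warnings produced while trying the build-id,
// debug-link, and debuginfod lookups in turn.
class deferred_warnings_sketch
{
public:
  // Record a warning instead of printing it immediately.
  void warn (const std::string &msg)
  {
    m_pending.push_back (msg);
  }

  // Once all lookup attempts are done: if any of them found usable debug
  // information, drop the warnings; otherwise hand them back for printing.
  std::vector<std::string> finish (bool found_debug_info) const
  {
    if (found_debug_info)
      return std::vector<std::string> ();
    return m_pending;
  }

private:
  std::vector<std::string> m_pending;
};
```

In the CRC mismatch scenario from the commit message, the stale local .debug file would add one pending warning, and a later successful debuginfod download would cause it to be discarded, so the user sees nothing.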
Simon Marchi	95cbab2beb	gdb/testsuite: adjust ensure_gdb_index to cooked_index_functions::dump changes Following 7d82b08e9e0a ("gdb/dwarf: dump cooked index contents in cooked_index_functions::dump"), I see some failures like: (gdb) mt print objfiles with-mf^M ^M Object file /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/with-mf/with-mf: Objfile at 0x614000005040, bfd at 0x6120000e08c0, 18 minsyms ^M ^M Cooked index in use:^M ^M ... (gdb) FAIL: gdb.base/with-mf.exp: check if index present This is because the format of the "Cooked index in use" line changed slightly. Adjust ensure_gdb_index to expect the trailing colon. Change-Id: If0a87575c02d8a0bc0d4b8ead540c234c62760f8	2023-01-31 11:43:38 -05:00
Simon Marchi	d8e88d10d8	gdb/testsuite: fix xfail in gdb.ada/ptype_tagged_param.exp I see: ERROR: wrong # args: should be "xfail message" while executing "xfail "no debug info" $gdb_test_name" ("uplevel" body line 3) invoked from within "uplevel { if {!$has_runtime_debug_info} { xfail "no debug info" $gdb_test_name } else { fail $gdb_test_name } }" This is because the xfail takes only one argument, fix that. Change-Id: I2e304d4fd3aa61067c04b5dac2be2ed34dab3190	2023-01-31 11:35:43 -05:00
Tom Tromey	8d31d08fe6	Use xfail in ptype_tagged_param.exp Pedro pointed out that ptype_tagged_param.exp used a kfail, but an xfail would be more appropriate as the problem appears to be in gcc, not gdb.	2023-01-30 08:03:33 -07:00


10265 Commits