binutils-gdb

mirror of https://github.com/espressif/binutils-gdb.git synced 2025-06-10 09:59:06 +08:00

Author	SHA1	Message	Date
Tom Tromey	47fe57c928	Fix "start" for D, Rust, etc The new DWARF indexer broke "start" for some languages. For D, it is broken because, while the code in cooked_index_shard::add specifically excludes Ada, it fails to exclude D. This means that the C "main" will be detected as "main" here -- whereas what is intended is for the code in find_main_name to use d_main_name to find the name. The Rust compiler, on the other hand, uses DW_AT_main_subprogram. However, the code in dwarf2_build_psymtabs_hard fails to create a fully-qualified name, so the name always ends up as plain "main". For D and Ada, a very simple approach suffices: remove the check against "main" from cooked_index_shard::add. This also has the benefit of slightly speeding up DWARF indexing. I assume this approach will work for Pascal and Modula-2 as well, but I don't have a way to test those at present. For Rust, though, this is not sufficient. And, computing the fully-qualified name in dwarf2_build_psymtabs_hard will crash, because cooked_index_entry::full_name uses the canonical name -- and that is not computed until after canonicalization. However, we don't want to wait for canonicalization to be done before computing the main name. That would remove any benefit from doing canonicalization in the background. This patch solves this dilemma by noticing that languages using DW_AT_main_subprogram are, currently, disjoint from languages requiring canonicalization. Because of this, we can add a parameter to full_name to let us avoid crashes, slowdowns, and races here. This is kind of tricky and ugly, so I've tried to comment it sufficiently. While doing this, I had to change gdb.dwarf2/main-subprogram.exp. A different possibility here would be to ignore the canonicalization needs of C in this situation, because those only affect certain types. However, I chose this approach because the test case is artificial anyhow. A long time ago, in an earlier threading attempt, I changed the global current_language to be a function (hidden behind a macro) to let us attempt lazily computing the current language. Perhaps this approach could still be made to work. However, that also seemed rather tricky, more so than this patch. Reviewed-By: Andrew Burgess <aburgess@redhat.com> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30116	2023-02-18 15:41:38 -07:00
Tom Tromey	307733cc0f	Let user C-c when waiting for DWARF index finalization In PR gdb/29854, Simon pointed out that it would be good to be able to use C-c when the DWARF cooked index is waiting for finalization. The idea here is to be able to interrupt a command like "break" -- not to stop the finalization process itself, which runs in a worker thread. This patch implements this idea, by changing the index wait functions to, by default, allow a quit. Polling is done, because there doesn't seem to be a better way to interrupt a wait on a std::future. For v2, I realized that the thread compatibility code in thread-pool.h also needed an update. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29854	2023-02-09 07:21:52 -07:00
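The polling approach described in the commit above can be sketched as follows. This is a minimal illustration, not gdb's code: `quit_requested` and `wait_interruptibly` are hypothetical names, and the flag merely stands in for gdb's real quit mechanism. Only the waiter gives up; whatever produces the future keeps running.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical stand-in for "the user pressed C-c".
static std::atomic<bool> quit_requested{false};

// Wait for FUT in short slices so that the waiting caller can be
// interrupted.  Returns true once the result can be fetched, false if
// a quit was requested first.  The task behind FUT is never cancelled.
template <typename T>
bool wait_interruptibly(const std::future<T> &fut,
                        std::chrono::milliseconds slice
                          = std::chrono::milliseconds(10))
{
  assert(fut.valid());
  for (;;)
    {
      std::future_status status = fut.wait_for(slice);
      // "ready" or "deferred": in both cases there is nothing more to
      // wait for on another thread.
      if (status != std::future_status::timeout)
        return true;
      if (quit_requested.load())
        return false;
    }
}
```

The slice length trades responsiveness to the interrupt against wake-up overhead; the 10 ms default here is an arbitrary choice for the sketch.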
Simon Marchi	19455ee11d	gdb/dwarf: rename cooked_index_vector to cooked_index See previous patch's commit message for rationale. Change-Id: I6b8cdc045dffccc1c01ed690ff258af09f6ff076 Approved-By: Tom Tromey <tom@tromey.com>	2023-01-31 22:03:40 -05:00
Simon Marchi	a8dc671839	gdb/dwarf: rename cooked_index to cooked_index_shard I propose to rename cooked_index_vector and cooked_index such that the "main" object, that is the entry point to the index, is called cooked_index. The fact that the cooked index is implemented as a vector of smaller indexes is an implementation detail. This patch renames cooked_index to cooked_index_shard. The following patch renames cooked_index_vector to cooked_index. Change-Id: Id650f97dcb23c48f8409fa0974cd093ca0b75177 Approved-By: Tom Tromey <tom@tromey.com>	2023-01-31 22:03:40 -05:00
Simon Marchi	902d61e328	gdb: fix dwarf2/cooked-index.c compilation on 32-bit systems The i386 builder shows: ../../binutils-gdb/gdb/dwarf2/cooked-index.c: In member function ‘void cooked_index_vector::dump(gdbarch*) const’: ../../binutils-gdb/gdb/dwarf2/cooked-index.c:492:40: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘std::__underlying_type_impl<sect_offset, true>::type’ {aka ‘long long unsigned int’} [-Werror=format=] 492 \| gdb_printf (" DIE offset: 0x%lx\n", \| ~~^ \| \| \| long unsigned int \| %llx 493 \| to_underlying (entry->die_offset)); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| \| \| std::__underlying_type_impl<sect_offset, true>::type {aka long long unsigned int} The die_offset's underlying type is uint64, so use PRIx64 in the format string. Change-Id: Ibdde4c624ed1bb50eced9a514a4e37aec70a1323	2023-01-30 16:15:17 -05:00
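The fix above relies on the `PRIx64` macro from `<cinttypes>`, which expands to the correct length modifier for `uint64_t` on the current target. A minimal sketch, using `std::snprintf` rather than gdb's `gdb_printf`, and a hypothetical function name:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit offset as hex.  PRIx64 expands to "lx" where
// uint64_t is unsigned long and to "llx" where it is unsigned long
// long, so the same format string is correct on 32-bit and 64-bit
// hosts.
std::string format_die_offset(std::uint64_t offset)
{
  char buf[32];  // 14 characters of prefix + at most 16 hex digits + NUL
  int n = std::snprintf(buf, sizeof buf, "DIE offset: 0x%" PRIx64, offset);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
  return std::string(buf, static_cast<std::size_t>(n));
}
```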
Simon Marchi	7d82b08e9e	gdb/dwarf: dump cooked index contents in cooked_index_functions::dump As I am investigating a crash I see with the cooked index, I thought it would be useful to have a way to dump the index contents. For those not too familiar with it (that includes me), it can help get a feel of what it contains and how it is structured. The cooked_index_functions::dump function is called as part of the "maintenance print objfiles" command. I tried to make the output well structured and indented to help readability, as this prints a lot of text. The dump function first dumps all cooked index entries, like this: [25] ((cooked_index_entry *) 0x621000121220) name: __ioinit canonical: __ioinit DWARF tag: DW_TAG_variable flags: 0x2 [IS_STATIC] DIE offset: 0x21a4 parent: ((cooked_index_entry *) 0x6210000f9610) [std] Then the information about the main symbol: main: ((cooked_index_entry *) 0x621000123b40) [main] And finally the address map contents: [1] ((addrmap *) 0x6210000f7910) [0x0] ((dwarf2_per_cu_data *) 0) [0x118a] ((dwarf2_per_cu_data *) 0x60c000007f00) [0x1cc7] ((dwarf2_per_cu_data *) 0) [0x1cc8] ((dwarf2_per_cu_data *) 0x60c000007f00) [0x1cdf] ((dwarf2_per_cu_data *) 0) [0x1ce0] ((dwarf2_per_cu_data *) 0x60c000007f00) The display of address maps above could probably be improved, to show it more as ranges, but I think this is a reasonable start. Note that this patch depends on Pedro Alves' patch "enum_flags to_string" [1]. If my patch is to be merged before Pedro's series, I will cherry-pick this patch from his series and merge it before mine. [1] https://inbox.sourceware.org/gdb-patches/20221212203101.1034916-8-pedro@palves.net/ Change-Id: Ida13e479fd4c8d21102ddd732241778bc3b6904a	2023-01-30 15:04:44 -05:00
Tom Tromey	c121e82c39	Fix comparator bug in cooked index Simon pointed out that the cooked index template-matching patch introduced a failure in libstdc++ debug mode. In particular, the new code violates the assumption of std::lower_bound and std::upper_bound that the range is sorted with respect to the comparison. When I first debugged this, I thought the problem was unfixable as-is and that a second layer of filtering would have to be done. However, on irc, Simon pointed out that it could perhaps be solved if the comparison function were assured that one operand always came from the index, with the other always being the search string. This patch implements this idea. First, a new mode is introduced: a sorting mode for cooked_index_entry::compare. In this mode, strings are compared case-insensitively, but we're careful to always sort '<' before any other printable character. This way, two names like "func" and "func<param>" will be sorted next to each other -- i.e., "func1" will not be seen between them. This is important when searching. Second, the compare function is changed to work in a strcmp-like way. This makes it easier to test and (IMO) understand. Third, the compare function is modified so that in non-sorting modes, the index entry is always the first argument. This allows consistency in compares. I regression tested this in libstdc++ debug mode on x86-64 Fedora 36. It fixes the crash that Simon saw. This is v2. I believe it addresses the review comments, except for the 'enum class' change, as I mentioned in email on the list. Approved-By: Simon Marchi <simon.marchi@efficios.com>	2023-01-30 10:46:14 -07:00
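The "sorting mode" described above (case-insensitive, with '<' ordered before any other printable character, returning a strcmp-like result) might look roughly like this. It is a sketch under stated assumptions, not the actual cooked_index_entry::compare: the function names are hypothetical, and it assumes names contain no control characters, since '<' is mapped onto the key value 1.

```cpp
#include <cassert>
#include <cctype>

// Map a character to its sort key: letters are folded to lower case,
// and '<' is given a key below ' ' (0x20), the first printable ASCII
// character.
static int sort_key(char c)
{
  if (c == '<')
    return 1;
  return std::tolower(static_cast<unsigned char>(c));
}

// strcmp-like comparison: negative, zero, or positive.
int compare_for_sort(const char *a, const char *b)
{
  assert(a != nullptr && b != nullptr);
  for (;; ++a, ++b)
    {
      int ka = (*a == '\0') ? 0 : sort_key(*a);
      int kb = (*b == '\0') ? 0 : sort_key(*b);
      if (ka != kb)
        return ka - kb;
      if (ka == 0)
        return 0;  // both strings ended together
    }
}
```

With plain ASCII ordering, '1' (0x31) sorts before '<' (0x3c), so "func1" would land between "func" and "func<param>"; the remapped key keeps the two "func" spellings adjacent.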
Tom Tromey	70ca3a6bc9	Make addrmap const-correct in cooked index After the cooked index is created, the addrmaps should be const. Change-Id: I8234520ab346ced40a8dd6e478ba21fc438c2ba2	2023-01-30 11:55:07 -05:00
Tom Tromey	35e1763185	More const-correctness in cooked indexer I noticed that iterating over the index yields non-const cooked_index_entry objects. However, after finalization, they should not be modified. This patch enforces this by adding const where needed. v2 makes the find, all_entries, and wait methods const as well.	2023-01-27 14:12:01 -07:00
Tom Tromey	ac37b79cc4	Fix parameter-less template regression in new DWARF reader PR c++/29896 points out a regression in the new DWARF reader. It does not properly handle a case like "break fn", where "fn" is a template function. This happens because the new index uses strncasecmp to compare. However, to make this work correctly, we need a custom function that ignores template parameters. This patch adds a custom comparison function and fixes the bug. A new test case is included. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29896	2023-01-17 07:03:26 -07:00
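A comparison that ignores template parameters, as the commit above describes, can be illustrated with a simple match predicate. This is only a sketch with a hypothetical name; the real function in gdb is strcmp-like and handles more cases. The idea shown is that a search for "fn" should match an index entry "fn<int>" but not "fn2".

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Return true if ENTRY (a name stored in the index) matches SEARCH
// (what the user typed), ignoring case and ignoring a template
// parameter list that ENTRY has but SEARCH lacks.
bool matches_ignoring_template_params(const std::string &entry,
                                      const std::string &search)
{
  assert(!search.empty());
  if (entry.size() < search.size())
    return false;
  for (std::size_t i = 0; i < search.size(); ++i)
    {
      int e = std::tolower(static_cast<unsigned char>(entry[i]));
      int s = std::tolower(static_cast<unsigned char>(search[i]));
      if (e != s)
        return false;
    }
  // Either the names have the same length, or the rest of ENTRY is a
  // template parameter list.
  return entry.size() == search.size() || entry[search.size()] == '<';
}
```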
Tom Tromey	5a89072f36	Move hash_entry and eq_entry into cooked_index::do_finalize I was briefly confused by the hash_entry and eq_entry functions in the cooked index. They are only needed in a single method, and that method already has a couple of local lambdas for a different hash table. So, it seemed cleaner to move these there as well.	2023-01-17 07:03:26 -07:00
Joel Brobecker	213516ef31	Update copyright year range in header of all files managed by GDB This commit is the result of running the gdb/copyright.py script, which automated the update of the copyright year range for all source files managed by the GDB project to be updated to include year 2023.	2023-01-01 17:01:16 +04:00
Tom Tromey	55fc1623f9	Add name canonicalization for C PR symtab/29105 shows a number of situations where symbol lookup can result in the expansion of too many CUs. What happens is that lookup_signed_typename will try to look up a type like "signed int". In cooked_index_functions::expand_symtabs_matching, when looping over languages, the C++ case will canonicalize this type name to be "int" instead. Then this method will proceed to expand every CU that has an entry for "int" -- i.e., nearly all of them. A crucial component of this is that the caller, objfile::lookup_symbol, does not do this canonicalization, so when it tries to find the symbol for "signed int", it fails -- causing the loop to continue. This patch fixes the problem by introducing name canonicalization for C. The idea here is that, by making C and C++ agree on the canonical name when a symbol name can have multiple spellings, we avoid the bad behavior in objfile::lookup_symbol (and any other such code -- I don't know if there is any). Unlike C++, C only has a few situations where canonicalization is needed. And, in particular, due to the lack of overloading (thus avoiding any issues in linespec) and due to the way c-exp.y works, I think that no canonicalization is needed during symbol lookup -- only during symtab construction. This explains why lookup_name_info is not touched. The stabs reader is modified on a "best effort" basis. The DWARF reader needed one small tweak in dwarf2_name to avoid a regression in dw2-unusual-field-names.exp. I think this is adequately explained by the comment, but basically this is a scenario that should not occur in real code, only the gdb test suite. lookup_signed_typename is simplified. It used to search for two different type names, but now gdb can search just for the canonical form. gdb.dwarf2/enum-type.exp needed a small tweak, because the canonicalizer turns "unsigned integer" into "unsigned int integer". It seems better here to use the correct C type name. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105 Tested-by: Simon Marchi <simark@simark.ca> Reviewed-by: Andrew Burgess <aburgess@redhat.com>	2022-12-01 11:16:41 -07:00
Tom Tromey	bed34ce705	Refactor cooked_index::do_finalize This refactors cooked_index::do_finalize, reordering an 'if' to make it a little less redundant. This change makes a subsequent patch easier to read. Reviewed-by: Andrew Burgess <aburgess@redhat.com>	2022-12-01 11:16:41 -07:00
Tom de Vries	2c474c4694	[gdb/symtab] Add get/set functions for per_cu->lang/unit_type The dwarf2_per_cu_data fields lang and unit_type both have a don't-know initial value (respectively language_unknown and (dwarf_unit_type)0), which allows us to add certain checks, for instance checking that a field is not read before written. Add get/set member functions for the two fields as a convenient location to add such checks, make the fields private to enforce using the member functions, and add the m_ prefix. Tested on x86_64-linux.	2022-07-04 10:28:42 +02:00
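The accessor pattern from the commit above, sketched on a toy class. The enum and class names are hypothetical (gdb's real `enum language` and `dwarf2_per_cu_data` are far larger), and the check in the setter is an assumption of this sketch; the commit message only names the read-before-write check.

```cpp
#include <cassert>

// Hypothetical, trimmed-down language enum with a "don't know" value.
enum class lang_sketch { unknown, c, cplus, rust };

class per_cu_sketch
{
public:
  void set_lang(lang_sketch l)
  {
    // Assumption of this sketch: storing the "don't know" value is a bug.
    assert(l != lang_sketch::unknown);
    m_lang = l;
  }

  lang_sketch lang() const
  {
    // Catches a read before any write.
    assert(m_lang != lang_sketch::unknown);
    return m_lang;
  }

private:
  // Private, so every access goes through the checked accessors.
  lang_sketch m_lang = lang_sketch::unknown;
};
```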
Tom Tromey	20a26f4e01	Finalize each cooked index separately After DWARF has been scanned, the cooked index code does a "finalization" step in a worker thread. This step combines all the index entries into a single master list, canonicalizes C++ names, and splits Ada names to synthesize package names. While this step is run in the background, gdb will wait for the results in some situations, and it turns out that this step can be slow. This is PR symtab/29105. This can be sped up by parallelizing, at a small memory cost. Now each index is finalized on its own, in a worker thread. The cost comes from name canonicalization: if a given non-canonical name is referred to by multiple indices, there will be N canonical copies (one per index) rather than just one. This requires changing the users of the index to iterate over multiple results. However, this is easily done by introducing a new "chained range" class. When run on gdb itself, the memory cost seems rather low -- on my current machine, "maint space 1" reports no change due to the patch. For performance testing, using "maint time 1" and "file" will not show correct results. That approach measures "time to next prompt", but because the patch only affects background work, this shouldn't (and doesn't) change. Instead, a simple way to make gdb wait for the results is to set a breakpoint. Before: $ /bin/time -f%e ~/gdb/install/bin/gdb -nx -q -batch \ -ex 'break main' /tmp/gdb Breakpoint 1 at 0x43ec30: file ../../binutils-gdb/gdb/gdb.c, line 28. 2.00 After: $ /bin/time -f%e ./gdb/gdb -nx -q -batch \ -ex 'break main' /tmp/gdb Breakpoint 1 at 0x43ec30: file ../../binutils-gdb/gdb/gdb.c, line 28. 0.65 Regression tested on x86-64 Fedora 34. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105	2022-05-26 07:35:30 -06:00
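The "chained range" idea mentioned above, iterating over several per-index result vectors as if they were one sequence, can be sketched like this. It is an illustrative toy, not gdb's class: the name `chained_range_sketch` is hypothetical, it only supports forward iteration over vectors, and it assumes every pointer in the list is non-null and outlives the range.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

template <typename T>
class chained_range_sketch
{
public:
  explicit chained_range_sketch(std::vector<const std::vector<T> *> parts)
    : m_parts(std::move(parts))
  {}

  class iterator
  {
  public:
    iterator(const chained_range_sketch *range, std::size_t part)
      : m_range(range), m_part(part)
    { skip_exhausted(); }

    const T &operator*() const
    {
      assert(m_part < m_range->m_parts.size());
      return (*m_range->m_parts[m_part])[m_pos];
    }

    iterator &operator++()
    {
      ++m_pos;
      skip_exhausted();
      return *this;
    }

    bool operator!=(const iterator &other) const
    { return m_part != other.m_part || m_pos != other.m_pos; }

  private:
    // Move past empty or fully-consumed parts.
    void skip_exhausted()
    {
      while (m_part < m_range->m_parts.size()
             && m_pos == m_range->m_parts[m_part]->size())
        {
          ++m_part;
          m_pos = 0;
        }
    }

    const chained_range_sketch *m_range;
    std::size_t m_part;
    std::size_t m_pos = 0;
  };

  iterator begin() const { return iterator(this, 0); }
  iterator end() const { return iterator(this, m_parts.size()); }

private:
  std::vector<const std::vector<T> *> m_parts;
};
```

Callers can then keep a single range-based `for` loop even though the results now live in several containers.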
Tom Tromey	72b580b8f4	Micro-optimize cooked_index_entry::full_name I noticed that cooked_index_entry::full_name can return the canonical string when there is no parent entry. Regression tested on x86-64 Fedora 34.	2022-04-20 06:21:06 -06:00
Simon Marchi	a8b7a13911	gdb: fix "passing NULL to memcpy" UBsan error in dwarf2/cooked-index.c Reading a simple file compiled with: $ gcc -DONE=1 -gdwarf-4 -g3 test.c $ gcc --version gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0 I get: Reading symbols from /tmp/cwd/a.out... /home/smarchi/src/binutils-gdb/gdb/dwarf2/cooked-index.c:332:11: runtime error: null pointer passed as argument 2, which is declared to never be null It looks like even if the size is 0 (the size of the `entries` vector is 0), we shouldn't be passing a NULL pointer to memcpy. And `entries.data ()` returns NULL. Fix that by using std::vector::insert to insert the items of entries into m_entries. I haven't checked, but it should essentially compile down to a memcpy, since the vector elements are trivially copyable. Change-Id: I75f1c901e9b522e42e89eb5936e2c70d68eb21e5	2022-04-12 14:42:02 -04:00
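The replacement of memcpy by std::vector::insert described above can be shown in isolation. A minimal sketch with a hypothetical helper name; the point is that the iterator-range insert is well-defined for an empty source, whereas handing memcpy a null `data ()` pointer is what UBSan reported, even with a size of 0.

```cpp
#include <cassert>
#include <vector>

// Append all elements of SRC to DST.  Safe when SRC is empty, even if
// src.data() is a null pointer in that case.
template <typename T>
void append_all(std::vector<T> &dst, const std::vector<T> &src)
{
  // Inserting a vector's own range into itself is not allowed.
  assert(&dst != &src);
  dst.insert(dst.end(), src.begin(), src.end());
}
```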
Tom Tromey	7e75279093	"Finalize" the DWARF index in the background After scanning the CUs, the DWARF indexer merges all the data into a single vector, canonicalizing C++ names as it proceeds. While not necessarily single-threaded, this process is currently done in just one thread, to keep memory costs lower. However, this work is all done without reference to any data outside of the indexes. This patch improves the apparent performance of GDB by moving it to the background. All uses of the index are then made to wait for this process to complete. In our ongoing example, this reduces the scanning time on gdb itself to 0.173937 (wall). Recall that before this patch, the time was 0.668923; and psymbol reader does this in 1.598869. That is, at the end of this series, we see about a 10x speedup.	2022-04-12 09:31:16 -06:00
Tom Tromey	46114cb7be	Parallelize DWARF indexing This parallelizes the new DWARF indexer. The indexer's storage was designed so that each storage object and each indexer is fully independent. This setup makes it simple to scan different CUs independently. This patch creates a new cooked index storage object per thread, and then scans a subset of all the CUs in each such thread, using gdb's existing thread pool. In the ongoing "gdb gdb" example, this patch reduces the wall time down to 0.668923, from 0.903534. (Note that the 0.903534 is the time for the new index -- that is, when the "enable the new index" patch is rebased to before this one. However, in the final series, that patch appears toward the end. Hopefully this isn't too confusing.)	2022-04-12 09:31:16 -06:00
Tom Tromey	51f5a4b8e9	Introduce the new DWARF index class This patch introduces the new DWARF index class. It is called "cooked" to contrast against a "raw" index, which is mapped from disk without extra effort. Nothing constructs a cooked index yet. The essential idea here is that index entries are created via the "add" method; then when all the entries have been read, they are "finalize"d -- name canonicalization is performed and the entries are added to a sorted vector. Entries use the DWARF name (DW_AT_name) or linkage name, not the full name as is done for partial symbols. These two facets -- the short name and the deferred canonicalization -- help improve the performance of this approach. This will become clear in later patches, when parallelization is added. Some special code is needed for Ada, because GNAT only emits mangled ("encoded", in the Ada lingo) names, and so we reconstruct the hierarchical structure after the fact. This is also done in the finalization phase. One other aspect worth noting is that the way the "main" function is found is different in the new code. Currently gdb will notice DW_AT_main_subprogram, but won't recognize "main" during reading -- this is done later, via explicit symbol lookup. This is done differently in the new code so that finalization can be done in the background without then requiring a synchronization to look up the symbol.	2022-04-12 09:31:16 -06:00
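The add-then-finalize shape described above, where entries are cheap to add and the expensive work is deferred to a single finalization step, can be sketched as a toy class. The class and member names are hypothetical, and the sketch omits name canonicalization and the Ada handling entirely; finalization here only sorts.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class index_sketch
{
public:
  // Cheap: just record the name.  No sorting happens here.
  void add(std::string name)
  {
    assert(!m_finalized);
    m_names.push_back(std::move(name));
  }

  // Deferred work, done once after all entries have been added.
  void finalize()
  {
    std::sort(m_names.begin(), m_names.end());
    m_finalized = true;
  }

  // Lookups are only valid on a finalized (sorted) index.
  bool contains(const std::string &name) const
  {
    assert(m_finalized);
    return std::binary_search(m_names.begin(), m_names.end(), name);
  }

private:
  std::vector<std::string> m_names;
  bool m_finalized = false;
};
```

Because finalize touches only the object's own data, a step of this shape is the kind of work that can later be moved to a background thread, as the subsequent commits in the series do.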

21 Commits