The following 2 patches move .gdb_index and .debug_names reading code to
their own file. Prepare this by exposing some things used by that code
to read.h.
Change-Id: If8ef135758a2ff0ab3b765cc92596da8189f3bbd
Approved-By: Tom Tromey <tom@tromey.com>
This commit is the result of running the gdb/copyright.py script,
which automated the update of the copyright year range for all
source files managed by the GDB project to be updated to include
year 2023.
PR symtab/29343 points out that it would be beneficial if
comp_unit_head had a constructor and used initializers. This patch
implements this. I'm unsure if this is sufficient to close the bug,
but at least it's a step.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29343
Clang up to version 15 (current) adds macros that were defined in the
command line or by "other means", according to the Dwarf specification,
after the last DW_MACRO_end_file, instead of before the first
DW_MACRO_start_file, as the specification dictates. When GDB reads the
macros after the last file is closed, the macros never end up "in scope"
and so we can't print them. This has been submitted as a bug to Clang
developers (https://github.com/llvm/llvm-project/issues/54506), and PR
macros/29034 was opened for GDB to keep track of this.
Seeing as there is no expected date for it to be fixed, add a workaround
for all current versions of Clang. The workaround detects when
the main file would be closed and if the producer is Clang, and turns
that operation into a noop, so we keep a reference to the current_file
as those macros are read.
A test case was added to confirm the functionality, and the KFAIL for
running gdb.base/macro-source-path when using clang.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29034
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Add all_comp_units/all_type_units views on all_units.
Having the views allows us to:
- easily get the number of CUs or TUs in all_units, and
- easily access the nth CU or TU.
This minimizes the use of tu_stats.nr_tus.
Tested on x86_64-linux.
When building gdb with -fsanitize=thread and running test-case
gdb.dwarf2/inlined_subroutine-inheritance.exp, we run into a data race
between:
...
Read of size 1 at 0x7b2000003010 by thread T4:
#0 packed<language, 1ul>::operator language() const packed.h:54
#1 dwarf2_per_cu_data::set_lang(language) read.h:363
...
and:
...
Previous write of size 1 at 0x7b2000003010 by main thread:
#0 dwarf2_per_cu_data::set_lang(language) read.h:365
...
Fix this by making per_cu->m_lang atomic.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29286
With gdb with -fsanitize=thread and test-case gdb.ada/array_bounds.exp, I run
into a data race between:
...
Read of size 1 at 0x7b2000025f0f by main thread:
#0 packed<dwarf_unit_type, 1ul>::operator dwarf_unit_type() const packed.h:54
#1 dwarf2_per_cu_data::set_unit_type(dwarf_unit_type) read.h:339
...
and:
...
Previous write of size 1 at 0x7b2000025f0f by thread T3:
#0 dwarf2_per_cu_data::set_unit_type(dwarf_unit_type) read.h:341
...
Fix this by making per_cu->unit_type atomic.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29286
We have in per_cu->set_lang this comment:
...
void set_lang (enum language lang)
{
/* We'd like to be more strict here, similar to what is done in
set_unit_type, but currently a partial unit can go from unknown to
minimal to ada to c. */
...
Fix this by not setting m_lang for partial units.
This requires us to move the m_unit_type initialization to ensure that
m_unit_type is initialized before per_cu->m_lang.
Tested on x86_64-linux, with native and target board cc-with-dwz-m.
When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a data race between:
...
Read of size 1 at 0x7b200000300d by thread T2:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82ec95)^M
...
and:
...
Previous write of size 1 at 0x7b200000300d by main thread:^M
#0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)^M
...
In other words, between:
...
if (this_cu->reading_dwo_directly)
...
and:
...
cu->per_cu->lang = pretend_language;
...
Likewise, we run into a data race between:
...
Write of size 1 at 0x7b200000300e by thread T4:
#0 process_psymtab_comp_unit gdb/dwarf2/read.c:6789 (gdb+0x830720)
...
and:
...
Previous read of size 1 at 0x7b200000300e by main thread:
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82edab)
...
In other words, between:
...
this_cu->unit_type = DW_UT_partial;
...
and:
...
if (this_cu->reading_dwo_directly)
...
Likewise for the write to addresses_seen in cooked_indexer::check_bounds and a
read from is_dwz in dwarf2_find_containing_comp_unit for test-case
gdb.dwarf2/dw2-dir-file-name.exp and target board cc-with-dwz-m.
The problem is that the written fields are part of the same memory location as
the read fields, so executing a read and write in different threads is
undefined behavour.
Making the written fields separate memory locations, using the new
struct packed template fixes this.
The set of fields has been established experimentally to be the
minimal set to get rid of this type of -fsanitize=thread errors, but
more fields might require the same treatment.
Looking at the properties of the lang field, unlike dwarf_version it's
not available in the unit header, so it will be set the first time
during the parallel cooked index reading. The same holds for
unit_type, and likewise for addresses_seen.
dwarf2_per_cu_data::addresses_seen is moved so that the bitfields that
currently follow it can be merged in the same memory location as the
bitfields that currently precede it, for better packing.
Tested on x86_64-linux.
Co-Authored-By: Pedro Alves <pedro@palves.net>
Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532
With gdb build with -fsanitize=thread and test-case
gdb.dwarf2/dw4-sig-types.exp and target board cc-with-dwz-m we run into a data
race between:
...
Write of size 4 at 0x7b2800002268 by thread T4:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6236 \
(gdb+0x82f525)^M
...
and this read:
...
Previous read of size 4 at 0x7b2800002268 by thread T1:^M
#0 dwarf2_find_containing_comp_unit gdb/dwarf2/read.c:23444 \
(gdb+0x86e22e)^M
...
In other words, between this write:
...
this_cu->length = cu->header.get_length ();
...
and this read:
...
&& mid_cu->sect_off + mid_cu->length > sect_off))
...
of per_cu->length.
Fix this similar to the per_cu->dwarf_version case, by only setting it if
needed, and otherwise verifying that the same value is used. [ Note that the
same code is already present in the other cutu_reader constructor. ]
Move this logic into into a member function set_length to make sure it's used
consistenly, and make the field private in order to enforce access through the
member functions, and rename it to m_length.
This exposes (running test-case gdb.dwarf2/fission-reread.exp) that in
fill_in_sig_entry_from_dwo_entry, the m_length field is overwritten.
For now, allow for that exception.
While we're at it, make sure that the length is set before read.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29344
When running test-case gdb.dwarf2/dw2-symtab-includes.exp with target board
cc-with-debug-names, I run into:
...
(gdb) maint expand-symtab dw2-symtab-includes.h^M
src/gdb/dwarf2/read.h:311: internal-error: unit_type: \
Assertion `m_unit_type != 0' failed.^M
A problem internal to GDB has been detected,^M
further debugging may prove unreliable.^M
----- Backtrace -----^M
FAIL: gdb.dwarf2/dw2-symtab-includes.exp: maint expand-symtab \
dw2-symtab-includes.h (GDB internal error)
...
The assert was recently added in commit 2c474c46943 ("[gdb/symtab] Add get/set
functions for per_cu->lang/unit_type").
The assert is triggered here:
...
/* We're importing a C++ compilation unit with tag DW_TAG_compile_unit
into another compilation unit, at root level. Regard this as a hint,
and ignore it. */
if (die->parent && die->parent->parent == NULL
&& per_cu->unit_type () == DW_UT_compile
&& per_cu->lang () == language_cplus)
return;
...
We're trying to access unit_type / lang which hasn't been set yet.
Normally, these are set during cooked index creation, but that's not the case
when using an index (or using -readnow).
In other words, this lto binary reading speed optimization only works in the
normal use case.
IWBN to have this working in all use cases, but for now, allow lang () and
unit_type () to return language_unknown and 0 here.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29321
The dwarf2_per_cu_data fields lang and unit_type both have a dont-know
initial value (respectively language_unknown and (dwarf_unit_type)0), which
allows us to add certain checks, f.i. checking that that a field is not read
before written.
Add get/set member functions for the two fields as a convenient location to
add such checks, make the fields private to enforce using the member
functions, and add the m_ prefix.
Tested on x86_64-linux.
When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a data race between thread T2 and the main
thread in the same write:
...
Write of size 1 at 0x7b200000300c:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6252 \
(gdb+0x82f3b3)^M
...
which is here:
...
this_cu->dwarf_version = cu->header.version;
...
Both writes are called from the parallel for in dwarf2_build_psymtabs_hard,
this one directly:
...
#1 process_psymtab_comp_unit gdb/dwarf2/read.c:6774 (gdb+0x8304d7)^M
#2 operator() gdb/dwarf2/read.c:7098 (gdb+0x8317be)^M
#3 operator() gdbsupport/parallel-for.h:163 (gdb+0x872380)^M
...
and this via the PU import:
...
#1 cooked_indexer::ensure_cu_exists(cutu_reader*, dwarf2_per_objfile*, \
sect_offset, bool, bool) gdb/dwarf2/read.c:17964 (gdb+0x85c43b)^M
#2 cooked_indexer::index_imported_unit(cutu_reader*, unsigned char const*, \
abbrev_info const*) gdb/dwarf2/read.c:18248 (gdb+0x85d8ff)^M
#3 cooked_indexer::index_dies(cutu_reader*, unsigned char const*, \
cooked_index_entry const*, bool) gdb/dwarf2/read.c:18302 (gdb+0x85dcdb)^M
#4 cooked_indexer::make_index(cutu_reader*) gdb/dwarf2/read.c:18443 \
(gdb+0x85e68a)^M
#5 process_psymtab_comp_unit gdb/dwarf2/read.c:6812 (gdb+0x830879)^M
#6 operator() gdb/dwarf2/read.c:7098 (gdb+0x8317be)^M
#7 operator() gdbsupport/parallel-for.h:171 (gdb+0x8723e2)^M
...
Fix this by setting the field earlier, in read_comp_units_from_section.
The write in cutu_reader::cutu_reader() is still needed, in case
read_comp_units_from_section is not used (run the test-case with say, target
board cc-with-gdb-index).
Make the write conditional, such that it doesn't trigger if the field is
already set by read_comp_units_from_section. Instead, verify that the
field already has the value that we're trying to set it to.
Move this logic into into a member function set_version (in analogy to the
already present member function version) to make sure it's used consistenly,
and make the field private in order to enforce access through the member
functions, and rename it to m_dwarf_version.
While we're at it, make sure that the version is set before read, to avoid
say returning true for "per_cu.version () < 5" if "per_cu.version () == 0".
Tested on x86_64-linux.
The CU queue is a member of dwarf2_per_bfd, but it is only used when
expanding CUs. Also, the dwarf2_per_objfile destructor checks the
queue -- however, if the per-BFD object is destroyed first, this will
not work. This was pointed out Lancelot as fallout from the patch to
rewrite the registry system.
This patch avoids this problem by moving the queue to the per-objfile
object.
The dwarf2_per_bfd object has a separate field for each possible kind
of index. Until an earlier patch in this series, two of these were
even derived from a common base class, but still had separate slots.
This patch unifies all the index fields using the common base class
that was introduced earlier in this series. This makes it more
obvious that only a single index can be active at a time, and also
removes some code from dwarf2_initialize_objfile.
Now that the psymtab reader has been removed, the
dwarf2_per_cu_data::v union is no longer needed. Instead, we can
simply move the members from dwarf2_per_cu_quick_data into
dwarf2_per_cu_data and remove the "quick" object entirely.
This parallelizes the new DWARF indexer. The indexer's storage was
designed so that each storage object and each indexer is fully
independent. This setup makes it simple to scan different CUs
independently.
This patch creates a new cooked index storage object per thread, and
then scans a subset of all the CUs in each such thread, using gdb's
existing thread pool.
In the ongoing "gdb gdb" example, this patch reduces the wall time
down to 0.668923, from 0.903534. (Note that the 0.903534 is the time
for the new index -- that is, when the "enable the new index" patch is
rebased to before this one. However, in the final series, that patch
appears toward the end. Hopefully this isn't too confusing.)
Because BFD is not thread-safe, we need to be sure that any section
data that is needed is read before trying to do any DWARF indexing in
the background.
This patch takes a simple approach to this -- it pre-reads the
"info"-related sections. This is done for the main file, but also any
auxiliary files as well, such as the DWO file.
This patch could be perhaps enhanced by removing some now-redundant
calls to dwarf2_section_info::read.
This implements quick_symbol_functions for the cooked DWARF index.
This is the code that interfaces between the new index and the rest of
gdb. Cooked indexes still aren't created by anything.
For the most part this is straightforward. It shares some concepts
with the existing DWARF indices. However, because names are stored
pre-split in the cooked index, name lookup here is necessarily
different; see expand_symtabs_matching for the gory details.
This patch adds the code to index DWARF. This is just the scanner; it
reads the DWARF and constructs the index, but nothing calls it yet.
The indexer is split into two parts: a storage object and an indexer
object. This is done to support the parallelization of this code -- a
future patch will create a single storage object per thread.
This adds a new member to dwarf2_per_cu_data that indicates whether
addresses have been seen for this CU. This is then set by the
.debug_aranges reader. The idea here is to detect when a CU does not
have address information, so that the new indexer will know to do
extra scanning in that case.
This commit brings all the changes made by running gdb/copyright.py
as per GDB's Start of New Year Procedure.
For the avoidance of doubt, all changes in this commits were
performed by the script.
This changes the DWARF reader to cache the result of
find_file_and_directory. This is not especially important now, but it
will help the new DWARF indexer.
I noticed a new gcc option -gdwarf64 and tried it out (using gcc 11.2.1).
With a test-case hello.c:
...
int
main (void)
{
printf ("hello\n");
return 0;
}
...
compiled like this:
...
$ gcc -g -gdwarf64 ~/hello.c
...
I ran into:
...
$ gdb -q -batch a.out
DW_FORM_line_strp pointing outside of .debug_line_str section \
[in module a.out]
...
Debugging gdb revealed that the string offset is:
...
(gdb) up
objfile=0x182ab70, str_offset=1378684502312,
form_name=0xeae9b5 "DW_FORM_line_strp")
at src/gdb/dwarf2/section.c:208
208 error (_("%s pointing outside of %s section [in module %s]"),
(gdb) p /x str_offset
$1 = 0x14100000128
(gdb)
...
which is read when parsing a .debug_line entry at 0x1e0.
Looking with readelf at the 0x1e0 entry, we have:
...
The Directory Table (offset 0x202, lines 2, columns 1):
Entry Name
0 (indirect line string, offset: 0x128): /data/gdb_versions/devel
1 (indirect line string, offset: 0x141): /home/vries
...
which in a hexdump looks like:
...
0x00000200 1f022801 00004101 00000201 1f020f02
...
What happens is the following:
- readelf interprets the DW_FORM_line_strp reference to .debug_line_str as
a 4 byte value, and sees entries 0x00000128 and 0x00000141.
- gdb instead interprets it as an 8 byte value, and sees as first entry
0x0000014100000128, which is too big so it bails out.
AFAIU, gdb is wrong. It assumes DW_FORM_line_strp is 8 bytes on the basis
that the corresponding CU is 64-bit DWARF. However, the .debug_line
contribution has it's own initial_length field, and encodes there that it's
32-bit DWARF.
Fix this by using the correct offset size for DW_FORM_line_strp references
in .debug_line.
Note: the described test-case does trigger this complaint (both with and
without this patch):
...
$ gdb -q -batch -iex "set complaints 10" a.out
During symbol reading: intermixed 32-bit and 64-bit DWARF sections
...
The reason that the CU has 64-bit dwarf is because -gdwarf64 was passed to
gcc. The reason that the .debug_line entry has 32-bit dwarf is because that's
what gas generates. Perhaps this is complaint-worthy, but I don't think it
is wrong.
Tested on x86_64-linux, using native and target board dwarf64.exp.
A bug was filed against the incorrect underlying type setting for
an enumeration type, which was caused by a copy and paste error.
This patch fixes the problem by setting it by calling objfile_int_type,
which was originally dwarf2_per_objfile::int_type, with ctf_type_size bits.
Also add error checking on ctf_func_type_info call.
PR symtab/28160 and PR symtab/27893 concern GDB crashes in the test
suite when using the "fission" target board. They are both caused by
the patches that merge the list of CUs with the list of TUs (and to a
lesser degree by the patches to share DWARF data across objfiles), and
the underlying issue is the same: it turns out that reading a DWO can
cause new type units to be created. This means that the list of
dwarf2_per_cu_data objects depends on precisely which CUs have been
expanded. However, because the type units can be created while
expanding a CU means that the vector of CUs can expand while it is
being iterated over -- a classic mistake. Also, because a TU can be
added later, it means the resize_symtabs approach is incorrect.
This patch fixes resize_symtabs by removing it, and having set_symtab
resize the vector on demand. It fixes the iteration problem by
introducing a safe (index-based) iterator and changing the relevant
spots to use it.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28160
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27893
When run with the gdb-index or debug-names target boards, dup-psym.exp
fails. This came up for me because my new DWARF scanner reuses this
part of the existing index code, and so it registers as a regression.
This is PR symtab/25834.
Looking into this, I found that the DWARF index code here is fairly
different from the psymtab code. I don't think there's a deep reason
for this, and in fact, it seemed to me that the index code could
simply mimic what the psymtab code already does.
That is what this patch implements. The DW_AT_name and DW_AT_comp_dir
are now stored in the quick file names table. This may require
allocating a quick file names table even when DW_AT_stmt_list does not
exist. Then, the functions that work with this data are changed to
use find_source_or_rewrite, just as the psymbol code does. Finally,
line_header::file_full_name is removed, as it is no longer needed.
Currently, the index maintains a hash table of "quick file names".
The hash table uses a deletion function to free the "real name"
components when necessary. There's also a second such function to
implement the forget_cached_source_info method.
This bug fix patch will create a quick file name object even when
there is no DW_AT_stmt_list, meaning that the object won't be entered
in the hash table. So, this patch changes the memory management
approach so that the entries are cleared when the per-BFD object is
destroyed. (A dwarf2_per_cu_data destructor is not introduced,
because we have been avoiding adding a vtable to that class.)
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=25834
This is another attempt at fixing the problem described in commit 4cf88725da1
"[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder()", which was
reverted in commit 3db19b2d724.
First off, some context.
A DWARF CU can be viewed as a symbol table: toplevel children of a CU DIE
represent symbol table entries for that CU. Furthermore, there is a
hierarchy: a symbol table entry such as a function itself has a symbol table
containing parameters and local variables.
The dwarf reader maintains a notion of current symbol table (that is: the
symbol table a new symbol needs to be entered into) in dwarf2_cu member
list_in_scope.
A problem then presents itself when reading inter-CU references:
- a new symbol read from a CU B needs to be entered into the symbol table of
another CU A.
- the notion of current symbol table is tracked on a per-CU basis.
This is addressed in inherit_abstract_dies by temporarily overwriting the
list_in_scope for CU B with the one for CU A.
The current symbol table is one aspect of the current dwarf reader context
that is tracked, but there are more, f.i. ones that are tracked via the
dwarf2_cu member m_builder, f.i. m_builder->m_local_using_directives.
A similar problem exists in relation to inter-CU references, but a different
solution was chosen:
- to keep track of an ancestor field in dwarf2_cu, which is updated
when traversing inter-CU references, and
- to use the ancestor field in dwarf2_cu::get_builder to return the m_builder
in scope.
There is no actual concept of a CU having an ancestor, it just marks the most
recent CU from which a CU was inter-CU-referenced. Consequently, when
following inter-CU references from a CU A to another CU B and back to CU A,
the ancestors form a cycle, which causes dwarf2_cu::get_builder to hang or
segfault, as reported in PR26327.
ISTM that the ancestor implementation is confusing and fragile, and should
go. Furthermore, it seems that keeping track of the m_builder in scope can be
handled simply with a per-objfile variable.
Fix the hang / segfault by:
- keeping track of the m_builder in scope using a new variable
per_obj->sym_cu, and
- using it in dwarf2_cu::get_builder.
Tested on x86_64-linux (openSUSE Leap 15.2), no regressions for config:
- using default gcc version 7.5.0
(with 5 unexpected FAILs)
- gcc 10.3.0 and target board
unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects
(with 1000 unexpected FAILs)
gdb/ChangeLog:
2021-06-16 Tom de Vries <tdevries@suse.de>
PR symtab/26327
* dwarf2/cu.h (dwarf2_cu::ancestor): Remove.
(dwarf2_cu::get_builder): Declare and move ...
* dwarf2/cu.c (dwarf2_cu::get_builder): ... here. Use sym_cu instead
of ancestor. Assert return value is non-null.
* dwarf2/read.c (read_file_scope): Set per_objfile->sym_cu.
(follow_die_offset, follow_die_sig_1): Remove setting of ancestor.
(dwarf2_per_objfile): Add sym_cu field.
With -fgnat-encodings=minimal, an internal version (these patches will
be upstreamed in the near future) of the Ada compiler can emit DWARF
for an array where the bound comes from a variable, like:
<1><12a7>: Abbrev Number: 7 (DW_TAG_array_type)
<12a8> DW_AT_name : (indirect string, offset: 0x1ae9): pck__my_array
[...]
<2><12b4>: Abbrev Number: 8 (DW_TAG_subrange_type)
<12b5> DW_AT_type : <0x1294>
<12b9> DW_AT_upper_bound : <0x1277>
With the upper bound DIE being:
<1><1277>: Abbrev Number: 2 (DW_TAG_variable)
<1278> DW_AT_name : (indirect string, offset: 0x1a4d): pck__my_length___U
<127c> DW_AT_type : <0x128f>
<1280> DW_AT_external : 1
<1280> DW_AT_artificial : 1
<1280> DW_AT_declaration : 1
Note that the variable is just a declaration -- in this situation, the
variable comes from another compilation unit, and must be found when
trying to compute the array bound.
This patch adds a new PROP_VARIABLE_NAME kind, to enable this search.
This same scenario can occur with DW_OP_GNU_variable_value, so this
patch adds support for that as well.
gdb/ChangeLog
2021-06-04 Tom Tromey <tromey@adacore.com>
* dwarf2/read.h (dwarf2_fetch_die_type_sect_off): Add 'var_name'
parameter.
* dwarf2/loc.c (dwarf2_evaluate_property) <case
PROP_VARIABLE_NAME>: New case.
(compute_var_value): New function.
(sect_variable_value): Use compute_var_value.
* dwarf2/read.c (attr_to_dynamic_prop): Handle DW_TAG_variable.
(var_decl_name): New function.
(dwarf2_fetch_die_type_sect_off): Add 'var_name' parameter.
* gdbtypes.h (enum dynamic_prop_kind) <PROP_VARIABLE_NAME>: New
constant.
(union dynamic_prop_data) <variable_name>: New member.
(struct dynamic_prop) <variable_name, set_variable_name>: New
methods.
gdb/testsuite/ChangeLog
2021-06-04 Tom Tromey <tromey@adacore.com>
* gdb.ada/array_of_symbolic_length.exp: New file.
* gdb.ada/array_of_symbolic_length/foo.adb: New file.
* gdb.ada/array_of_symbolic_length/gl.adb: New file.
* gdb.ada/array_of_symbolic_length/gl.ads: New file.
* gdb.ada/array_of_symbolic_length/pck.adb: New file.
* gdb.ada/array_of_symbolic_length/pck.ads: New file.
All signatured_type constucted (even those used only for lookups in hash
maps) need a signature. Enforce that by passing the signature all the
way to the signatured_type constructor.
gdb/ChangeLog:
* dwarf2/read.h (struct structured_type) <signatured_type>: New.
Update all callers.
(struct dwarf2_per_bfd) <allocate_signatured_type>: Add
signature parameter, update all callers.
* dwar2/read.c (dwarf2_per_bfd::allocate_signatured_type): Add
signature parameter.
Change-Id: I99bc1f88f54127666aa133ddbbabb7f7668fa14a
Add an alias for std::unique_ptr<signatured_type> and use it where
possible.
gdb/ChangeLog:
* dwarf2/read.h (signatured_type_up): New, use where possible.
Change-Id: I5a41e8345551434c8beeb9f269b03bdcf27989be
Move them up before dwarf2_per_bfd, this will allow adding and using
signatured_type_up in the next patch.
gdb/ChangeLog:
* dwarf2/read.h (signatured_type, dwarf2_per_cu_data): Move up.
Change-Id: I85acad4476c8236930b6f9e53ddb8bbbad009e5e
Now that CUs and TUs are stored together in all_comp_units, the
m_num_psymtabs member is no longer needed -- it is always identical to
the length of the vector. This patch removes it.
2021-05-30 Tom Tromey <tom@tromey.com>
* dwarf2/read.h (struct dwarf2_per_bfd) <num_psymtabs,
m_num_psymtabs>: Remove.
(resize_symtabs): Update.
* dwarf2/read.c (dwarf2_per_bfd::allocate_per_cu)
(dwarf2_per_bfd::allocate_signatured_type): Update.
When loading the debug info package
libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug from openSUSE Leap 15.2, we
run into a dwarf error:
...
$ gdb -q -batch libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug
Dwarf Error: Cannot not find DIE at 0x18a936e7 \
[from module libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug]
...
The DIE @ 0x18a936e7 does in fact exist, and is part of a CU @ 0x18a23e52.
No error message is printed when using -readnow.
What happens is the following:
- a dwarf2_per_cu_data P is created for the CU.
- a dwarf2_cu A is created for the same CU.
- another dwarf2_cu B is created for the same CU.
- the dwarf2_cu B is set in per_objfile->m_dwarf2_cus, such that
per_objfile->get_cu (P) returns B.
- P->load_all_dies is set to 1.
- all dies are read into the A->partial_dies htab
- dwarf2_cu A is destroyed.
- we try to find the partial_die for the DIE @ 0x18a936e7 in B->partial_dies.
We can't find it, but do not try to load all dies, because P->load_all_dies
is already set to 1.
- an error message is generated.
The question is why we're creating dwarf2_cu A and B for the same CU.
The dwarf2_cu A is created here:
...
(gdb) bt
#0 dwarf2_cu::dwarf2_cu (this=0x79a9660, per_cu=0x23c0b30,
per_objfile=0x1ad01b0) at dwarf2/cu.c:38
#1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffd040,
this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
#2 0x0000000000676eb3 in process_psymtab_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, want_partial_unit=false,
pretend_language=language_minimal) at dwarf2/read.c:7028
...
And the dwarf2_cu B is created here:
...
(gdb) bt
#0 dwarf2_cu::dwarf2_cu (this=0x885e8c0, per_cu=0x23c0b30,
per_objfile=0x1ad01b0) at dwarf2/cu.c:38
#1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffcc50,
this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
#2 0x0000000000678118 in load_partial_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, existing_cu=0x0) at dwarf2/read.c:7436
#3 0x000000000069721d in find_partial_die (sect_off=(unknown: 0x18a55054),
offset_in_dwz=0, cu=0x0) at dwarf2/read.c:19391
#4 0x000000000069755b in partial_die_info::fixup (this=0x9096900,
cu=0xa6a85f0) at dwarf2/read.c:19512
#5 0x0000000000697586 in partial_die_info::fixup (this=0x8629bb0,
cu=0xa6a85f0) at dwarf2/read.c:19516
#6 0x00000000006787b1 in scan_partial_symbols (first_die=0x8629b40,
lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
at dwarf2/read.c:7563
#7 0x0000000000678878 in scan_partial_symbols (first_die=0x796ebf0,
lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
at dwarf2/read.c:7580
#8 0x0000000000676b82 in process_psymtab_comp_unit_reader
(reader=0x7fffffffd040, info_ptr=0x7fffc1b3f29b, comp_unit_die=0x6ea90f0,
pretend_language=language_minimal) at dwarf2/read.c:6954
#9 0x0000000000676ffd in process_psymtab_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, want_partial_unit=false,
pretend_language=language_minimal) at dwarf2/read.c:7057
...
So in frame #9, a cutu_reader is created with dwarf2_cu A. Then a fixup takes
us to the following CU @ 0x18aa33d6, in frame #5. And a similar fixup in
frame #4 takes us back to CU @ 0x18a23e52. At that point, there's no
information available that we're already trying to read that CU, and we end up
creating another cutu_reader with dwarf2_cu B.
It seems that there are two related problems:
- creating two dwarf2_cu's is not optimal
- the unoptimal case is not handled correctly
This patch addresses the last problem, by moving the load_all_dies flag from
dwarf2_per_cu_data to dwarf2_cu, such that it is paired with the partial_dies
field, which ensures that the two can be kept in sync.
Tested on x86_64-linux.
gdb/ChangeLog:
2021-05-27 Tom de Vries <tdevries@suse.de>
PR symtab/27898
* dwarf2/cu.c (dwarf2_cu::dwarf2_cu): Add load_all_dies init.
* dwarf2/cu.h (dwarf2_cu): Add load_all_dies field.
* dwarf2/read.c (load_partial_dies, find_partial_die): Update.
* dwarf2/read.h (dwarf2_per_cu_data::dwarf2_per_cu_data): Remove
load_all_dies init.
(dwarf2_per_cu_data): Remove load_all_dies field.
Simon pointed out that dwarf2/cu.h and dwarf2/comp-unit.h seemingly
mean the same thing. He suggested renaming the latter to
comp-unit-head.h, which is what this patch does.
gdb/ChangeLog
2021-05-17 Tom Tromey <tom@tromey.com>
* dwarf2/read.h: Update include.
* dwarf2/read.c: Update include.
* dwarf2/line-header.c: Update include.
* dwarf2/cu.h: Update include.
* dwarf2/comp-unit-head.h: Rename from comp-unit.h.
* dwarf2/comp-unit-head.c: Rename from comp-unit.c.
* Makefile.in (COMMON_SFILES): Update.
Address sanitizer pointed out that the patch to use 'delete' for
dwarf2_per_cu_data introduced a bug -- now it is possible to delete a
signatured_type using a pointer to its base class.
This patch fixes the problem by introducing a deleter and a unique_ptr
specialization. A virtual destructor would be more ordinary here, but
it seemed wasteful to add a vtable just for this purpose. If virtual
methods are ever needed here, we can revisit this.
2021-05-17 Tom Tromey <tromey@adacore.com>
* dwarf2/read.h (struct dwarf2_per_cu_data_deleter: New.
(dwarf2_per_cu_data_up): New typedef.
(struct dwarf2_per_bfd) <allocate_per_cu>: Change return type.
<all_comp_units>: Use dwarf2_per_cu_data_up.
* dwarf2/read.c (dwarf2_per_cu_data::operator()): New function.
(dwarf2_per_bfd::allocate_per_cu): Return dwarf2_per_cu_data_up.
(create_cu_from_index_list): Likewise.
(create_signatured_type_table_from_index)
(create_cus_from_debug_names_list, add_type_unit)
(read_comp_units_from_section): Update.
(dwarf2_find_containing_comp_unit): Change type of all_comp_units.
(run_test): Update.
I don't think there is any deep reason to separate CUs and TUs in
dwarf2_per_bfd. This patch removes all_type_units and unifies these
two containers. Some minor tweaks are needed to the index writers,
because both forms of index keep CUs and TUs separate;
Regression tested on x86-63 Fedora 32.
gdb/ChangeLog
2021-04-30 Tom Tromey <tom@tromey.com>
* dwarf2/read.h (struct tu_stats) <nr_tus>: New member.
(struct dwarf2_per_bfd) <get_cutu, get_tu>: Remove
<get_cu>: Now inline.
<all_type_units>: Remove.
* dwarf2/read.c (dwarf2_per_bfd::~dwarf2_per_bfd): Update.
(dwarf2_per_bfd::get_cutu, dwarf2_per_bfd::get_cu)
(dwarf2_per_bfd::get_tu): Remove.
(dwarf2_per_bfd::allocate_signatured_type): Update nr_tus.
(create_signatured_type_table_from_index)
(create_signatured_type_table_from_debug_names)
(dw2_symtab_iter_next, dwarf2_base_index_functions::print_stats)
(dwarf2_base_index_functions::expand_all_symtabs)
(dw2_expand_marked_cus, dw_expand_symtabs_matching_file_matcher)
(dwarf2_base_index_functions::map_symbol_filenames)
(dw2_debug_names_iterator::next, dwarf2_initialize_objfile)
(add_signatured_type_cu_to_table, create_all_type_units)
(add_type_unit, build_type_psymtabs_1, print_tu_stats)
(create_all_comp_units): Update.
* dwarf2/index-write.c (check_dwarf64_offsets, write_gdbindex)
(write_debug_names): Update.
In a patch series I am working on, I'd like to have a non-POD member
in dwarf2_per_cu_data. This currently can't be done because
dwarf2_per_cu_data is allocated on an obstack and initialized with
memset.
This patch changes the DWARF reader to allocate objects of this type
with 'new'. The various "subclasses" of this type (signatured_type in
particular) are now changed to derive from dwarf2_per_cu_data, and
also use 'new' for allocation.
Regression tested on x86-64 Fedora 32.
gdb/ChangeLog
2021-04-30 Tom Tromey <tom@tromey.com>
* dwarf2/read.h (struct dwarf2_per_bfd) <allocate_per_cu,
allocate_signatured_type>: Change return type.
<all_comp_units, all_type_units>: Hold unique pointers.
(struct dwarf2_per_cu_data): Add constructor and initializers.
(struct signatured_type): Derive from dwarf2_per_cu_data.
* dwarf2/read.c (type_unit_group): Derive from
dwarf2_per_cu_data.
(dwarf2_per_bfd::get_cutu, dwarf2_per_bfd::get_cu)
(dwarf2_per_bfd::get_tu)
(dwarf2_per_bfd::allocate_signatured_type)
(dwarf2_per_bfd::allocate_signatured_type)
(create_cu_from_index_list, create_cus_from_index_list)
(create_signatured_type_table_from_index)
(create_signatured_type_table_from_debug_names)
(create_addrmap_from_aranges)
(dwarf2_base_index_functions::find_last_source_symtab)
(dw_expand_symtabs_matching_file_matcher)
(dwarf2_gdb_index::expand_symtabs_matching)
(dwarf2_base_index_functions::map_symbol_filenames)
(create_cus_from_debug_names_list)
(dw2_debug_names_iterator::next)
(dwarf2_debug_names_index::expand_symtabs_matching)
(create_debug_type_hash_table, add_type_unit)
(fill_in_sig_entry_from_dwo_entry, lookup_dwo_signatured_type):
Update.
(allocate_type_unit_groups_table): Use delete.
(create_type_unit_group): Change return type. Use new.
(get_type_unit_group, build_type_psymtabs_1)
(build_type_psymtab_dependencies)
(process_skeletonless_type_unit, set_partial_user)
(dwarf2_build_psymtabs_hard, read_comp_units_from_section)
(create_cus_hash_table, queue_and_load_dwo_tu, follow_die_sig_1)
(read_signatured_type): Update.
(dwarf2_find_containing_comp_unit): Change type of
'all_comp_units'.
(run_test): Update.
(dwarf2_per_bfd::allocate_per_cu)
(dwarf2_per_bfd::allocate_signatured_type): Change return type.
Use new.
(add_signatured_type_cu_to_table): Update.
* dwarf2/index-write.c (write_one_signatured_type)
(check_dwarf64_offsets, psyms_seen_size, write_gdbindex)
(write_debug_names): Update.
While working on some changes to 'info sources' I ran into a situation
where I was seeing the same source files reported twice in the output
of the 'info sources' command when using either .gdb_index or the
.debug_name index.
I traced the problem back to some caching in
dwarf2_base_index_functions::map_symbol_filenames; when called GDB
caches the set of filenames, but, filesnames are not removed as the
index entries are expanded into full symtabs. As a result we can end
up seeing filenames reported both from a full symtab _and_ from
a (stale) previously cached index entry.
Now, obviously, when seeing a problem like this the "correct" fix is
to remove the stale entries from the cache, however, I ran a few
experiments to see why this wasn't really hitting us anywhere, and, as
far as I can tell, ::map_symbol_filenames is only called from three
places:
1. The mi command -file-list-exec-source-files,
2. The 'info sources' command, and
3. Filename completion
However, the result of this "bug" is that we will see duplicate
filenames, and readline's completion mechanism already removes
duplicates, so for case #3 we will never see any problems.
Cases #1 and #2 are basically the same, and in each case, to see a
problem we need to ensure we craft the test in a particular way, start
up ensuring we have some unexpected symtabs, then run one of the
commands to populate the cache, then expand one of the symtabs, and
list the sources again. At this point you'll see duplicate entries in
the results. Hardly surprising we haven't randomly hit this situation
in testing.
So, considering that use cases #1 and #2 are certainly not "high
performance" code (i.e. I don't think these justify the need for
caching) this leaves use case #3. Does this use justify the need for
caching? Well the psymbol_functions::map_symbol_filenames function
doesn't seem to do any extra caching, and within
dwarf2_base_index_functions::map_symbol_filenames, the only expensive
bit appears to be the call to dw2_get_file_names, and this already
does its own caching via this_cu->v.quick->file_names.
The upshot of all this analysis was that I'm not convinced the need
for the additional caching is justified, and so, I propose that to fix
the bug in GDB, I just remove the extra caching (for now).
If we later find that the caching _was_ useful, then we can
reintroduce it, but add it back such that it doesn't reintroduce this
bug.
As I was changing dwarf2_base_index_functions::map_symbol_filenames I
replaced the use of htab_up with std::unordered_set.
Tested using target_boards cc-with-debug-names and dwarf4-gdb-index.
gdb/ChangeLog:
* dwarf2/read.c: Add 'unordered_set' include.
(dwarf2_base_index_functions::map_symbol_filenames): Replace
'visited' hash table with 'qfn_cache' unordered_set. Remove use
of per_Bfd->filenames_cache cache, and use function local
filenames_cache instead. Reindent.
* dwarf2/read.h (struct dwarf2_per_bfd) <filenames_cache>: Delete.
gdb/testsuite/ChangeLog:
* gdb.base/info_sources.exp: Add new tests.
I noticed some holes in struct dwarf2_per_cu_data. This patch
rearranges the type slightly, and shrinks the size of some fields.
This reduces it from 136 bytes to 112 bytes (on x86-64).
I also reduced the size of the DWARF "version" fields in a couple of
spots. It seemed needless to use a short to hold a value that ranges
from 2 to 5, and this also helped the goal of shrinking
dwarf2_per_cu_data.
2021-04-21 Tom Tromey <tom@tromey.com>
* dwarf2/read.h (struct dwarf2_per_cu_data) <dwarf_version>: Now
unsigned char.
(struct dwarf2_per_cu_data): Rearrange.
* dwarf2/comp-unit.h (struct comp_unit_head) <version>: Now
unsigned char.
(struct comp_unit_head): Rearrange.
* dwarf2/comp-unit.c (read_comp_unit_head): Update.
Since partial_symtab is supposed to be objfile-independent (since series
[1]), I think it would make sense for partial_symtab to not take an
objfile as a parameter in its constructor.
This patch replaces that parameter with an objfile_per_bfd_storage
parameter.
The objfile is used for two things:
- to get the objfile_name, for debug messages. We can get that name
from the bfd instead.
- to intern the partial symtab filename. Even though it goes through
an objfile method, the request is actually forwarded to the
underlying objfile_per_bfd_storage. So we can ask the new
objfile_per_bfd_storage instead.
In order to get a reference to the BFD from the objfile_per_bfd_storage,
the BFD is saved in the objfile_per_bfd_storage object.
[1] https://sourceware.org/pipermail/gdb-patches/2021-February/176625.html
gdb/ChangeLog:
* psympriv.h (struct partial_symtab) <partial_symtab>: Change
objfile parameter for objfile_per_bfd_storage, adjust callers.
(struct standard_psymtab) <standard_psymtab>: Likewise.
(struct legacy_psymtab) <legacy_psymtab>: Likewise.
* psymtab.c (partial_symtab::partial_symtab): Likewise.
* ctfread.c (struct ctf_psymtab): Likewise.
* dwarf2/read.h (struct dwarf2_psymtab): Likewise.
* dwarf2/read.c (struct dwarf2_include_psymtab): Likewise.
(dwarf2_create_include_psymtab): Likewise.
* objfiles.h (struct objfile_per_bfd_storage)
<objfile_per_bfd_storage>: Add bfd parameter, adjust callers.
<get_bfd>: New method.
<m_bfd>: New field.
* objfiles.c (get_objfile_bfd_data): Adjust.
Change-Id: I2ed3ab5d2e6f27d034bd4dc26ae2fae7b0b8a2b9
Currently the DWARF index readers reuse the objfile's partial symbol
table in order to store an addrmap. We're going to be remove the
partial symbol object, so this patch changes the DWARF reader to store
this addrmap in the per_bfd object. This object is chosen, rather
than the quick_symbol_functions subclass, because the addrmap can be
shared across objfiles.
gdb/ChangeLog
2021-03-20 Tom Tromey <tom@tromey.com>
* dwarf2/read.h (struct dwarf2_per_bfd) <psymtabs_addrmap>: New
member.
* dwarf2/read.c (create_addrmap_from_index)
(create_addrmap_from_aranges): Set per_bfd addrmap.
(dwarf2_read_gdb_index): Don't set partial_symtabs.
(dwarf2_base_index_functions::find_pc_sect_compunit_symtab): Use
per_bfd addrmap.
(dwarf2_read_debug_names): Don't set partial_symtabs.
(dwarf2_initialize_objfile): Likewise.
This moves a bit of the DWARF-specific code out of symfile.h and into
dwarf2/read.h.
gdb/ChangeLog
2021-03-20 Tom Tromey <tom@tromey.com>
* symfile.h (enum dwarf2_section_enum)
(dwarf2_get_section_info): Move to dwarf2/read.h.
* dwarf2/read.h (enum dwarf2_section_enum)
(dwarf2_get_section_info): Move from symfile.h.
This moves dwarf2_get_dwz_file and some helper code to dwarf2/dwz.h.
The main benefit of this is just shrinking dwarf2/read.c a little bit.
gdb/ChangeLog
2021-03-06 Tom Tromey <tom@tromey.com>
* dwarf2/sect-names.h (dwarf2_elf_names): Declare.
* dwarf2/read.h (dwarf2_get_dwz_file): Move to dwz.h.
* dwarf2/read.c (dwarf2_elf_names): No longer static.
(locate_dwz_sections, dwz_search_other_debugdirs)
(dwarf2_get_dwz_file): Move to dwz.c.
* dwarf2/dwz.h (dwarf2_get_dwz_file): Move declaration from
read.h.
* dwarf2/dwz.c (locate_dwz_sections, dwz_search_other_debugdirs)
(dwarf2_get_dwz_file): Move from read.c.