dwarf2_cu has a 'language' value, but dwarf2_per_cu_data also holds a
value of this same type. There doesn't seem to be any reason to keep
two copies of this value. This patch removes the field from
dwarf2_cu, and arranges to set the value in the per-CU object instead.
Note that the value must still be set when expanding the full CU.
This is needed because the CUs will not be scanned when a DWARF index
is in use.
gdb/ChangeLog
2021-06-25 Tom Tromey <tom@tromey.com>
* dwarf2/read.c (process_psymtab_comp_unit): Don't set 'lang'.
(scan_partial_symbols, partial_die_parent_scope)
(add_partial_symbol, add_partial_subprogram)
(compute_delayed_physnames, rust_union_quirks)
(process_full_comp_unit, process_full_type_unit)
(process_imported_unit_die, process_die, dw2_linkage_name)
(dwarf2_compute_name, dwarf2_physname, read_import_statement)
(read_file_scope, queue_and_load_dwo_tu, read_func_scope)
(read_variable, dwarf2_get_subprogram_pc_bounds)
(dwarf2_attach_fields_to_type, dwarf2_add_member_fn)
(dwarf2_attach_fn_fields_to_type)
(quirk_ada_thick_pointer_struct, read_structure_type)
(handle_struct_member_die, process_structure_scope)
(read_array_type, read_array_order, prototyped_function_p)
(read_subroutine_type, dwarf2_init_complex_target_type)
(read_base_type, read_subrange_type, read_unspecified_type)
(load_partial_dies, partial_die_info::fixup, set_cu_language)
(new_symbol, need_gnat_info, determine_prefix, typename_concat)
(dwarf2_canonicalize_name, follow_die_offset)
(prepare_one_comp_unit): Update.
* dwarf2/cu.c (dwarf2_cu::start_symtab): Update.
This is another attempt at fixing the problem described in commit 4cf88725da1
"[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder()", which was
reverted in commit 3db19b2d724.
First off, some context.
A DWARF CU can be viewed as a symbol table: toplevel children of a CU DIE
represent symbol table entries for that CU. Furthermore, there is a
hierarchy: a symbol table entry such as a function itself has a symbol table
containing parameters and local variables.
The dwarf reader maintains a notion of current symbol table (that is: the
symbol table a new symbol needs to be entered into) in dwarf2_cu member
list_in_scope.
A problem then presents itself when reading inter-CU references:
- a new symbol read from a CU B needs to be entered into the symbol table of
another CU A.
- the notion of current symbol table is tracked on a per-CU basis.
This is addressed in inherit_abstract_dies by temporarily overwriting the
list_in_scope for CU B with the one for CU A.
The current symbol table is one aspect of the current dwarf reader context
that is tracked, but there are more, f.i. ones that are tracked via the
dwarf2_cu member m_builder, f.i. m_builder->m_local_using_directives.
A similar problem exists in relation to inter-CU references, but a different
solution was chosen:
- to keep track of an ancestor field in dwarf2_cu, which is updated
when traversing inter-CU references, and
- to use the ancestor field in dwarf2_cu::get_builder to return the m_builder
in scope.
There is no actual concept of a CU having an ancestor, it just marks the most
recent CU from which a CU was inter-CU-referenced. Consequently, when
following inter-CU references from a CU A to another CU B and back to CU A,
the ancestors form a cycle, which causes dwarf2_cu::get_builder to hang or
segfault, as reported in PR26327.
ISTM that the ancestor implementation is confusing and fragile, and should
go. Furthermore, it seems that keeping track of the m_builder in scope can be
handled simply with a per-objfile variable.
Fix the hang / segfault by:
- keeping track of the m_builder in scope using a new variable
per_obj->sym_cu, and
- using it in dwarf2_cu::get_builder.
Tested on x86_64-linux (openSUSE Leap 15.2), no regressions for config:
- using default gcc version 7.5.0
(with 5 unexpected FAILs)
- gcc 10.3.0 and target board
unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects
(with 1000 unexpected FAILs)
gdb/ChangeLog:
2021-06-16 Tom de Vries <tdevries@suse.de>
PR symtab/26327
* dwarf2/cu.h (dwarf2_cu::ancestor): Remove.
(dwarf2_cu::get_builder): Declare and move ...
* dwarf2/cu.c (dwarf2_cu::get_builder): ... here. Use sym_cu instead
of ancestor. Assert return value is non-null.
* dwarf2/read.c (read_file_scope): Set per_objfile->sym_cu.
(follow_die_offset, follow_die_sig_1): Remove setting of ancestor.
(dwarf2_per_objfile): Add sym_cu field.
When loading the debug info package
libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug from openSUSE Leap 15.2, we
run into a dwarf error:
...
$ gdb -q -batch libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug
Dwarf Error: Cannot not find DIE at 0x18a936e7 \
[from module libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug]
...
The DIE @ 0x18a936e7 does in fact exist, and is part of a CU @ 0x18a23e52.
No error message is printed when using -readnow.
What happens is the following:
- a dwarf2_per_cu_data P is created for the CU.
- a dwarf2_cu A is created for the same CU.
- another dwarf2_cu B is created for the same CU.
- the dwarf2_cu B is set in per_objfile->m_dwarf2_cus, such that
per_objfile->get_cu (P) returns B.
- P->load_all_dies is set to 1.
- all dies are read into the A->partial_dies htab
- dwarf2_cu A is destroyed.
- we try to find the partial_die for the DIE @ 0x18a936e7 in B->partial_dies.
We can't find it, but do not try to load all dies, because P->load_all_dies
is already set to 1.
- an error message is generated.
The question is why we're creating dwarf2_cu A and B for the same CU.
The dwarf2_cu A is created here:
...
(gdb) bt
#0 dwarf2_cu::dwarf2_cu (this=0x79a9660, per_cu=0x23c0b30,
per_objfile=0x1ad01b0) at dwarf2/cu.c:38
#1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffd040,
this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
#2 0x0000000000676eb3 in process_psymtab_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, want_partial_unit=false,
pretend_language=language_minimal) at dwarf2/read.c:7028
...
And the dwarf2_cu B is created here:
...
(gdb) bt
#0 dwarf2_cu::dwarf2_cu (this=0x885e8c0, per_cu=0x23c0b30,
per_objfile=0x1ad01b0) at dwarf2/cu.c:38
#1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffcc50,
this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
#2 0x0000000000678118 in load_partial_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, existing_cu=0x0) at dwarf2/read.c:7436
#3 0x000000000069721d in find_partial_die (sect_off=(unknown: 0x18a55054),
offset_in_dwz=0, cu=0x0) at dwarf2/read.c:19391
#4 0x000000000069755b in partial_die_info::fixup (this=0x9096900,
cu=0xa6a85f0) at dwarf2/read.c:19512
#5 0x0000000000697586 in partial_die_info::fixup (this=0x8629bb0,
cu=0xa6a85f0) at dwarf2/read.c:19516
#6 0x00000000006787b1 in scan_partial_symbols (first_die=0x8629b40,
lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
at dwarf2/read.c:7563
#7 0x0000000000678878 in scan_partial_symbols (first_die=0x796ebf0,
lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
at dwarf2/read.c:7580
#8 0x0000000000676b82 in process_psymtab_comp_unit_reader
(reader=0x7fffffffd040, info_ptr=0x7fffc1b3f29b, comp_unit_die=0x6ea90f0,
pretend_language=language_minimal) at dwarf2/read.c:6954
#9 0x0000000000676ffd in process_psymtab_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, want_partial_unit=false,
pretend_language=language_minimal) at dwarf2/read.c:7057
...
So in frame #9, a cutu_reader is created with dwarf2_cu A. Then a fixup takes
us to the following CU @ 0x18aa33d6, in frame #5. And a similar fixup in
frame #4 takes us back to CU @ 0x18a23e52. At that point, there's no
information available that we're already trying to read that CU, and we end up
creating another cutu_reader with dwarf2_cu B.
It seems that there are two related problems:
- creating two dwarf2_cu's is not optimal
- the unoptimal case is not handled correctly
This patch addresses the last problem, by moving the load_all_dies flag from
dwarf2_per_cu_data to dwarf2_cu, such that it is paired with the partial_dies
field, which ensures that the two can be kept in sync.
Tested on x86_64-linux.
gdb/ChangeLog:
2021-05-27 Tom de Vries <tdevries@suse.de>
PR symtab/27898
* dwarf2/cu.c (dwarf2_cu::dwarf2_cu): Add load_all_dies init.
* dwarf2/cu.h (dwarf2_cu): Add load_all_dies field.
* dwarf2/read.c (load_partial_dies, find_partial_die): Update.
* dwarf2/read.h (dwarf2_per_cu_data::dwarf2_per_cu_data): Remove
load_all_dies init.
(dwarf2_per_cu_data): Remove load_all_dies field.
Simon pointed out that dwarf2/cu.h and dwarf2/comp-unit.h seemingly
mean the same thing. He suggested renaming the latter to
comp-unit-head.h, which is what this patch does.
gdb/ChangeLog
2021-05-17 Tom Tromey <tom@tromey.com>
* dwarf2/read.h: Update include.
* dwarf2/read.c: Update include.
* dwarf2/line-header.c: Update include.
* dwarf2/cu.h: Update include.
* dwarf2/comp-unit-head.h: Rename from comp-unit.h.
* dwarf2/comp-unit-head.c: Rename from comp-unit.c.
* Makefile.in (COMMON_SFILES): Update.
This moves dwarf2_cu and one supporting data structure to a new header
file. The main goal, as always with this kind of change, is to make
the DWARF reader a bit more understandable.
gdb/ChangeLog
2021-05-17 Tom Tromey <tom@tromey.com>
* Makefile.in (HFILES_NO_SRCDIR): Add dwarf2/cu.h.
* dwarf2/read.c (struct delayed_method_info, struct dwarf2_cu):
Move to cu.h.
* dwarf2/cu.h: New file.