13 Commits

Author SHA1 Message Date
3bfdcabbc2 [gdb] Fix more typos
Fix some more typos:
- distinquish -> distinguish
- actualy -> actually
- singe -> single
- frash -> frame
- chid -> child
- dissassembler -> disassembler
- uninitalized -> uninitialized
- precontidion -> precondition
- regsiters -> registers
- marge -> merge
- sate -> state
- garanteed -> guaranteed
- explictly -> explicitly
- prefices (nonstandard plural) -> prefixes
- bondary -> boundary
- formated -> formatted
- ithe -> the
- arrav -> array
- coresponding -> corresponding
- owend -> owned
- fials -> fails
- diasm -> disasm
- ture -> true
- tpye -> type

There's one code change, the name of macro SIG_CODE_BONDARY_FAULT changed to
SIG_CODE_BOUNDARY_FAULT.

Tested on x86_64-linux.
2023-06-05 12:53:15 +02:00
4de4e48514 gdb/python: extend the Python Disassembler API to allow for styling
This commit extends the Python Disassembler API to allow for styling
of the instructions.

Before this commit the Python Disassembler API allowed the user to do
two things:

  - They could intercept instruction disassembly requests and return a
    string of their choosing, this string then became the disassembled
    instruction, or

  - They could call builtin_disassemble, which would call back into
    libopcode to perform the disassembly.  As libopcode printed the
    instruction GDB would collect these print requests and build a
    string.  This string was then returned from the builtin_disassemble
    call, and the user could modify or extend this string as needed.

Neither of these approaches allowed for, or preserved, disassembler
styling, which is now available within libopcodes for many of the more
popular architectures GDB supports.

This commit aims to fill this gap.  After this commit a user will be
able to do the following things:

  - Implement a custom instruction disassembler entirely in Python
    without calling back into libopcodes, the custom disassembler will
    be able to return styling information such that GDB will display
    the instruction fully styled.  All of GDB's existing style
    settings will affect how instructions coming from the Python
    disassembler are displayed in the expected manner.

  - Call builtin_disassemble and receive a result that represents how
    libopcode would like the instruction styled.  The user can then
    adjust or extend the disassembled instruction before returning the
    result to GDB.  Again, the instruction will be styled as expected.

To achieve this I will add two new classes to GDB,
DisassemblerTextPart and DisassemblerAddressPart.

Within builtin_disassemble, instead of capturing the print calls from
libopcodes and building a single string, we will now create either a
text part or address part and store these parts in a vector.

The DisassemblerTextPart will capture a small piece of text along with
the associated style that should be used to display the text.  This
corresponds to the disassembler calling
disassemble_info::fprintf_styled_func, or for disassemblers that don't
support styling disassemble_info::fprintf_func.

The DisassemblerAddressPart is used when libopcodes requests that an
address be printed, and takes care of printing the address and
associated symbol, this corresponds to the disassembler calling
disassemble_info::print_address_func.

These parts are then placed within the DisassemblerResult when
builtin_disassemble returns.

Alternatively, the user can directly create parts by calling two new
methods on the DisassembleInfo class: DisassembleInfo.text_part and
DisassembleInfo.address_part.

Having created these parts the user can then pass these parts when
initializing a new DisassemblerResult object.

Finally, when we return from Python to gdbpy_print_insn, one way or
another, the result being returned will have a list of parts.  Back in
GDB's C++ code we walk the list of parts and call back into GDB's core
to display the disassembled instruction with the correct styling.

The new API lives in parallel with the old API.  Any existing code
that creates a DisassemblerResult using a single string immediately
creates a single DisassemblerTextPart containing the entire
instruction and gives this part the default text style.  This is also
what happens if the user calls builtin_disassemble for an architecture
that doesn't (yet) support libopcode styling.

This matches up with what happens when the Python API is not involved,
an architecture without disassembler styling support uses the old
libopcodes printing API (the API that doesn't pass style info), and
GDB just prints everything using the default text style.

The reason that parts are created by calling methods on
DisassembleInfo, rather than calling the class constructor directly,
is DisassemblerAddressPart.  Ideally this part would only hold the
address which the part represents, but in order to support backwards
compatibility we need to be able to convert the
DisassemblerAddressPart into a string.  To do that we need to call
GDB's internal print_address function, and to do that we need an
gdbarch.

What this means is that the DisassemblerAddressPart needs to take a
gdb.Architecture object at creation time.  The only valid place a user
can pull this from is from the DisassembleInfo object, so having the
DisassembleInfo act as a factory ensures that the correct gdbarch is
passed over each time.  I implemented both solutions (the one
presented here, and an alternative where parts could be constructed
directly), and this felt like the cleanest solution.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
2023-05-16 10:30:47 +01:00
0af2f23333 gdb/python: rework how the disassembler API reads the result object
This commit is a refactor ahead of the next change which will make
disassembler styling available through the Python API.

Unfortunately, in order to make the styling support available, I think
the easiest solution is to make a very small change to the existing
API.

The current API relies on returning a DisassemblerResult object to
represent each disassembled instruction.  Currently GDB allows the
DisassemblerResult class to be sub-classed, which could mean that a
user tries to override the various attributes that exist on the
DisassemblerResult object.

This commit removes this ability, effectively making the
DisassemblerResult class final.

Though this is a change to the existing API, I'm hoping this isn't
going to cause too many issues:

  - The Python disassembler API was only added in the previous release
    of GDB, so I don't expect it to be widely used yet, and

  - It's not clear to me why a user would need to sub-class the
    DisassemblerResult type, I allowed it in the original patch
    because at the time I couldn't see any reason to NOT allow it.

Having prevented sub-classing I can now rework the tail end of the
gdbpy_print_insn function; instead of pulling the results out of the
DisassemblerResult object by calling back into Python, I now cast the
Python object back to its C++ type (disasm_result_object), and access
the fields directly from there.  In later commits I will be reworking
the disasm_result_object type in order to hold information about the
styled disassembler output.

The tests that dealt with sub-classing DisassemblerResult have been
removed, and a new test that confirms that DisassemblerResult can't be
sub-classed has been added.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
2023-05-16 10:30:46 +01:00
6a66780739 gdb/python: implement DisassemblerResult.__str__ method
Add the DisassemblerResult.__str__ method.  This gives the same result
as the DisassemblerResult.string attribute, but can be useful
sometimes depending on how the user is trying to print the object.

There's a test for the new functionality.
2023-05-12 18:24:24 +01:00
15ccb5e393 gdb/python: implement __repr__ methods for py-disasm.c types
Add a __repr__ method for the DisassembleInfo and DisassemblerResult
types, and add some tests for these new methods.
2023-05-12 18:24:24 +01:00
3965bff5b9 gdb/python: add mechanism to manage Python initialization functions
Currently, when we add a new python sub-system to GDB,
e.g. py-inferior.c, we end up having to create a new function like
gdbpy_initialize_inferior, which then has to be called from the
function do_start_initialization in python.c.

In some cases (py-micmd.c and py-tui.c), we have two functions
gdbpy_initialize_*, and gdbpy_finalize_*, with the second being called
from finalize_python which is also in python.c.

This commit proposes a mechanism to manage these initialization and
finalization calls, this means that adding a new Python subsystem will
no longer require changes to python.c or python-internal.h, instead,
the initialization and finalization functions will be registered
directly from the sub-system file, e.g. py-inferior.c, or py-micmd.c.

The initialization and finalization functions are managed through a
new class gdbpy_initialize_file in python-internal.h.  This class
contains a single global vector of all the initialization and
finalization functions.

In each Python sub-system we create a new gdbpy_initialize_file
object, the object constructor takes care of registering the two
callback functions.

Now from python.c we can call static functions on the
gdbpy_initialize_file class which take care of walking the callback
list and invoking each callback in turn.

To slightly simplify the Python sub-system files I added a new macro
GDBPY_INITIALIZE_FILE, which hides the need to create an object.  We
can now just do this:

  GDBPY_INITIALIZE_FILE (gdbpy_initialize_registers);

One possible problem with this change is that there is now no
guaranteed ordering of how the various sub-systems are initialized (or
finalized).  To try and avoid dependencies creeping in I have added a
use of the environment variable GDB_REVERSE_INIT_FUNCTIONS, this is
the same environment variable used in the generated init.c file.

Just like with init.c, when this environment variable is set we
reverse the list of Python initialization (and finalization)
functions.  As there is already a test that starts GDB with the
environment variable set then this should offer some level of
protection against dependencies creeping in - though for full
protection I guess we'd need to run all gdb.python/*.exp tests with
the variable set.

I have tested this patch with the environment variable set, and saw no
regressions, so I think we are fine right now.

One other change of note was for gdbpy_initialize_gdb_readline, this
function previously returned void.  In order to make this function
have the correct signature I've updated its return type to int, and we
now return 0 to indicate success.

All of the other initialize (and finalize) functions have been made
static within their respective sub-system files.

There should be no user visible changes after this commit.
2023-05-05 18:24:42 +01:00
6a208145d2 gdb/python: replace strlen call with std::string::size call
Small cleanup to use std::string::size instead of calling strlen on
the result of std::string::c_str.

Should be no user visible changes after this call.
2023-03-03 09:56:21 +00:00
83b6e1f1c5 gdb: remove language.h include from frame.h
This helps resolve some cyclic include problem later in the series.
The only language-related thing frame.h needs is enum language, and that
is in defs.h.

Doing so reveals that a bunch of files were relying on frame.h to
include language.h, so fix the fallouts here and there.

Change-Id: I178a7efec1953c2d088adb58483bade1f349b705
Reviewed-By: Bruno Larsen <blarsen@redhat.com>
2023-01-20 14:48:56 -05:00
213516ef31 Update copyright year range in header of all files managed by GDB
This commit is the result of running the gdb/copyright.py script,
which automated the update of the copyright year range for all
source files managed by the GDB project to be updated to include
year 2023.
2023-01-01 17:01:16 +04:00
8eb7d135e3 gdb/disasm: mark functions passed to the disassembler noexcept
While working on another patch, Simon pointed out that GDB could be
improved by marking the functions passed to the disassembler as
noexcept.

  https://sourceware.org/pipermail/gdb-patches/2022-October/193084.html

The reason this is important is the on some hosts, libopcodes, being C
code, will not be compiled with support for handling exceptions.  As
such, an attempt to throw an exception over libopcodes code will cause
GDB to terminate.

See bug gdb/29712 for an example of when this happened.

In this commit all the functions that are passed to the disassembler,
and which might be used as callbacks by libopcodes are marked
noexcept.

Ideally, I would have liked to change these typedefs:

  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
  using memory_error_ftype = decltype (disassemble_info::memory_error_func);
  using print_address_ftype = decltype (disassemble_info::print_address_func);
  using fprintf_ftype = decltype (disassemble_info::fprintf_func);
  using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);

which are declared in disasm.h, as including the noexcept keyword.
However, when I tried this, I ran into this warning/error:

  In file included from ../../src/gdb/disasm.c:25:
  ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
  ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
    116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~

So I've left that change out.  This does mean that if somebody adds a
new use of the disassembler classes in the future, and forgets to mark
the callbacks as noexcept, this will compile fine.  We'll just have to
manually check for that during review.
2022-11-28 19:23:30 +00:00
65639fcc54 gdb/python: avoid throwing an exception over libopcodes code
Bug gdb/29712 identifies a problem with the Python disassembler API.
In some cases GDB will try to throw an exception through the
libopcodes disassembler code, however, not all targets include
exception unwind information when compiling C code, for targets that
don't include this information GDB will terminate when trying to pass
the exception through libopcodes.

To explain what GDB is trying to do, consider the following trivial
use of the Python disassembler API:

  class ExampleDisassembler(gdb.disassembler.Disassembler):

      class MyInfo(gdb.disassembler.DisassembleInfo):
          def __init__(self, info):
              super().__init__(info)

          def read_memory(self, length, offset):
              return super().read_memory(length, offset)

      def __init__(self):
          super().__init__("ExampleDisassembler")

      def __call__(self, info):
          info = self.MyInfo(info)
          return gdb.disassembler.builtin_disassemble(info)

This disassembler doesn't add any value, it defers back to GDB to do
all the actual work, but it serves to allow us to discuss the problem.

The problem occurs when a Python exception is raised by the
MyInfo.read_memory method.  The MyInfo.read_memory method is called
from the C++ function gdbpy_disassembler::read_memory_func.  The C++
stack at the point this function is called looks like this:

  #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
  #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
  #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
  #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
  #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
  #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
  #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
  #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
  #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324

  ### ... snip lots of frames as we pass through Python itself ...

  #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
  #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
  #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
  #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101

So gdbpy_disassembler::read_memory_func is called from the libopcodes
disassembler to read memory, this C++ function then calls into user
supplied Python code to do the work.

If the user supplied Python code raises an gdb.MemoryError exception
indicating the memory read failed, this is fine.  The C++ code
converts this exception back into a return value that libopcodes can
understand, and returns to libopcodes.

However, if the user supplied Python code raises some other exception,
what we want is for this exception to propagate through GDB and appear
as if raised by the call to gdb.disassembler.builtin_disassemble.  To
achieve this, when gdbpy_disassembler::read_memory_func spots an
unknown Python exception, we must pass the information about this
exception from frame #0 to frame #8 in the above backtrace.  Frame #8
is the C++ implementation of gdb.disassembler.builtin_disassemble, and
so it is this function that we want to re-raise the unknown Python
exception, so the user can, if they want, catch the exception in their
code.

The previous mechanism by which the exception was passed was to pack
the details of the Python exception into a C++ exception, then throw
the exception from frame #0, and catch the exception in frame #8,
unpack the details of the Python exception, and re-raise it.

However, this relies on the exception passing through frames #1 to #7,
some of which are in libopcodes, which is C code, and so, might not be
compiled with exception support.

This commit proposes an alternative solution that does not rely on
throwing a C++ exception.

When we spot an unhandled Python exception in frame #0, we will store
the details of this exception within the gdbpy_disassembler object
currently in use.  Then we return to libopcodes a value indicating
that the memory_read failed.

libopcodes will now continue to disassemble as though that memory read
failed (with one special case described below), then, when we
eventually return to disasmpy_builtin_disassemble we check to see if
there is an exception stored in the gdbpy_disassembler object.  If
there is then this exception can immediately be installed, and then we
return back to Python, when the user will be able to catch the
exception.

There is one extra change in gdbpy_disassembler::read_memory_func.
After the first call that results in an exception being stored on the
gdbpy_disassembler object, any future calls to the ::read_memory_func
function will immediately return as if the read failed.  This avoids
any additional calls into user supplied Python code.

My thinking here is that should the first call fail with some unknown
error, GDB should not keep trying with any additional calls.  This
maintains the illusion that the exception raised from
MyInfo.read_memory is immediately raised by
gdb.disassembler.builtin_disassemble.  I have no tests for this change
though - to trigger this issue would rely on a libopcodes disassembler
that will try to read further memory even after the first failed
read.  I'm not aware of any such disassembler that currently does
this, but that doesn't mean such a disassembler couldn't exist in the
future.

With this change in place the gdb.python/py-disasm.exp test should now
pass on AArch64.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29712

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2022-11-28 19:23:30 +00:00
23aa2befce gdb/python: fix invalid use disassemble_info::stream
After this commit:

  commit 81384924cdcc9eb2676dd9084b76845d7d0e0759
  Date:   Tue Apr 5 11:06:16 2022 +0100

      gdb: have gdb_disassemble_info carry 'this' in its stream pointer

The disassemble_info::stream field will no longer be a ui_file*.  That
commit failed to update one location in py-disasm.c though.

While running some tests using the Python disassembler API, I
triggered a call to gdbpy_disassembler::print_address_func, and, as I
had compiled GDB with the undefined behaviour sanitizer, GDB crashed
as the code currently (incorrectly) casts the stream field to be a
ui_file*.

In this commit I fix this error.

In order to test this case I had to tweak the existing test case a
little.  I also spotted some debug printf statements in py-disasm.py,
which I have removed.
2022-07-25 19:26:24 +01:00
15e15b2d9c gdb/python: implement the print_insn extension language hook
This commit extends the Python API to include disassembler support.

The motivation for this commit was to provide an API by which the user
could write Python scripts that would augment the output of the
disassembler.

To achieve this I have followed the model of the existing libopcodes
disassembler, that is, instructions are disassembled one by one.  This
does restrict the type of things that it is possible to do from a
Python script, i.e. all additional output has to fit on a single line,
but this was all I needed, and creating something more complex would,
I think, require greater changes to how GDB's internal disassembler
operates.

The disassembler API is contained in the new gdb.disassembler module,
which defines the following classes:

  DisassembleInfo

      Similar to libopcodes disassemble_info structure, has read-only
  properties: address, architecture, and progspace.  And has methods:
  __init__, read_memory, and is_valid.

      Each time GDB wants an instruction disassembled, an instance of
  this class is passed to a user written disassembler function, by
  reading the properties, and calling the methods (and other support
  methods in the gdb.disassembler module) the user can perform and
  return the disassembly.

  Disassembler

      This is a base-class which user written disassemblers should
  inherit from.  This base class provides base implementations of
  __init__ and __call__ which the user written disassembler should
  override.

  DisassemblerResult

      This class can be used to hold the result of a call to the
  disassembler, it's really just a wrapper around a string (the text
  of the disassembled instruction) and a length (in bytes).  The user
  can return an instance of this class from Disassembler.__call__ to
  represent the newly disassembled instruction.

The gdb.disassembler module also provides the following functions:

  register_disassembler

      This function registers an instance of a Disassembler sub-class
  as a disassembler, either for one specific architecture, or, as a
  global disassembler for all architectures.

  builtin_disassemble

      This provides access to GDB's builtin disassembler.  A common
  use case that I see is augmenting the existing disassembler output.
  The user code can call this function to have GDB disassemble the
  instruction in the normal way.  The user gets back a
  DisassemblerResult object, which they can then read in order to
  augment the disassembler output in any way they wish.

      This function also provides a mechanism to intercept the
  disassemblers reads of memory, thus the user can adjust what GDB
  sees when it is disassembling.

The included documentation provides a more detailed description of the
API.

There is also a new CLI command added:

  maint info python-disassemblers

This command is defined in the Python gdb.disassemblers module, and
can be used to list the currently registered Python disassemblers.
2022-06-15 09:44:54 +01:00