PowerPC PLT stub tidy

This is in preparation for the next patch adding Spectre variant 2
mitigation for PowerPC and PowerPC64.  Besides tidying code involved
in stub output (to reduce the number of places where bctr is output),
the patch adds some user visible features:

1) PowerPC64 ELFv2 global entry stubs are now aligned under the
   control of --plt-align, with a default alignment of 32 bytes.
2) PowerPC64 __glink_PLTresolve is no longer padded out with nops.
3) PowerPC32 PLT stubs are aligned under the control of --plt-align,
   with the default alignment being 16 bytes as before.
4) The PowerPC32 branch/nop table emitted before __glink_PLTresolve
   is now smaller in many cases.  It was sized incorrectly when the
   __tls_get_addr_opt stub was used, and unnecessarily included space
   for local ifuncs.
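
As a rough illustration of the alignment rule behind items 1 and 3, the
following standalone sketch mirrors the arithmetic in the gold hunks of this
patch (Stub_table::stub_align and plt_call_align).  The free functions and
their parameters are hypothetical stand-ins for the linker's option queries,
not gold's API, and the 4-byte PowerPC32 default shown is gold's; the 16-byte
PowerPC32 default in item 3 describes ld.

```cpp
// Round BYTES up to the next multiple of ALIGN, which must be a power
// of two.  Equivalent to the patch's "(bytes + align - 1) & -align".
constexpr unsigned int
align_up(unsigned int bytes, unsigned int align)
{
  return (bytes + align - 1) & ~(align - 1);
}

// Hypothetical stand-in for Stub_table::stub_align.  SIZE is 32 or 64,
// USER_SET says whether --plt-align was given, and P2ALIGN is its
// power-of-two argument.  A user value never drops below 4 bytes,
// the size of one instruction.
constexpr unsigned int
stub_align(int size, bool user_set, unsigned int p2align)
{
  return !user_set
    ? (size == 64 ? 32u : 4u)
    : ((1u << p2align) > 4u ? (1u << p2align) : 4u);
}
```

With these definitions a 20-byte stub placed under --plt-align=4 occupies
align_up(20, stub_align(32, true, 4)) bytes, that is, one 32-byte slot.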

bfd/
	* elf32-ppc.c (GLINK_ENTRY_SIZE): Add parameters, handle
	__tls_get_addr_opt, and alignment sizing.
	(TLS_GET_ADDR_GLINK_SIZE): Delete.
	(is_nonpic_glink_stub): Don't use GLINK_ENTRY_SIZE.
	(ppc_elf_get_synthetic_symtab): Recognize stubs spaced at 4, 6,
	or 8 insns.
	(ppc_elf_link_hash_table_create): Init new ppc_elf_params field.
	(allocate_dynrelocs): Use new GLINK_ENTRY_SIZE.
	(ppc_elf_size_dynamic_sections): Likewise.  Size branch table
	by PLT reloc count.
	(write_glink_stub): Handle __tls_get_addr_opt stub.
	Pad out to size given by GLINK_ENTRY_SIZE.
	(ppc_elf_relocate_section): Adjust write_glink_stub call.
	(ppc_elf_finish_dynamic_symbol): Likewise.
	(ppc_elf_finish_dynamic_sections): Write PLTresolve without using
	insn array since so many need rewriting.
	* elf32-ppc.h (struct ppc_elf_params): Add plt_stub_align.
	* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Rename from
	GLINK_CALL_STUB_SIZE.  Add htab param and evaluate to size without
	nops.  Adjust all uses.
	(ppc64_elf_get_synthetic_symtab): Don't use GLINK_CALL_STUB_SIZE
	in glink_vma calculation.
	(struct ppc_link_hash_table): Add global_entry section pointer.
	(create_linkage_sections): Create separate section for global
	entry stubs.
	(PPC_LO, PPC_HI, PPC_HA): Move earlier.
	(size_global_entry_stubs): Handle sizing for aligned stubs.
	(ppc64_elf_size_dynamic_sections): Handle global_entry alloc,
	and don't stash end of glink branch table in rawsize.
	(ppc_build_one_stub): Rewrite stub size calculations.
	(build_global_entry_stubs): Use new section.
	(ppc64_elf_build_stubs): Don't pad __glink_PLTresolve with nops.
	Build lazy link stubs out to end of section.  Build global entry
	stubs in new section.
gold/
	* options.h (plt_align): Support for PowerPC32 too.
	* powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
	(Stub_table::plt_call_size, branch_stub_size): Tidy.
	(Stub_table::plt_call_align): Implement using stub_align.
	(Output_data_glink::global_entry_align): New function.
	(Output_data_glink::global_entry_off): New function.
	(Output_data_glink::global_entry_address): Use global_entry_off.
	(Output_data_glink::pltresolve_size): New function, replacing
	pltresolve_size_ constant.  Update all uses.
	(Output_data_glink::add_global_entry): Align offset.
	(Output_data_glink::set_final_data_size): Use global_entry_align.
	(Stub_table::do_write): Don't pad __glink_PLTresolve with nops.
	Tidy stub output.  Use global_entry_off.
ld/
	* emultempl/ppc32elf.em (params): Init new field.
	(enum ppc32_opt): New enum to define OPTION_* values.  Add
	OPTION_PLT_ALIGN and OPTION_NO_PLT_ALIGN.
	(PARSE_AND_LIST_LONGOPTS): Handle new options.
	(PARSE_AND_LIST_ARGS_CASES): Likewise.
	(PARSE_AND_LIST_OPTIONS): Likewise.  Break up help output.
	* emultempl/ppc64elf.em (ppc_add_stub_section): Init alignment
	correctly for negative --plt-stub-align.
	* testsuite/ld-powerpc/elfv2exe.d,
	* testsuite/ld-powerpc/elfv2so.d,
	* testsuite/ld-powerpc/relbrlt.d,
	* testsuite/ld-powerpc/relbrlt.s,
	* testsuite/ld-powerpc/tlsexe.d,
	* testsuite/ld-powerpc/tlsexe.r,
	* testsuite/ld-powerpc/tlsexe32.d,
	* testsuite/ld-powerpc/tlsexe32.g,
	* testsuite/ld-powerpc/tlsexe32.r,
	* testsuite/ld-powerpc/tlsexetoc.d,
	* testsuite/ld-powerpc/tlsexetoc.r,
	* testsuite/ld-powerpc/tlsopt5_32.d,
	* testsuite/ld-powerpc/tlsso.d,
	* testsuite/ld-powerpc/tlstocso.d: Update for changed stub order.
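
To make the gold size changes listed above concrete, here is a minimal sketch
of the two size computations as they appear in the gold hunks of this patch
(Output_data_glink::pltresolve_size and the 32-bit branch of
Stub_table::plt_call_size).  The free functions and their parameters are
assumptions made for illustration; in gold these are member functions that
query the target object.

```cpp
// Size in bytes of __glink_PLTresolve once nop padding is dropped.
// SIZE is 32 or 64; ABIVERSION is the PowerPC64 ELF ABI version
// (1 or 2) and is ignored for 32-bit.
constexpr int
pltresolve_size(int size, int abiversion)
{
  return size == 64
    ? 8 + (abiversion < 2 ? 11 * 4 : 14 * 4)
    : 16 * 4;
}

// Unaligned size in bytes of a PowerPC32 PLT call stub: four insns,
// plus eight more when the stub is the __tls_get_addr_opt variant.
constexpr int
ppc32_plt_call_size(bool is_tls_get_addr_opt)
{
  return 4 * 4 + (is_tls_get_addr_opt ? 8 * 4 : 0);
}
```

Under these formulas a PowerPC64 ELFv1 __glink_PLTresolve takes 52 bytes
rather than the former fixed 64, while ELFv2 and PowerPC32 remain at 64.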
Alan Modra
2018-01-13 18:53:41 +10:30
parent 78742b93a5
commit 9e390558ce
24 changed files with 557 additions and 435 deletions


@ -1,3 +1,19 @@
2018-01-17 Alan Modra <amodra@gmail.com>
* options.h (plt_align): Support for PowerPC32 too.
* powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
(Stub_table::plt_call_size, branch_stub_size): Tidy.
(Stub_table::plt_call_align): Implement using stub_align.
(Output_data_glink::global_entry_align): New function.
(Output_data_glink::global_entry_off): New function.
(Output_data_glink::global_entry_address): Use global_entry_off.
(Output_data_glink::pltresolve_size): New function, replacing
pltresolve_size_ constant. Update all uses.
(Output_data_glink::add_global_entry): Align offset.
(Output_data_glink::set_final_data_size): Use global_entry_align.
(Stub_table::do_write): Don't pad __glink_PLTresolve with nops.
Tidy stub output. Use global_entry_off.
2018-01-15 Cary Coutant <ccoutant@gmail.com>
PR gold/22694


@ -1101,7 +1101,7 @@ class General_options
NULL, N_("(ARM only) Ignore for backward compatibility"));
DEFINE_var(plt_align, options::TWO_DASHES, '\0', 0, "5",
N_("(PowerPC64 only) Align PLT call stubs to fit cache lines"),
N_("(PowerPC only) Align PLT call stubs to fit cache lines"),
N_("[=P2ALIGN]"), true, int, int, options::parse_uint, false);
DEFINE_bool(plt_localentry, options::TWO_DASHES, '\0', false,


@ -3524,7 +3524,7 @@ Target_powerpc<size, big_endian>::do_relax(int pass,
if (this->glink_ != NULL)
{
int stub_size = this->glink_->pltresolve_size;
int stub_size = this->glink_->pltresolve_size();
Address value = -stub_size;
if (size == 64)
{
@ -3580,7 +3580,7 @@ Target_powerpc<size, big_endian>::do_plt_fde_location(const Output_data* plt,
// There are two FDEs for a position independent glink.
// The first covers the branch table, the second
// __glink_PLTresolve at the end of glink.
off_t resolve_size = this->glink_->pltresolve_size;
off_t resolve_size = this->glink_->pltresolve_size();
if (oview[9] == elfcpp::DW_CFA_nop)
len -= resolve_size;
else
@ -4391,9 +4391,9 @@ class Stub_table : public Output_relaxed_input_section
unsigned int
stub_align() const
{
if (size == 32)
return 16;
unsigned int min_align = 32;
unsigned int min_align = 4;
if (!parameters->options().user_set_plt_align())
return size == 64 ? 32 : min_align;
unsigned int user_align = 1 << parameters->options().plt_align();
return std::max(user_align, min_align);
}
@ -4425,9 +4425,8 @@ class Stub_table : public Output_relaxed_input_section
if (size == 32)
{
const Symbol* gsym = p->first.sym_;
if (this->targ_->is_tls_get_addr_opt(gsym))
return 12 * 4;
return 4 * 4;
return (4 * 4
+ (this->targ_->is_tls_get_addr_opt(gsym) ? 8 * 4 : 0));
}
bool is_iplt;
@ -4460,10 +4459,8 @@ class Stub_table : public Output_relaxed_input_section
unsigned int
plt_call_align(unsigned int bytes) const
{
unsigned int align = 1 << parameters->options().plt_align();
if (align > 1)
bytes = (bytes + align - 1) & -align;
return bytes;
unsigned int align = this->stub_align();
return (bytes + align - 1) & -align;
}
// Return long branch stub size.
@ -4473,9 +4470,10 @@ class Stub_table : public Output_relaxed_input_section
Address loc = this->stub_address() + this->last_plt_size_ + p->second;
if (p->first.dest_ - loc + (1 << 25) < 2 << 25)
return 4;
if (size == 64 || !parameters->options().output_is_position_independent())
return 16;
return 32;
unsigned int bytes = 16;
if (size == 32 && parameters->options().output_is_position_independent())
bytes += 16;
return bytes;
}
// Write out stubs.
@ -4884,7 +4882,6 @@ class Output_data_glink : public Output_section_data
public:
typedef typename elfcpp::Elf_types<size>::Elf_Addr Address;
static const Address invalid_address = static_cast<Address>(0) - 1;
static const int pltresolve_size = 16*4;
Output_data_glink(Target_powerpc<size, big_endian>* targ)
: Output_section_data(16), targ_(targ), global_entry_stubs_(),
@ -4900,12 +4897,35 @@ class Output_data_glink : public Output_section_data
Address
find_global_entry(const Symbol*) const;
unsigned int
global_entry_align(unsigned int off) const
{
unsigned int align = 1 << parameters->options().plt_align();
if (!parameters->options().user_set_plt_align())
align = size == 64 ? 32 : 4;
return (off + align - 1) & -align;
}
unsigned int
global_entry_off() const
{
return this->global_entry_align(this->end_branch_table_);
}
Address
global_entry_address() const
{
gold_assert(this->is_data_size_valid());
unsigned int global_entry_off = (this->end_branch_table_ + 15) & -16;
return this->address() + global_entry_off;
return this->address() + this->global_entry_off();
}
int
pltresolve_size() const
{
if (size == 64)
return (8
+ (this->targ_->abiversion() < 2 ? 11 * 4 : 14 * 4));
return 16 * 4;
}
protected:
@ -4977,10 +4997,11 @@ template<int size, bool big_endian>
void
Output_data_glink<size, big_endian>::add_global_entry(const Symbol* gsym)
{
unsigned int off = this->global_entry_align(this->ge_size_);
std::pair<typename Global_entry_stub_entries::iterator, bool> p
= this->global_entry_stubs_.insert(std::make_pair(gsym, this->ge_size_));
= this->global_entry_stubs_.insert(std::make_pair(gsym, off));
if (p.second)
this->ge_size_ += 16;
this->ge_size_ = off + 16;
}
template<int size, bool big_endian>
@ -5007,11 +5028,11 @@ Output_data_glink<size, big_endian>::set_final_data_size()
total += 4 * (count - 1);
total += -total & 15;
total += this->pltresolve_size;
total += this->pltresolve_size();
}
else
{
total += this->pltresolve_size;
total += this->pltresolve_size();
// space for branch table
total += 4 * count;
@ -5024,7 +5045,7 @@ Output_data_glink<size, big_endian>::set_final_data_size()
}
}
this->end_branch_table_ = total;
total = (total + 15) & -16;
total = this->global_entry_align(total);
total += this->ge_size_;
this->set_data_size(total);
@ -5175,7 +5196,7 @@ Stub_table<size, big_endian>::do_write(Output_file* of)
= ((pltoff - this->targ_->first_plt_entry_offset())
/ this->targ_->plt_entry_size());
Address glinkoff
= (this->targ_->glink_section()->pltresolve_size
= (this->targ_->glink_section()->pltresolve_size()
+ pltindex * 8);
if (pltindex > 32768)
glinkoff += (pltindex - 32768) * 4;
@ -5441,26 +5462,24 @@ Stub_table<size, big_endian>::do_write(Output_file* of)
Address off = plt_addr - got_addr;
if (ha(off) == 0)
{
write_insn<big_endian>(p + 0, lwz_11_30 + l(off));
write_insn<big_endian>(p + 4, mtctr_11);
write_insn<big_endian>(p + 8, bctr);
}
write_insn<big_endian>(p, lwz_11_30 + l(off));
else
{
write_insn<big_endian>(p + 0, addis_11_30 + ha(off));
write_insn<big_endian>(p + 4, lwz_11_11 + l(off));
write_insn<big_endian>(p + 8, mtctr_11);
write_insn<big_endian>(p + 12, bctr);
write_insn<big_endian>(p, addis_11_30 + ha(off));
p += 4;
write_insn<big_endian>(p, lwz_11_11 + l(off));
}
}
else
{
write_insn<big_endian>(p + 0, lis_11 + ha(plt_addr));
write_insn<big_endian>(p + 4, lwz_11_11 + l(plt_addr));
write_insn<big_endian>(p + 8, mtctr_11);
write_insn<big_endian>(p + 12, bctr);
write_insn<big_endian>(p, lis_11 + ha(plt_addr));
p += 4;
write_insn<big_endian>(p, lwz_11_11 + l(plt_addr));
}
p += 4;
write_insn<big_endian>(p, mtctr_11);
p += 4;
write_insn<big_endian>(p, bctr);
}
}
@ -5479,23 +5498,29 @@ Stub_table<size, big_endian>::do_write(Output_file* of)
write_insn<big_endian>(p, b | (delta & 0x3fffffc));
else if (!parameters->options().output_is_position_independent())
{
write_insn<big_endian>(p + 0, lis_12 + ha(bs->first.dest_));
write_insn<big_endian>(p + 4, addi_12_12 + l(bs->first.dest_));
write_insn<big_endian>(p + 8, mtctr_12);
write_insn<big_endian>(p + 12, bctr);
write_insn<big_endian>(p, lis_12 + ha(bs->first.dest_));
p += 4;
write_insn<big_endian>(p, addi_12_12 + l(bs->first.dest_));
}
else
{
delta -= 8;
write_insn<big_endian>(p + 0, mflr_0);
write_insn<big_endian>(p + 4, bcl_20_31);
write_insn<big_endian>(p + 8, mflr_12);
write_insn<big_endian>(p + 12, addis_12_12 + ha(delta));
write_insn<big_endian>(p + 16, addi_12_12 + l(delta));
write_insn<big_endian>(p + 20, mtlr_0);
write_insn<big_endian>(p + 24, mtctr_12);
write_insn<big_endian>(p + 28, bctr);
write_insn<big_endian>(p, mflr_0);
p += 4;
write_insn<big_endian>(p, bcl_20_31);
p += 4;
write_insn<big_endian>(p, mflr_12);
p += 4;
write_insn<big_endian>(p, addis_12_12 + ha(delta));
p += 4;
write_insn<big_endian>(p, addi_12_12 + l(delta));
p += 4;
write_insn<big_endian>(p, mtlr_0);
}
p += 4;
write_insn<big_endian>(p, mtctr_12);
p += 4;
write_insn<big_endian>(p, bctr);
}
}
if (this->need_save_res_)
@ -5563,8 +5588,7 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)
write_insn<big_endian>(p, ld_11_11 + 8), p += 4;
}
write_insn<big_endian>(p, bctr), p += 4;
while (p < oview + this->pltresolve_size)
write_insn<big_endian>(p, nop), p += 4;
gold_assert(p == oview + this->pltresolve_size());
// Write lazy link call stubs.
uint32_t indx = 0;
@ -5590,7 +5614,7 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)
Address plt_base = this->targ_->plt_section()->address();
Address iplt_base = invalid_address;
unsigned int global_entry_off = (this->end_branch_table_ + 15) & -16;
unsigned int global_entry_off = this->global_entry_off();
Address global_entry_base = this->address() + global_entry_off;
typename Global_entry_stub_entries::const_iterator ge;
for (ge = this->global_entry_stubs_.begin();
@ -5631,7 +5655,7 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)
// Write out pltresolve branch table.
p = oview;
unsigned int the_end = oview_size - this->pltresolve_size;
unsigned int the_end = oview_size - this->pltresolve_size();
unsigned char* end_p = oview + the_end;
while (p < end_p - 8 * 4)
write_insn<big_endian>(p, b + end_p - p), p += 4;
@ -5639,68 +5663,85 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)
write_insn<big_endian>(p, nop), p += 4;
// Write out pltresolve call stub.
end_p = oview + oview_size;
if (parameters->options().output_is_position_independent())
{
Address res0_off = 0;
Address after_bcl_off = the_end + 12;
Address bcl_res0 = after_bcl_off - res0_off;
write_insn<big_endian>(p + 0, addis_11_11 + ha(bcl_res0));
write_insn<big_endian>(p + 4, mflr_0);
write_insn<big_endian>(p + 8, bcl_20_31);
write_insn<big_endian>(p + 12, addi_11_11 + l(bcl_res0));
write_insn<big_endian>(p + 16, mflr_12);
write_insn<big_endian>(p + 20, mtlr_0);
write_insn<big_endian>(p + 24, sub_11_11_12);
write_insn<big_endian>(p, addis_11_11 + ha(bcl_res0));
p += 4;
write_insn<big_endian>(p, mflr_0);
p += 4;
write_insn<big_endian>(p, bcl_20_31);
p += 4;
write_insn<big_endian>(p, addi_11_11 + l(bcl_res0));
p += 4;
write_insn<big_endian>(p, mflr_12);
p += 4;
write_insn<big_endian>(p, mtlr_0);
p += 4;
write_insn<big_endian>(p, sub_11_11_12);
p += 4;
Address got_bcl = g_o_t + 4 - (after_bcl_off + this->address());
write_insn<big_endian>(p + 28, addis_12_12 + ha(got_bcl));
write_insn<big_endian>(p, addis_12_12 + ha(got_bcl));
p += 4;
if (ha(got_bcl) == ha(got_bcl + 4))
{
write_insn<big_endian>(p + 32, lwz_0_12 + l(got_bcl));
write_insn<big_endian>(p + 36, lwz_12_12 + l(got_bcl + 4));
write_insn<big_endian>(p, lwz_0_12 + l(got_bcl));
p += 4;
write_insn<big_endian>(p, lwz_12_12 + l(got_bcl + 4));
}
else
{
write_insn<big_endian>(p + 32, lwzu_0_12 + l(got_bcl));
write_insn<big_endian>(p + 36, lwz_12_12 + 4);
write_insn<big_endian>(p, lwzu_0_12 + l(got_bcl));
p += 4;
write_insn<big_endian>(p, lwz_12_12 + 4);
}
write_insn<big_endian>(p + 40, mtctr_0);
write_insn<big_endian>(p + 44, add_0_11_11);
write_insn<big_endian>(p + 48, add_11_0_11);
write_insn<big_endian>(p + 52, bctr);
write_insn<big_endian>(p + 56, nop);
write_insn<big_endian>(p + 60, nop);
p += 4;
write_insn<big_endian>(p, mtctr_0);
p += 4;
write_insn<big_endian>(p, add_0_11_11);
p += 4;
write_insn<big_endian>(p, add_11_0_11);
}
else
{
Address res0 = this->address();
write_insn<big_endian>(p + 0, lis_12 + ha(g_o_t + 4));
write_insn<big_endian>(p + 4, addis_11_11 + ha(-res0));
write_insn<big_endian>(p, lis_12 + ha(g_o_t + 4));
p += 4;
write_insn<big_endian>(p, addis_11_11 + ha(-res0));
p += 4;
if (ha(g_o_t + 4) == ha(g_o_t + 8))
write_insn<big_endian>(p + 8, lwz_0_12 + l(g_o_t + 4));
write_insn<big_endian>(p, lwz_0_12 + l(g_o_t + 4));
else
write_insn<big_endian>(p + 8, lwzu_0_12 + l(g_o_t + 4));
write_insn<big_endian>(p + 12, addi_11_11 + l(-res0));
write_insn<big_endian>(p + 16, mtctr_0);
write_insn<big_endian>(p + 20, add_0_11_11);
write_insn<big_endian>(p, lwzu_0_12 + l(g_o_t + 4));
p += 4;
write_insn<big_endian>(p, addi_11_11 + l(-res0));
p += 4;
write_insn<big_endian>(p, mtctr_0);
p += 4;
write_insn<big_endian>(p, add_0_11_11);
p += 4;
if (ha(g_o_t + 4) == ha(g_o_t + 8))
write_insn<big_endian>(p + 24, lwz_12_12 + l(g_o_t + 8));
write_insn<big_endian>(p, lwz_12_12 + l(g_o_t + 8));
else
write_insn<big_endian>(p + 24, lwz_12_12 + 4);
write_insn<big_endian>(p + 28, add_11_0_11);
write_insn<big_endian>(p + 32, bctr);
write_insn<big_endian>(p + 36, nop);
write_insn<big_endian>(p + 40, nop);
write_insn<big_endian>(p + 44, nop);
write_insn<big_endian>(p + 48, nop);
write_insn<big_endian>(p + 52, nop);
write_insn<big_endian>(p + 56, nop);
write_insn<big_endian>(p + 60, nop);
write_insn<big_endian>(p, lwz_12_12 + 4);
p += 4;
write_insn<big_endian>(p, add_11_0_11);
}
p += 4;
write_insn<big_endian>(p, bctr);
p += 4;
while (p < end_p)
{
write_insn<big_endian>(p, nop);
p += 4;
}
p += 64;
}
of->write_output_view(off, oview_size, oview);
@ -8161,7 +8202,7 @@ Target_powerpc<size, big_endian>::do_finalize_sections(
this->glink_->finalize_data_size();
odyn->add_section_plus_offset(elfcpp::DT_PPC64_GLINK,
this->glink_,
(this->glink_->pltresolve_size
(this->glink_->pltresolve_size()
- 32));
}
if (this->has_localentry0_ || this->has_tls_get_addr_opt_)
@ -10187,8 +10228,6 @@ Target_selector_powerpc<64, false> target_selector_ppc64le;
// Instantiate these constants for -O0
template<int size, bool big_endian>
const int Output_data_glink<size, big_endian>::pltresolve_size;
template<int size, bool big_endian>
const typename Output_data_glink<size, big_endian>::Address
Output_data_glink<size, big_endian>::invalid_address;
template<int size, bool big_endian>