The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
TSXLDTRK takes RTM as a prereq. Additionally introduce an umbrella "tsx"
extension option covering both RTM and HLE, paralleling the "abm" one we
already have.
As of commit ae89daecb132 ("x86: generalize disabling of sub-
architectures") there's no arbitrary subset of ISAs which can also be
disabled. This should have been reflected in documentation right away.
Since I failed to do so, correct this now.
When using just slightly non-trivial combinations of .arch, it can be
quite useful to be able to go back to prior state without needing to
re-invoke perhaps many earlier directives and without needing to invoke
perhaps many "negative" ones. Like some other architectures allow
saving (pushing) and restoring (popping) present/prior state.
For now require the same .code<N> to be in effect for ".arch pop" that
was in effect for the corresponding ".arch push".
Also change the global "no_cond_jump_promotion" to be bool, to match the
new struct field.
So far there was no way to reset the architecture to that assembly would
start with in the absence of any overrides (command line or directives).
Note that for Intel MCU "default" is merely an alias of "iamcu".
While there also zap a stray @item from the doc section, as noticed
when inspecting the generated output (which still has some quirks, but
those aren't easy to address without re-flowing almost the entire
section).
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
Unaligned load/store instructions on aligned memory or register are as
fast as aligned load/store instructions on modern Intel processors. Add
a command-line option, -muse-unaligned-vector-move, to x86 assembler to
encode encode aligned vector load/store instructions as unaligned
vector load/store instructions.
* NEWS: Mention -muse-unaligned-vector-move.
* config/tc-i386.c (use_unaligned_vector_move): New.
(encode_with_unaligned_vector_move): Likewise.
(md_assemble): Call encode_with_unaligned_vector_move for
-muse-unaligned-vector-move.
(OPTION_MUSE_UNALIGNED_VECTOR_MOVE): New.
(md_longopts): Add -muse-unaligned-vector-move.
(md_parse_option): Handle -muse-unaligned-vector-move.
(md_show_usage): Add -muse-unaligned-vector-move.
* doc/c-i386.texi: Document -muse-unaligned-vector-move.
* testsuite/gas/i386/i386.exp: Run unaligned-vector-move and
x86-64-unaligned-vector-move.
* testsuite/gas/i386/unaligned-vector-move.d: New file.
* testsuite/gas/i386/unaligned-vector-move.s: Likewise.
* testsuite/gas/i386/x86-64-unaligned-vector-move.d: Likewise.
This is to be able to generate data acted upon by AVX512-BF16 and
AMX-BF16 insns. While not part of the IEEE standard, the format is
sufficiently standardized to warrant handling in config/atof-ieee.c.
Arm, where custom handling was implemented, may want to leverage this as
well. To be able to also use the hex forms supported for other floating
point formats, a small addition to the generic hex_float() is needed.
Extend existing x86 testcases.
This is to be able to generate data passed to {,V}CVTPH2PS and acted
upon by AVX512-FP16 insns. To be able to also use the hex forms
supported for other floating point formats, a small addition to the
generic hex_float() is needed.
Extend existing x86 testcases.
Intel AVX512 FP16 instructions use maps 3, 5 and 6. Maps 5 and 6 use 3 bits
in the EVEX.mmm field (0b101, 0b110). Map 5 is for instructions that were FP32
in map 1 (0Fxx). Map 6 is for instructions that were FP32 in map 2 (0F38xx).
There are some exceptions to this rule. Some things in map 1 (0Fxx) with imm8
operands predated our current conventions; those instructions moved to map 3.
FP32 things in map 3 (0F3Axx) found new opcodes in map3 for FP16 because map3
is very sparsely populated. Most of the FP16 instructions share opcodes and
prefix (EVEX.pp) bits with the related FP32 operations.
Intel AVX512 FP16 instructions has new displacements scaling rules, please refer
to the public software developer manual for detail information.
gas/
2021-08-05 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
H.J. Lu <hongjiu.lu@intel.com>
Wei Xiao <wei3.xiao@intel.com>
Lili Cui <lili.cui@intel.com>
* config/tc-i386.c (struct Broadcast_Operation): Adjust comment.
(cpu_arch): Add .avx512_fp16.
(cpu_noarch): Add noavx512_fp16.
(pte): Add evexmap5 and evexmap6.
(build_evex_prefix): Handle EVEXMAP5 and EVEXMAP6.
(check_VecOperations): Handle {1to32}.
(check_VecOperands): Handle CheckRegNumb.
(check_word_reg): Handle Toqword.
(i386_error): Add invalid_dest_and_src_register_set.
(match_template): Handle invalid_dest_and_src_register_set.
* doc/c-i386.texi: Document avx512_fp16, noavx512_fp16.
opcodes/
2021-08-05 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
H.J. Lu <hongjiu.lu@intel.com>
Wei Xiao <wei3.xiao@intel.com>
Lili Cui <lili.cui@intel.com>
* i386-dis.c (EXwScalarS): New.
(EXxh): Ditto.
(EXxhc): Ditto.
(EXxmmqh): Ditto.
(EXxmmqdh): Ditto.
(EXEvexXwb): Ditto.
(DistinctDest_Fixup): Ditto.
(enum): Add xh_mode, evex_half_bcst_xmmqh_mode, evex_half_bcst_xmmqdh_mode
and w_swap_mode.
(enum): Add PREFIX_EVEX_0F3A08_W_0, PREFIX_EVEX_0F3A0A_W_0,
PREFIX_EVEX_0F3A26, PREFIX_EVEX_0F3A27, PREFIX_EVEX_0F3A56,
PREFIX_EVEX_0F3A57, PREFIX_EVEX_0F3A66, PREFIX_EVEX_0F3A67,
PREFIX_EVEX_0F3AC2, PREFIX_EVEX_MAP5_10, PREFIX_EVEX_MAP5_11,
PREFIX_EVEX_MAP5_1D, PREFIX_EVEX_MAP5_2A, PREFIX_EVEX_MAP5_2C,
PREFIX_EVEX_MAP5_2D, PREFIX_EVEX_MAP5_2E, PREFIX_EVEX_MAP5_2F,
PREFIX_EVEX_MAP5_51, PREFIX_EVEX_MAP5_58, PREFIX_EVEX_MAP5_59,
PREFIX_EVEX_MAP5_5A_W_0, PREFIX_EVEX_MAP5_5A_W_1,
PREFIX_EVEX_MAP5_5B_W_0, PREFIX_EVEX_MAP5_5B_W_1,
PREFIX_EVEX_MAP5_5C, PREFIX_EVEX_MAP5_5D, PREFIX_EVEX_MAP5_5E,
PREFIX_EVEX_MAP5_5F, PREFIX_EVEX_MAP5_78, PREFIX_EVEX_MAP5_79,
PREFIX_EVEX_MAP5_7A, PREFIX_EVEX_MAP5_7B, PREFIX_EVEX_MAP5_7C,
PREFIX_EVEX_MAP5_7D_W_0, PREFIX_EVEX_MAP6_13, PREFIX_EVEX_MAP6_56,
PREFIX_EVEX_MAP6_57, PREFIX_EVEX_MAP6_D6, PREFIX_EVEX_MAP6_D7
(enum): Add EVEX_MAP5 and EVEX_MAP6.
(enum): Add EVEX_W_MAP5_5A, EVEX_W_MAP5_5B,
EVEX_W_MAP5_78_P_0, EVEX_W_MAP5_78_P_2, EVEX_W_MAP5_79_P_0,
EVEX_W_MAP5_79_P_2, EVEX_W_MAP5_7A_P_2, EVEX_W_MAP5_7A_P_3,
EVEX_W_MAP5_7B_P_2, EVEX_W_MAP5_7C_P_0, EVEX_W_MAP5_7C_P_2,
EVEX_W_MAP5_7D, EVEX_W_MAP6_13_P_0, EVEX_W_MAP6_13_P_2,
(get_valid_dis386): Properly handle new instructions.
(intel_operand_size): Handle new modes.
(OP_E_memory): Ditto.
(OP_EX): Ditto.
* i386-dis-evex.h: Updated for AVX512_FP16.
* i386-dis-evex-mod.h: Updated for AVX512_FP16.
* i386-dis-evex-prefix.h: Updated for AVX512_FP16.
* i386-dis-evex-reg.h : Updated for AVX512_FP16.
* i386-dis-evex-w.h : Updated for AVX512_FP16.
* i386-gen.c (cpu_flag_init): Add CPU_AVX512_FP16_FLAGS,
and CPU_ANY_AVX512_FP16_FLAGS. Update CPU_ANY_AVX512F_FLAGS
and CPU_ANY_AVX512BW_FLAGS.
(cpu_flags): Add CpuAVX512_FP16.
(opcode_modifiers): Add DistinctDest.
* i386-opc.h (enum): (AVX512_FP16): New.
(i386_opcode_modifier): Add reqdistinctreg.
(i386_cpu_flags): Add cpuavx512_fp16.
(EVEXMAP5): Defined as a macro.
(EVEXMAP6): Ditto.
* i386-opc.tbl: Add Intel AVX512_FP16 instructions.
* i386-init.h: Regenerated.
* i386-tbl.h: Ditto.
Update 80387 floating point 's' suffix to read:
* Integer constructors are '.word', '.long' or '.int', and '.quad'
for the 16-, 32-, and 64-bit integer formats. The corresponding
instruction mnemonic suffixes are 's' (short), 'l' (long), and 'q'
(quad).
instead of 's' (single).
PR gas/27106
* doc/c-i386.texi: Update 80387 floating point 's' suffix
Since (%bp)/(%ebp)/(%rbp) are encoded as 0(%bp)/0(%ebp)/0(%rbp), use
disp32/disp16 on 0(%bp)/0(%ebp)/0(%rbp) for {disp32}.
Note: Since there is no disp32 on 0(%bp), use disp16 instead.
PR gas/26305
* config/tc-i386.c (build_modrm_byte): Use disp32/disp16 on
(%bp)/(%ebp)/(%rbp) for {disp32}.
* doc/c-i386.texi: Update {disp32} documentation.
* testsuite/gas/i386/pseudos.s: Add (%bp)/(%ebp) tests.
* testsuite/gas/i386/x86-64-pseudos.s: Add (%ebp)/(%rbp) tests.
* testsuite/gas/i386/pseudos.d: Updated.
* testsuite/gas/i386/x86-64-pseudos.d: Likewise.
The 64-bit version of binutils got support for the PE COFF BIG OBJ format a
couple of years ago. The BIG OBJ format is a slightly different COFF format
which extends the size of the number of section field in the header from a
uint16_t to a uint32_t and so greatly increases the number of sections allowed.
However the 32-bit version of bfd never got support for this. The GHC Haskell
compiler generates a great deal of symbols due to it's use of
-ffunction-sections and -fdata-sections.
This meant that we could not build the 32-bit version of the GHC Compiler for
many releases now as binutils didn't have this support.
This patch adds the support to the 32-bit port of binutils as well and also does
come cleanup in the code.
bfd/ChangeLog:
* coff-i386.c (COFF_WITH_PE_BIGOBJ): New.
* coff-x86_64.c (COFF_WITH_PE_BIGOBJ): New.
* config.bfd (targ_selvecs): Rename x86_64_pe_be_vec
to x86_64_pe_big_vec as it not a big-endian format.
(vec i386_pe_big_vec): New.
* configure.ac: Likewise.
* targets.c: Likewise.
* configure: Regenerate.
* pe-i386.c (TARGET_SYM_BIG, TARGET_NAME_BIG,
COFF_WITH_PE_BIGOBJ): New.
* pe-x86_64.c (TARGET_SYM_BIG, TARGET_NAME_BIG):
New.
(x86_64_pe_be_vec): Moved.
gas/ChangeLog:
* NEWS: Add news entry for big-obj.
* config/tc-i386.c (i386_target_format): Support new format.
* doc/c-i386.texi: Add i386 support.
* testsuite/gas/pe/big-obj.d: Rename test to not be x64 specific.
* testsuite/gas/pe/pe.exp (big-obj): Make test run on i386 as well.
ld/ChangeLog:
* pe-dll.c (pe_detail_list): Add pe-bigobj-i386.
Add 3 command-line options to generate lfence for load, indirect near
branch and ret to help mitigate:
https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00334.htmlhttp://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-0551
1. -mlfence-after-load=[no|yes]:
-mlfence-after-load=yes generates lfence after load instructions.
2. -mlfence-before-indirect-branch=[none|all|memory|register]:
a. -mlfence-before-indirect-branch=all generates lfence before indirect
near branches via register and a warning before indirect near branches
via memory.
b. -mlfence-before-indirect-branch=memory issue a warning before
indirect near branches via memory.
c. -mlfence-before-indirect-branch=register generates lfence before
indirect near branches via register.
Note that lfence won't be generated before indirect near branches via
register with -mlfence-after-load=yes since lfence will be generated
after loading branch target register.
3. -mlfence-before-ret=[none|or|not]
a. -mlfence-before-ret=or generates or with lfence before ret.
b. -mlfence-before-ret=not generates not with lfence before ret.
A warning will be issued and lfence won't be generated before indirect
near branch and ret if the previous item is a prefix or a constant
directive, which may be used to hardcode an instruction, since there
is no clear instruction boundary.
* config/tc-i386.c (lfence_after_load): New.
(lfence_before_indirect_branch_kind): New.
(lfence_before_indirect_branch): New.
(lfence_before_ret_kind): New.
(lfence_before_ret): New.
(last_insn): New.
(load_insn_p): New.
(insert_lfence_after): New.
(insert_lfence_before): New.
(md_assemble): Call insert_lfence_before and insert_lfence_after.
Set last_insn.
(OPTION_MLFENCE_AFTER_LOAD): New.
(OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH): New.
(OPTION_MLFENCE_BEFORE_RET): New.
(md_longopts): Add -mlfence-after-load=,
-mlfence-before-indirect-branch= and -mlfence-before-ret=.
(md_parse_option): Handle -mlfence-after-load=,
-mlfence-before-indirect-branch= and -mlfence-before-ret=.
(md_show_usage): Display -mlfence-after-load=,
-mlfence-before-indirect-branch= and -mlfence-before-ret=.
(i386_cons_align): New.
* config/tc-i386.h (i386_cons_align): New.
(md_cons_align): New.
* doc/c-i386.texi: Document -mlfence-after-load=,
-mlfence-before-indirect-branch= and -mlfence-before-ret=.
AMD ABM has 2 instructions: popcnt and lzcnt. ABM CPUID feature bit has
been reused for lzcnt and a POPCNT CPUID feature bit is added for popcnt
which used to be the part of SSE4.2. This patch removes CpuABM and adds
CpuPOPCNT. It changes ABM to enable both lzcnt and popcnt, changes SSE4.2
to also enable popcnt.
gas/
* config/tc-i386.c (cpu_arch): Add .popcnt.
* doc/c-i386.texi: Remove abm and .abm. Add popcnt and .popcnt.
Add a tab before @samp{.sse4a}.
opcodes/
* i386-gen.c (cpu_flag_init): Replace CpuABM with
CpuLZCNT|CpuPOPCNT. Add CpuPOPCNT to CPU_SSE4_2_FLAGS. Add
CPU_POPCNT_FLAGS.
(cpu_flags): Remove CpuABM. Add CpuPOPCNT.
* i386-opc.h (CpuABM): Removed.
(CpuPOPCNT): New.
(i386_cpu_flags): Remove cpuabm. Add cpupopcnt.
* i386-opc.tbl: Replace CpuABM|CpuSSE4_2 with CpuPOPCNT on
popcnt. Remove CpuABM from lzcnt.
* i386-init.h: Regenerated.
* i386-tbl.h: Likewise.
commit 7deea9aad8 changed nosse4 to include CpuSSE4a. But AMD SSE4a is
a superset of SSE3 and Intel SSE4 is a superset of SSSE3. Disable Intel
SSE4 shouldn't disable AMD SSE4a. This patch restores nosse4. It also
adds .sse4a and nosse4a.
gas/
* config/tc-i386.c (cpu_arch): Add .sse4a and nosse4a. Restore
nosse4.
* doc/c-i386.texi: Document sse4a and nosse4a.
opcodes/
* i386-gen.c (cpu_flag_init): Add CPU_ANY_SSE4A_FLAGS. Remove
CPU_ANY_SSE4_FLAGS.
Document different mnemonics of movsx, movsxd and movzx in AT&T syntax.
PR gas/25438
* doc/c-i386.texi: Document movsx, movsxd and movzx for AT&T
syntax.
AMD and Intel differ in their handling of far indirect branches as well
as LFS/LGS/LSS: AMD CPUs ignore REX.W while Intel ones honors it. (Note
how the latter three were hybrids so far, while far branches were fully
AMD-like.)
Commit d835a58baae720 disabled sysenter/sysenter in 64-bit mode by
default. By default, assembler should accept common, Intel64 only
and AMD64 ISAs since there are no conflicts.
gas/
PR gas/25516
* config/tc-i386.c (intel64): Renamed to ...
(isa64): This.
(match_template): Accept Intel64 only instruction by default.
(i386_displacement): Updated.
(md_parse_option): Updated.
* c-i386.texi: Update -mamd64/-mintel64 documentation.
* testsuite/gas/i386/i386.exp: Run x86-64-sysenter. Pass
-mamd64 to x86-64-sysenter-amd.
* testsuite/gas/i386/x86-64-sysenter.d: New file.
opcodes/
PR gas/25516
* i386-gen.c (opcode_modifiers): Replace AMD64 and Intel64
with ISA64.
* i386-opc.h (AMD64): Removed.
(Intel64): Likewose.
(AMD64): New.
(INTEL64): Likewise.
(INTEL64ONLY): Likewise.
(i386_opcode_modifier): Replace amd64 and intel64 with isa64.
* i386-opc.tbl (Amd64): New.
(Intel64): Likewise.
(Intel64Only): Likewise.
Replace AMD64 with Amd64. Update sysenter/sysenter with
Cpu64 and Intel64Only. Remove AMD64 from sysenter/sysenter.
* i386-tbl.h: Regenerated.
movsxd is a 64-bit only instruction. It supports both 16-bit and 32-bit
destination registers. Its AT&T mnemonic is movslq which only supports
64-bit destination register. There is also a discrepancy between AMD64
and Intel64 on movsxd with 16-bit destination register. AMD64 supports
32-bit source operand and Intel64 supports 16-bit source operand.
This patch updates movsxd encoding and decoding to alow 16-bit and 32-bit
destination registers. It also handles movsxd with 16-bit destination
register for AMD64 and Intel 64.
gas/
PR binutils/25445
* config/tc-i386.c (check_long_reg): Also convert to QWORD for
movsxd.
* doc/c-i386.texi: Add a node for AMD64 vs. Intel64 ISA
differences. Document movslq and movsxd.
* testsuite/gas/i386/i386.exp: Run PR binutils/25445 tests.
* testsuite/gas/i386/x86-64-movsxd-intel.d: New file.
* testsuite/gas/i386/x86-64-movsxd-intel64-intel.d: Likewise.
* testsuite/gas/i386/x86-64-movsxd-intel64-inval.l: Likewise.
* testsuite/gas/i386/x86-64-movsxd-intel64-inval.s: Likewise.
* testsuite/gas/i386/x86-64-movsxd-intel64.d: Likewise.
* testsuite/gas/i386/x86-64-movsxd-intel64.s: Likewise.
* testsuite/gas/i386/x86-64-movsxd-inval.l: Likewise.
* testsuite/gas/i386/x86-64-movsxd-inval.s: Likewise.
* testsuite/gas/i386/x86-64-movsxd.d: Likewise.
* testsuite/gas/i386/x86-64-movsxd.s: Likewise.
opcodes/
PR binutils/25445
* i386-dis.c (MOVSXD_Fixup): New function.
(movsxd_mode): New enum.
(x86_64_table): Use MOVSXD_Fixup and movsxd_mode on movsxd.
(intel_operand_size): Handle movsxd_mode.
(OP_E_register): Likewise.
(OP_G): Likewise.
* i386-opc.tbl: Remove Rex64 and allow 32-bit destination
register on movsxd. Add movsxd with 16-bit destination register
for AMD64 and Intel64 ISAs.
* i386-tbl.h: Regenerated.