This patch adds support for the FEAT_PoPS feature, which can be enabled
through the +pops command line flag.
This patch also adds support for the following DC instructions; the
spec can be found here [1].
1. "dc cigdvaps" enabled on passing +memtag+pops command line flags.
2. "dc civaps" enabled on passing +pops command line flag.
[1]: https://developer.arm.com/documentation/ddi0601/2025-03/AArch64-Instructions?lang=en
FEAT_SVE_F16F32MM introduces the SVE half-precision floating-point
matrix multiply-accumulate to single-precision instruction.
FEAT_F8F32MM introduces the Advanced SIMD 8-bit floating-point matrix
multiply-accumulate to single-precision instruction.
FEAT_F8F16MM introduces the Advanced SIMD 8-bit floating-point matrix
multiply-accumulate to half-precision instruction.
FEAT_CMPBR - Compare and branch instructions. This patch adds these
instructions:
- CB<CC> (register)
- CB<CC> (immediate)
- CBH<CC>
- CBB<CC>
where CC is one of the following:
- EQ
- NE
- GT
- GE
- LT
- LE
- HI
- HS
- LO
- LS
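As a rough illustration of how the mnemonic families above combine with the
condition codes, here is a minimal Python sketch. The lower-case spellings and
the helper name compare_branch_mnemonics are assumptions made for this
example; the sketch only enumerates names and does not say which forms are
real encodings and which are aliases.

```python
# Condition codes exactly as listed in the commit message (lower-cased here).
CONDITIONS = ["eq", "ne", "gt", "ge", "lt", "le", "hi", "hs", "lo", "ls"]

# Base mnemonics: CB<CC> (register and immediate forms share one spelling),
# CBH<CC> and CBB<CC>.
BASES = ["cb", "cbh", "cbb"]


def compare_branch_mnemonics():
    """Return every base + condition spelling (hypothetical helper)."""
    return [base + cc for base in BASES for cc in CONDITIONS]


if __name__ == "__main__":
    for name in compare_branch_mnemonics():
        print(name)
```

Three bases times ten condition codes gives thirty distinct spellings.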
FEAT_OCCMO support was introduced, but the feature flags were missing.
This patch adds these flags, as well as splitting up the tests to test
occmo vs occmo+memtag operands.
FEAT_SVE_BFSCALE introduces the SVE BFSCALE instruction, when the PE is not in
Streaming SVE mode. If FEAT_SME2 is implemented, FEAT_SVE_BFSCALE also
introduces SME multi-vector Z-targeting BFloat16 scaling instructions, BFSCALE
and BFMUL.
FEAT_FPRCVT introduces new versions of existing instructions that
convert between floating-point and integer values. These new versions
take SIMD&FP registers for both the source and the destination
operand. FEAT_FPRCVT also enables the use of some existing AdvSIMD
instructions in streaming mode. However, no changes are needed in gas
to support this.
This patch introduces support for RISC-V Profiles RV20 and RV22 [1],
enabling developers to utilize these profiles through the -march option.
[1] https://github.com/riscv/riscv-profiles/releases/tag/v1.0
bfd/ChangeLog:
* elfxx-riscv.c (struct riscv_profiles): New struct.
(riscv_parse_extensions): New argument.
(riscv_find_profiles): New checking function.
(riscv_parse_subset): Add Profiles handler.
gas/ChangeLog:
* NEWS: Add RISC-V Profiles.
* doc/as.texi: Update -march input type.
* doc/c-riscv.texi: Ditto.
* testsuite/gas/riscv/option-arch-fail.l: Modify hint info.
* testsuite/gas/riscv/attribute-17.d: New test.
* testsuite/gas/riscv/attribute-18.d: New test.
* testsuite/gas/riscv/march-fail-rvi20u64v.d: New test.
* testsuite/gas/riscv/march-fail-rvi20u64v.l: New test.
So far IBM z17 was identified as arch15. Add the real name, as it has
been announced. [1]
[1]: IBM z17 announcement letter, AD25-0015,
https://www.ibm.com/docs/en/announcements/z17-makes-more-possible
gas/
* config/tc-s390.c (s390_parse_cpu): Add z17 as alternate CPU
name for arch15.
* doc/c-s390.texi: Likewise.
* doc/as.texi: Likewise.
opcodes/
* s390-mkopc.c (main): Add z17 as alternate CPU name for arch15.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds the additional extension "XTheadVdot" based on the
"V" extension, and it provides four 8-bit multiply and add with
32-bit instructions for the "v" extension. The 'th' prefix and the
"XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([2]).
Co-Authored-By: Lifang Xia <lifang_xia@linux.alibaba.com>
[1] https://github.com/XUANTIE-RV/thead-extension-spec/tree/master/xtheadvdot
[2] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Add support
for "XTheadVdot" extension.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* doc/c-riscv.texi: Likewise.
* testsuite/gas/riscv/march-help.l: Likewise.
* testsuite/gas/riscv/x-thead-vdot.d: New test.
* testsuite/gas/riscv/x-thead-vdot.s: New test.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VMAQA_VV): New.
* opcode/riscv.h (enum riscv_insn_class): Add insn class for
XTheadVdot.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
Some DW_CFA_* and DW_OP_* take wider than byte, but non-LEB128 operands.
Having to hand-encode such when needing to resort to .cfi_escape isn't
very helpful.
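To show what hand-encoding such an operand looks like today, here is a minimal
Python sketch that spells out a fixed-width operand byte by byte for
.cfi_escape. It uses DW_CFA_advance_loc2 (opcode 0x03, which takes a 2-byte
delta) as the example and assumes a little-endian target; the helper names
are hypothetical and are not part of gas.

```python
DW_CFA_ADVANCE_LOC2 = 0x03  # DWARF CFA opcode followed by a 2-byte unsigned delta


def fixed_width_bytes(value, size):
    """Split an unsigned value into `size` bytes, little-endian (assumed target order)."""
    return list(value.to_bytes(size, "little"))


def cfi_escape_line(opcode, operand, size):
    """Format a .cfi_escape directive with the operand written out as single bytes."""
    data = [opcode] + fixed_width_bytes(operand, size)
    return ".cfi_escape " + ", ".join(f"0x{b:02x}" for b in data)


if __name__ == "__main__":
    print(cfi_escape_line(DW_CFA_ADVANCE_LOC2, 0x1234, 2))
```

The byte splitting and ordering shown here is exactly the manual step that
becomes tedious and error-prone for wider operands.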
So far tricks had to be played to use .insn to encode extended-EVEX
insns; the X4 bit couldn't be controlled at all. Extend the syntax just
enough to cover all features, taking care to reject invalid feature
combinations (albeit aiming at being as lax there as possible, to offer
users as much flexibility as we can - we don't, after all, know what
the future will bring).
In a pre-existing testcase replace all but one .byte; the one that needs
to remain wants to have EVEX.U clear in a way that's neither
controllable via AVX10/256 embedded rounding (would otherwise also set
EVEX.ND), nor via the index register (EVEX.X4), as there's no memory
operand. For one of the converted instances ModR/M.mod needs correcting:
An 8-bit displacement requires that to be 1, not 2. Also adjust source
comments to better represent what the bad insns mimic.
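The ModR/M.mod correction mentioned above follows from how the byte is laid
out. A minimal Python sketch, assuming 32/64-bit addressing and ignoring the
special cases (such as rm=5 with mod=0, or SIB bytes); the function names are
made up for this illustration:

```python
def modrm(mod, reg, rm):
    """Assemble a ModR/M byte from its three fields: mod[7:6], reg[5:3], rm[2:0]."""
    if not (0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("field out of range")
    return (mod << 6) | (reg << 3) | rm


def mod_for_displacement(disp_bytes):
    """Map displacement size in bytes to the mod field for a memory operand."""
    # 0 -> no displacement, 1 -> disp8 (mod=1), 4 -> disp32 (mod=2)
    return {0: 0, 1: 1, 4: 2}[disp_bytes]


if __name__ == "__main__":
    # An 8-bit displacement needs mod=1; mod=2 would announce a 32-bit one.
    print(hex(modrm(mod_for_displacement(1), 0, 0)))
```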
RISC-V already supports bfloat16 instructions through extensions such as
Zfbfmin, Zvfbfmin and Zvfbfwma, so I think it's reasonable to add a
.bfloat16 directive to support the bfloat16 data type.
The code logic is mostly provided by common code already.
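For readers unfamiliar with the data type, the following Python sketch shows
one way to derive the 16-bit bfloat16 pattern from a single-precision value.
It rounds to nearest-even, which is an assumption of this example; the commit
message does not say which rounding the common gas code applies, and
float_to_bfloat16_bits is a hypothetical name.

```python
import struct


def float_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x (round-to-nearest-even, assumed)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep it a NaN by forcing a mantissa bit instead of rounding.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # bfloat16 keeps the top 16 bits of the binary32 encoding.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


if __name__ == "__main__":
    for value in (1.0, -2.0, 0.5):
        print(value, hex(float_to_bfloat16_bits(value)))
```

Because bfloat16 shares the sign and exponent layout of binary32, the
conversion reduces to rounding away the low 16 mantissa bits.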
There is inconsistency regarding whether or not +sme implies +sve2 and
whether +nosve2 implies +nosme. In particular, GCC 14 assumes the
dependency exists, and canonicalises target strings accordingly, whereas
LLVM treats the features as independent.
This patch removes the positive implication while retaining the negative
implication. This is the more permissive choice in each case, and
allows us to support target strings written with either interpretation
in mind.
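The resulting behaviour can be pictured with a toy model in Python. This is
only a sketch of the two rules described above, not the gas implementation:
the function name, the plain string set, and the "no" prefix handling are all
assumptions made for illustration.

```python
def apply_feature_modifiers(modifiers):
    """Apply +feat / +nofeat style modifiers left to right (toy model)."""
    enabled = set()
    for mod in modifiers:
        if mod.startswith("no"):
            feature = mod[2:]
            enabled.discard(feature)
            if feature == "sve2":
                # Negative implication retained: +nosve2 also drops sme.
                enabled.discard("sme")
        else:
            # Positive implication removed: +sme no longer adds sve2.
            enabled.add(mod)
    return enabled


if __name__ == "__main__":
    print(sorted(apply_feature_modifiers(["sme"])))
    print(sorted(apply_feature_modifiers(["sme", "sve2", "nosve2"])))
```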
This reduces our ability to detect invalid instructions, but we already
can't rely on this detection because gas doesn't know whether functions
might be executed in streaming mode and/or non-streaming mode.
The aarch64_feature_enable_set change is functionally redundant within
this patch. It is included because the longer term intention is to
instead remove the workaround in aarch64_parse_features, once the
internal feature checks have been modified to support having both
AARCH64_FEATURE_SME set and AARCH64_FEATURE_SVE unset.
Similarly, the dependency from +sme to +fp16 is currently redundant, but
this redundancy relies upon an incorrect dependency from +fcma to +fp16.
This can be fixed in the future, but it might require modifying internal
feature checks for a few FCMA instructions, so it's left unchanged for
now.
We agreed with LLVM that we shouldn't enforce the architectural
dependencies between fp8 multiplication features, so remove them.
Additionally, fix a typo in the gating for FEAT_SME_F8F16 instructions,
which were mistakenly gated by +sme-f8f32 instead. Until now this
mistake had been masked by the dependency between the features.
Like for RMPUPDATE, documentation is about to change as far as operands
are concerned. They're merely the other way around here.
While adjusting gas documentation, also add the missing RMPQUERY
counterparts there.
Do away with at least one of the limitations - all other targets permit
multiple values to be specified with a single directive. Re-arrange the
logic further to also overcome an internal error in
riscv_insert_uleb128_fixes(), as e.g. observed by the all/sleb128-2
testcase. This way there's also no need to parse expressions twice,
thus also not raising the same diagnostics (if any) twice.
Note how this addresses a pre-existing XFAIL (where the comment wasn't
really applicable either for RISC-V).
Also update documentation, also to mention that differences between
symbols may be used with .uleb128 (albeit I'm uncertain whether there
are limitations).
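For reference, the encodings produced by .uleb128 and .sleb128 for constant
operands can be sketched in Python as follows. The helper names are
hypothetical, and the sketch covers only constants, not the symbol
differences mentioned above.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 requires a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value):
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        done = bool((value == 0 and not (byte & 0x40))
                    or (value == -1 and (byte & 0x40)))
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def encode_directive_values(values, signed=False):
    """Concatenate encodings, as a single directive with several operands would."""
    encode = sleb128 if signed else uleb128
    return b"".join(encode(v) for v in values)


if __name__ == "__main__":
    print(encode_directive_values([1, 128, 624485]).hex())
```

With multiple operands accepted per directive, a list such as
`1, 128, 624485` simply yields the concatenation of the individual encodings.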
I have missed @tab for .gmiccs and .padlockphe2, so fix this doc error.
gas/ChangeLog:
* doc/c-i386.texi: Add the missing @tab for .gmiccs and
.padlockphe2
On casual reading of older gcc configure scripts it might be supposed
that the test for gas string merge support tries with %progbits after
a fail on ARM with @progbits. It doesn't succeed due to a bug. So to
support building of older gcc's for ARM without users having to edit
gcc sources, add a hack to gas. The hack can disappear in a few years
when building older gcc's likely requires other work too.
I've changed the docs to reflect what we actually allow for .section
syntax prior to this patch. (No way should this hack be documented as
allowed!)
PR 32491
* config/obj-elf.c (obj_elf_section): Allow missing entsize
for ARM gcc configure bug.
* doc/as.texi: Correct syntax of ELF .section directive.
* testsuite/gas/elf/string.s,
* testsuite/gas/elf/string.d: Test it.
Commit af3394d97a allowed sections
declared with "S" (SHF_STRINGS) to specify the entity size, but then
would warn if the entity size was omitted, as with the old syntax.
Unfortunately, since specifying the entity size is incompatible with
binutils 2.43 or earlier, this makes it impossible to specify a
strings section in source code without generating an assembly warning
(the new syntax isn't supported in older assemblers and the old syntax
generates warnings).
Nevertheless, the old code was wrong in that it did not set the entity
size at all, in contravention of the ELF specification (though to date
there are no known cases where this mattered outside of mergeable
sections).
Fix this by permitting the original syntax without a warning again,
but by defaulting the entity size to 1. This is compatible with the
most common case of strings being byte-based.
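The defaulting rule can be summarised with a small Python sketch. It models
only the strings case described above and nothing else about .section
parsing; effective_entsize is a hypothetical helper, not a gas function.

```python
def effective_entsize(flags, entsize=None):
    """Return the entity size recorded for a section (toy model).

    flags   -- the flag letters from the .section directive, e.g. "aS"
    entsize -- the explicit entity size, or None when it was omitted
    """
    if "S" in flags and entsize is None:
        # Old syntax: a strings section without an entity size defaults to 1,
        # matching the common case of byte-based strings.
        return 1
    return 0 if entsize is None else entsize


if __name__ == "__main__":
    print(effective_entsize("aS"))     # entity size omitted
    print(effective_entsize("aS", 2))  # explicit entity size kept
```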
Added some tests for the various flavours of declaration that we
support.
There are separate CPUID feature bits for SM2 and CCS instructions.
CCS is the acronym of Chinese Cipher System, it includes SM3 and SM4
instructions. This patch adds CpuGMISM2 and CpuGMICCS to replace CpuGMI on
corresponding instructions.
gas/ChangeLog:
* config/tc-i386.c: Add gmism2 and gmiccs to replace gmi.
* doc/c-i386.texi: Ditto.
opcodes/ChangeLog:
* i386-gen.c: Add GMISM2 and GMICCS to replace GMI.
* i386-opc.h (enum i386_cpu): Add CpuGMISM2 and CpuGMICCS to
replace CpuGMI.
* i386-opc.tbl: Replace GMI with GMISM2 on sm2 instruction. Replace GMI
with GMICCS on sm3 and sm4 instructions.
* i386-tbl.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-init.h: Ditto.
This patch adds support for the AMX-MOVRS feature. Unlike all the other
AMX insns in vector space where we pass vex_len_table before
vex_w_table, we first pass vex_w_table for tileloaddrs[,t1] to
align with the order in EVEX space. The reason why we first pass
vex_w_table in EVEX space is due to AMX-AVX512, where tcvtrowd2ps
and tilemovrow with r32 share the same opcode with tileloaddrs[,t1].
All of them have evex.w = 0 but with different evex.length. Re-doing
that shortly is not ideal.
APX_F extension is also implemented in this patch. The encoding will
be:
- EVEX.128.NP/66.MAP5.W0 F8/F9 !(11):rrr:100 for
T2RPNTLVW[Z0,Z1]RS[,T1] with NF=0.
- EVEX.128.F2/66.0F38.W0 4A !(11):rrr:100 for TILELOADDRS[,T1] with
NF=0.
For APX_F extension, we could not use APX_F(AMX_TRANSPOSE&AMX_MOVRS)
since the transformation could not be done. Instead, we will use
AMX_TRANSPOSE & APX_F(AMX_MOVRS). Thus, we should set AMX_TRANSPOSE
for "any" for cpu_flags in assembler. Since it will only affect the
cpu_flags_match, handle that there.
gas/ChangeLog:
* config/tc-i386.c (cpu_arch): Add amx_movrs.
(cpu_flags_match): Set any bitfield for multiple cpuid
enabled insns.
* doc/c-i386.texi: Document .amx_movrs.
* testsuite/gas/i386/x86-64.exp: Run AMX-MOVRS tests.
* testsuite/gas/i386/x86-64-amx-movrs-intel.d: New test.
* testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.d: Ditto.
* testsuite/gas/i386/x86-64-amx-movrs.s: Ditto.
opcodes/ChangeLog:
* i386-dis-evex-len.h (EVEX_LEN_0F384A_X86_64_W_0): New.
* i386-dis-evex-w.h (EVEX_W_0F384A_X86_64): Ditto.
* i386-dis-evex-x86-64.h (X86_64_EVEX_0F384A): Ditto.
* i386-dis-evex.h: New entry for AMX-MOVRS.
* i386-dis.c:
(PREFIX_VEX_0F384A_X86_64_L_0_W_0): New.
(PREFIX_VEX_MAP5_F8_X86_64_L_0_W_0): Ditto.
(PREFIX_VEX_MAP5_F9_X86_64_L_0_W_0): Ditto.
(X86_64_VEX_0F384A): Ditto.
(X86_64_VEX_MAP5_F8): Ditto.
(X86_64_VEX_MAP5_F9): Ditto.
(X86_64_EVEX_0F384A): Ditto.
(VEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_LEN_MAP5_F8_X86_64): Ditto.
(VEX_LEN_MAP5_F9_X86_64): Ditto.
(EVEX_LEN_0F384A_X86_64_W_0): Ditto.
(VEX_W_0F384A_X86_64): Ditto.
(VEX_W_MAP5_F8_X86_64): Ditto.
(VEX_W_MAP5_F9_X86_64): Ditto.
(EVEX_W_0F384A_X86_64): Ditto.
(prefix_table): New entry for AMX-MOVRS.
(x86_64_table): Ditto.
(vex_len_table): Ditto.
(vex_w_table): Ditto.
(map5_f8_opcode): New.
(map5_f9_opcode): Ditto.
(get_valid_dis386): Handle VEX_MAP5 opcode for AMX-MOVRS.
* i386-gen.c (isa_dependencies): Add AMX_MOVRS.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_MOVRS): New.
(i386_cpu_flags): Add cpuamx_movrs.
* i386-opc.tbl: Add AMX-MOVRS instructions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
This patch adds support for SME ZA-targeting non-widening BFloat16 instructions,
under tick FEAT_SME_B16B16 and command line flag "+sme-b16b16".
FEAT_SME_B16B16 implements FEAT_SME2 and FEAT_SVE_B16B16; accordingly,
"+sme-b16b16" enables "+sme2" and "+sve-b16b16".
Also the test files related to FEAT_SME_B16B16 are prefixed with sme-b16b16*,
e.g. sme-b16b16-1.s, sme-b16b16-1.d.
The spec for this feature and instructions is available here [1]:
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
In the current code, SVE2 Bfloat16 instructions are implemented with tick
FEAT_B16B16 and command line flag "+b16b16" and this feature was suspended
due to incomplete support.
In the new spec available here [1], FEAT_B16B16 is replaced with
FEAT_SVE_B16B16 and command line flag "+b16b16" is replaced with "+sve-b16b16".
Also the test files related to FEAT_SVE_B16B16 are prefixed with sve-b16b16*,
e.g. sve-b16b16-sve2-1.s, sve-b16b16-sve2-1.d.
This patch supports the SVE Z-targeting non-widening BFloat16 instructions
with command line flag "+sve-b16b16+sve2".
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SVE-Instructions?lang=en
This patch adds support for the FEAT_SME_F16F16 feature (Non-widening
half-precision FP16 to FP16 arithmetic for SME2), which is enabled
by passing the command line flag +sme-f16f16 to -march (which enables both
FEAT_SME2 and FEAT_SME_F16F16).
There are a couple of instructions (fadd and fsub variants) which should
be allowed by the assembler on passing either +sme-f16f16 or +sme-f8f16.
Those instructions are already supported in the current assembler; this
patch adds tests for those instructions as well.
This patch adds support for the AMX-FP8 feature. Since in the
foreseeable future only AMX-MOVRS will also use VEX_MAP5, we
do not add a table of 256 entries for now and instead handle it
just like MAP7.
gas/ChangeLog:
* config/tc-i386.c: Add amx_fp8.
* doc/c-i386.texi: Document .amx_fp8.
* testsuite/gas/i386/x86-64.exp: Run AMX-FP8 tests.
* testsuite/gas/i386/x86-64-amx-fp8-bad.d: New test.
* testsuite/gas/i386/x86-64-amx-fp8-bad.s: Ditto.
* testsuite/gas/i386/x86-64-amx-fp8-intel.d: Ditto.
* testsuite/gas/i386/x86-64-amx-fp8-inval.l: Ditto.
* testsuite/gas/i386/x86-64-amx-fp8-inval.s: Ditto.
* testsuite/gas/i386/x86-64-amx-fp8.d: Ditto.
* testsuite/gas/i386/x86-64-amx-fp8.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c (PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0): New.
(X86_64_VEX_MAP5_FD): Ditto.
(VEX_LEN_MAP5_FD_X86_64): Ditto.
(VEX_W_MAP5_FD_X86_64_L_0): Ditto.
(prefix_table): Add PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0.
(x86_64_table): Add X86_64_VEX_MAP5_FD.
(vex_len_table): Add VEX_LEN_MAP5_FD_X86_64.
(vex_w_table): Add VEX_W_MAP5_FD_X86_64_L_0.
* i386-gen.c: Add CPU_AMX_FP8_FLAGS and
CPU_ANY_AMX_FP8_FLAGS.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h: Add cpuamx_fp8.
* i386-opc.tbl: Add AMX_FP8 instructions.
* i386-tbl.h: Regenerated.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
This patch adds support for AMX-TRANSPOSE. Since AMX-TRANSPOSE
will very often be used with other CPUIDs, we put it into
CPU_FLAGS_COMMON.
To implement TMM pairs, we reuse ImplicitGroup and adjust the condition
in process_operands for the instructions.
APX_F extension is also handled in this patch, where it extends
T2RPNTLVW[Z0,Z1][,T1] to EVEX.128.NP/66.0F38.W0 6E/6F !(11):rrr:100
with NF=0.
Also, TTDPFP16PS should be based on AMX_FP16, not AMX_BF16 as ISE055 states.
This is to be fixed in ISE056.
gas/ChangeLog:
* config/tc-i386.c (cpu_arch): Add amx_transpose.
(_is_cpu): Ditto.
(process_operands): Adjust the condition for AMX-TRANSPOSE.
* doc/c-i386.texi: Document .amx_transpose.
* testsuite/gas/i386/x86-64.exp: Run AMX-TRANSPOSE tests.
* testsuite/gas/i386/x86-64-amx-transpose-bad.d: New test.
* testsuite/gas/i386/x86-64-amx-transpose-bad.s: Ditto.
* testsuite/gas/i386/x86-64-amx-transpose-intel.d: Ditto.
* testsuite/gas/i386/x86-64-amx-transpose-inval.l: Ditto.
* testsuite/gas/i386/x86-64-amx-transpose-inval.s: Ditto.
* testsuite/gas/i386/x86-64-amx-transpose.d: Ditto.
* testsuite/gas/i386/x86-64-amx-transpose.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c (MOD_VEX_0F386E_X86_64_W_0): New.
(MOD_VEX_0F386F_X86_64_W_0): Ditto.
(PREFIX_VEX_0F385F_X86_64_W_0_L_0): Ditto.
(PREFIX_VEX_0F386B_X86_64_W_0_L_0): Ditto.
(PREFIX_VEX_0F386E_X86_64_W_0_M_0_L_0): Ditto.
(PREFIX_VEX_0F386F_X86_64_W_0_M_0_L_0): Ditto.
(X86_64_VEX_0F385F): Ditto.
(X86_64_VEX_0F386B): Ditto.
(X86_64_VEX_0F386E): Ditto.
(X86_64_VEX_0F386F): Ditto.
(VEX_LEN_0F385F_X86_64_W_0): Ditto.
(VEX_LEN_0F386B_X86_64_W_0): Ditto.
(VEX_LEN_0F386E_X86_64_W_0_M_0): Ditto.
(VEX_LEN_0F386F_X86_64_W_0_M_0): Ditto.
(VEX_W_0F385F_X86_64): Ditto.
(VEX_W_0F386B_X86_64): Ditto.
(VEX_W_0F386E_X86_64): Ditto.
(VEX_W_0F386F_X86_64): Ditto.
(mod_table): Add MOD_VEX_0F386E_X86_64_W_0,
MOD_VEX_0F386F_X86_64_W_0.
(prefix_table): Add PREFIX_VEX_0F386E_X86_64_W_0_M_0_L_0,
PREFIX_VEX_0F386F_X86_64_W_0_M_0_L_0.
Add new instructions for PREFIX_VEX_0F386C_X86_64_W_0_L_0.
(x86_64_table): Add X86_64_VEX_0F385F, X86_64_VEX_0F386B,
X86_64_VEX_0F386E, X86_64_VEX_0F386F.
(vex_len_table): Add VEX_LEN_0F385F_X86_64_W_0,
VEX_LEN_0F386B_X86_64_W_0, VEX_LEN_0F386E_X86_64_W_0_M_0,
VEX_LEN_0F386F_X86_64_W_0_M_0.
(vex_w_table): Add VEX_W_0F385F_X86_64, VEX_W_0F386B_X86_64,
VEX_W_0F386E_X86_64, VEX_W_0F386F_X86_64.
* i386-gen.c (cpu_flag_init): Add AMX_TRANSPOSE.
(cpu_flags): Add CpuAMX_TRANSPOSE.
* i386-init.h: Regenerated.
* i386-mnem.h: Ditto.
* i386-opc.h (CpuAMX_TRANSPOSE): New.
(i386_cpu): Add cpuamx_transpose.
* i386-opc.tbl: Add AMX-TRANSPOSE instructions.
* i386-tbl.h: Regenerated.
Co-authored-by: Hu, Lin1 <lin1.hu@intel.com>
gas currently emits informational messages carrying context information
alongside warnings. In the context of system register tests in the AArch64
backend, these messages pollute the tests when checking for error message
patterns in stderr output.
This patch aims at providing two new flags while preserving the existing
behavior if none of the options is provided.
* --info, similar to the existing --warn flag to enable diagnostic
informational messages (default behavior).
* --no-info, similar to the existing --no-warn flag to disable diagnostic
informational messages.
It also adds the flags to the existing documentation and command manual.
The Nios II architecture has been EOL'ed by the vendor. This patch
removes all binutils, bfd, gas, and opcodes support for this
target with the exception of the readelf utility. (The ELF EM_*
number remains valid and the relocation definitions from the Nios II
ABI will never change in future, so retaining the readelf support
seems consistent with its purpose as a utility that tries to parse the
headers in any ELF file provided as an argument regardless of target.)
It's not overly useful without it, but the spec doesn't name any
dependency between the two. People may want to use it for purely
informational purposes, for example. Adjust, in particular, entity size
processing to be engaged if either flag is set, as mandated by the spec.
First of all make the declarations globally visible, such that producer
and consumer actually share them.
For the latter two simply add const (as PPC already had it), while for
the former achieve the effect by converting to an array: There's no need
for the extra level of indirection.