[binutils][aarch64] Bfloat16 enablement [2/X]

Hi,

This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.

This patch introduces the following BFloat16 instructions to the
aarch64 backend: bfdot, bfmmla, bfcvt, bfcvtnt, bfcvtn, bfcvtn2
and bfmlal[b/t].

Committed on behalf of Mihail Ionescu.

gas/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>
2019-11-07  Matthew Malcomson  <matthew.malcomson@arm.com>

	* config/tc-aarch64.c (vectype_to_qualifier): Special case the
	S_2H operand qualifier.
	* doc/c-aarch64.texi: Document bf16 extension.
	* testsuite/gas/aarch64/bfloat16.d: New test.
	* testsuite/gas/aarch64/bfloat16.s: New test.
	* testsuite/gas/aarch64/illegal-bfloat16.d: New test.
	* testsuite/gas/aarch64/illegal-bfloat16.l: New test.
	* testsuite/gas/aarch64/illegal-bfloat16.s: New test.
	* testsuite/gas/aarch64/sve-bfloat-movprfx.s: New test.
	* testsuite/gas/aarch64/sve-bfloat-movprfx.d: New test.

include/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>
2019-11-07  Matthew Malcomson  <matthew.malcomson@arm.com>

	* opcode/aarch64.h (AARCH64_FEATURE_BFLOAT16): New feature macros.
	(AARCH64_ARCH_V8_6): Include BFloat16 feature macros.
	(enum aarch64_opnd_qualifier): Introduce new operand qualifier
	AARCH64_OPND_QLF_S_2H.
	(enum aarch64_insn_class): Introduce new class "bfloat16".
	(BFLOAT16_SVE_INSNC): New feature set for bfloat16
	instructions to support the movprfx constraint.

opcodes/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>
2019-11-07  Matthew Malcomson  <matthew.malcomson@arm.com>

	* aarch64-asm.c (aarch64_ins_reglane): Use AARCH64_OPND_QLF_S_2H
	in reglane special case.
	* aarch64-dis-2.c (aarch64_opcode_lookup_1,
	aarch64_find_next_opcode): Account for new instructions.
	* aarch64-dis.c (aarch64_ext_reglane): Use AARCH64_OPND_QLF_S_2H
	in reglane special case.
	* aarch64-opc.c (struct operand_qualifier_data): Add data for
	new AARCH64_OPND_QLF_S_2H qualifier.
	* aarch64-tbl.h (QL_BFDOT, QL_BFDOT64, QL_BFDOT64I, QL_BFMMLA2,
	QL_BFCVT64, QL_BFCVTN64, QL_BFCVTN2_64): New qualifiers.
	(aarch64_feature_bfloat16, aarch64_feature_bfloat16_sve): New
	feature sets.
	(BFLOAT_SVE, BFLOAT): New feature set macros.
	(BFLOAT_SVE_INSN, BFLOAT_INSN): New macros to define BFloat16
	instructions.
	(aarch64_opcode_table): Define new instructions bfdot,
	bfmmla, bfcvt, bfcvtnt, bfdot, bfdot, bfcvtn, bfmlal[b/t],
	bfcvtn2, bfcvt.

Regression tested on aarch64-elf.

Is it ok for trunk?

Regards,
Mihail
Matthew Malcomson
2019-11-07 16:38:59 +00:00
parent 8ae2d3d9ea
commit df6780137d
18 changed files with 773 additions and 80 deletions

@ -1,3 +1,17 @@
+2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
+2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
+* config/tc-aarch64.c (vectype_to_qualifier): Special case the
+S_2H operand qualifier.
+* doc/c-aarch64.texi: Document bf16 extension.
+* testsuite/gas/aarch64/bfloat16.d: New test.
+* testsuite/gas/aarch64/bfloat16.s: New test.
+* testsuite/gas/aarch64/illegal-bfloat16.d: New test.
+* testsuite/gas/aarch64/illegal-bfloat16.l: New test.
+* testsuite/gas/aarch64/illegal-bfloat16.s: New test.
+* testsuite/gas/aarch64/sve-bfloat-movprfx.s: New test.
+* testsuite/gas/aarch64/sve-bfloat-movprfx.d: New test.
 2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
 2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>

@ -5130,6 +5130,10 @@ vectype_to_qualifier (const struct vector_type_el *vectype)
   if (vectype->type == NT_b && vectype->width == 4)
     return AARCH64_OPND_QLF_S_4B;
+  /* Special case S_2H. */
+  if (vectype->type == NT_h && vectype->width == 2)
+    return AARCH64_OPND_QLF_S_2H;
   /* Vector element register. */
   return AARCH64_OPND_QLF_S_B + vectype->type;
 }
@ -9003,6 +9007,8 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
   | AARCH64_FEATURE_SHA3, 0)},
   {"sve2-bitperm", AARCH64_FEATURE (AARCH64_FEATURE_SVE2_BITPERM, 0),
   AARCH64_FEATURE (AARCH64_FEATURE_SVE2, 0)},
+  {"bf16", AARCH64_FEATURE (AARCH64_FEATURE_BFLOAT16, 0),
+  AARCH64_ARCH_NONE},
   {NULL, AARCH64_ARCH_NONE, AARCH64_ARCH_NONE},
 };
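
To illustrate the special case added to vectype_to_qualifier, here is a minimal standalone sketch of the same dispatch. The enum names, struct vec_type and to_qualifier are hypothetical stand-ins, not the binutils definitions; only the shape of the logic (test the 4B and 2H pairs first, then fall back to an offset from the scalar byte qualifier) is taken from the hunk above.

```c
/* Hypothetical, simplified stand-ins for the binutils types.  */
enum elem_type { ELEM_B, ELEM_H, ELEM_S, ELEM_D, ELEM_Q };

enum qualifier
{
  QLF_S_B, QLF_S_H, QLF_S_S, QLF_S_D, QLF_S_Q,
  QLF_S_4B,	/* 4 x 1 byte, as in "v0.4b[1]".  */
  QLF_S_2H	/* 2 x 2 byte, as in "v27.2h[3]".  */
};

struct vec_type
{
  enum elem_type type;
  unsigned width;	/* 0 when no element count was written.  */
};

/* Map an element type (plus optional count) to an operand qualifier.
   The two special pairs are checked before the generic offset.  */
static enum qualifier
to_qualifier (const struct vec_type *v)
{
  if (v->type == ELEM_B && v->width == 4)
    return QLF_S_4B;
  if (v->type == ELEM_H && v->width == 2)
    return QLF_S_2H;
  return (enum qualifier) (QLF_S_B + v->type);
}
```

In this sketch an operand written as `.2h` would map to QLF_S_2H, while a plain `.h` element would still map to QLF_S_H.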

@ -144,6 +144,8 @@ automatically cause those extensions to be disabled.
 @multitable @columnfractions .12 .17 .17 .54
 @headitem Extension @tab Minimum Architecture @tab Enabled by default
 @tab Description
+@item @code{bf16} @tab ARMv8.2-A @tab ARMv8.6-A or later
+@tab Enable BFloat16 extension.
 @item @code{compnum} @tab ARMv8.2-A @tab ARMv8.3-A or later
 @tab Enable the complex number SIMD extensions. This implies
 @code{fp16} and @code{simd}.

@ -0,0 +1,56 @@
#as: -march=armv8.6-a+bf16+sve
#objdump: -dr
.*: file format .*
Disassembly of section \.text:
0000000000000000 <\.text>:
*[0-9a-f]+: 647b82b1 bfdot z17\.s, z21\.h, z27\.h
*[0-9a-f]+: 64608000 bfdot z0\.s, z0\.h, z0\.h
*[0-9a-f]+: 647d42b1 bfdot z17\.s, z21\.h, z5\.h\[3\]
*[0-9a-f]+: 64784000 bfdot z0\.s, z0\.h, z0\.h\[3\]
*[0-9a-f]+: 64604000 bfdot z0\.s, z0\.h, z0\.h\[0\]
*[0-9a-f]+: 647be6b1 bfmmla z17\.s, z21\.h, z27\.h
*[0-9a-f]+: 6460e400 bfmmla z0\.s, z0\.h, z0\.h
*[0-9a-f]+: 658ab6b1 bfcvt z17\.h, p5/m, z21\.s
*[0-9a-f]+: 658aa000 bfcvt z0\.h, p0/m, z0\.s
*[0-9a-f]+: 648ab6b1 bfcvtnt z17\.h, p5/m, z21\.s
*[0-9a-f]+: 648aa000 bfcvtnt z0\.h, p0/m, z0\.s
*[0-9a-f]+: 64fb86b1 bfmlalt z17\.s, z21\.h, z27\.h
*[0-9a-f]+: 64e08400 bfmlalt z0\.s, z0\.h, z0\.h
*[0-9a-f]+: 64fb82b1 bfmlalb z17\.s, z21\.h, z27\.h
*[0-9a-f]+: 64e08000 bfmlalb z0\.s, z0\.h, z0\.h
*[0-9a-f]+: 64e546b1 bfmlalt z17\.s, z21\.h, z5\.h\[0\]
*[0-9a-f]+: 64f84c00 bfmlalt z0\.s, z0\.h, z0\.h\[7\]
*[0-9a-f]+: 64e542b1 bfmlalb z17\.s, z21\.h, z5\.h\[0\]
*[0-9a-f]+: 64f84800 bfmlalb z0\.s, z0\.h, z0\.h\[7\]
*[0-9a-f]+: 2e5bfeb1 bfdot v17\.2s, v21\.4h, v27\.4h
*[0-9a-f]+: 2e40fc00 bfdot v0\.2s, v0\.4h, v0\.4h
*[0-9a-f]+: 6e5bfeb1 bfdot v17\.4s, v21\.8h, v27\.8h
*[0-9a-f]+: 6e40fc00 bfdot v0\.4s, v0\.8h, v0\.8h
*[0-9a-f]+: 0f7bfab1 bfdot v17\.2s, v21\.4h, v27\.2h\[3\]
*[0-9a-f]+: 0f60f800 bfdot v0\.2s, v0\.4h, v0\.2h\[3\]
*[0-9a-f]+: 4f7bfab1 bfdot v17\.4s, v21\.8h, v27\.2h\[3\]
*[0-9a-f]+: 4f60f800 bfdot v0\.4s, v0\.8h, v0\.2h\[3\]
*[0-9a-f]+: 0f5bf2b1 bfdot v17\.2s, v21\.4h, v27\.2h\[0\]
*[0-9a-f]+: 0f40f000 bfdot v0\.2s, v0\.4h, v0\.2h\[0\]
*[0-9a-f]+: 4f5bf2b1 bfdot v17\.4s, v21\.8h, v27\.2h\[0\]
*[0-9a-f]+: 4f40f000 bfdot v0\.4s, v0\.8h, v0\.2h\[0\]
*[0-9a-f]+: 6e5beeb1 bfmmla v17\.4s, v21\.8h, v27\.8h
*[0-9a-f]+: 6e40ec00 bfmmla v0\.4s, v0\.8h, v0\.8h
*[0-9a-f]+: 2edbfeb1 bfmlalb v17\.4s, v21\.8h, v27\.8h
*[0-9a-f]+: 2ec0fc00 bfmlalb v0\.4s, v0\.8h, v0\.8h
*[0-9a-f]+: 6edbfeb1 bfmlalt v17\.4s, v21\.8h, v27\.8h
*[0-9a-f]+: 6ec0fc00 bfmlalt v0\.4s, v0\.8h, v0\.8h
*[0-9a-f]+: 0fcff2b1 bfmlalb v17\.4s, v21\.8h, v15\.h\[0\]
*[0-9a-f]+: 0ff0f800 bfmlalb v0\.4s, v0\.8h, v0\.h\[7\]
*[0-9a-f]+: 4fcff2b1 bfmlalt v17\.4s, v21\.8h, v15\.h\[0\]
*[0-9a-f]+: 4ff0f800 bfmlalt v0\.4s, v0\.8h, v0\.h\[7\]
*[0-9a-f]+: 0ea16ab1 bfcvtn v17\.4h, v21\.4s
*[0-9a-f]+: 0ea16800 bfcvtn v0\.4h, v0\.4s
*[0-9a-f]+: 4ea16ab1 bfcvtn2 v17\.8h, v21\.4s
*[0-9a-f]+: 4ea16800 bfcvtn2 v0\.8h, v0\.4s
*[0-9a-f]+: 1e6342b1 bfcvt h17, s21
*[0-9a-f]+: 1e634000 bfcvt h0, s0

@ -0,0 +1,70 @@
/* The instructions with non-zero register numbers are there to ensure we have
the correct argument positioning (i.e. check that the first argument is at
the end of the word etc).
The instructions with all-zero register numbers are to ensure the previous
encoding didn't just "happen" to fit -- so that if we change the registers
that changes the correct part of the word.
Each of the numbered patterns begin and end with a 1, so we can replace
them with all-zeros and see the entire range has changed. */
// SVE
bfdot z17.s, z21.h, z27.h
bfdot z0.s, z0.h, z0.h
bfdot z17.s, z21.h, z5.h[3]
bfdot z0.s, z0.h, z0.h[3]
bfdot z0.s, z0.h, z0.h[0]
bfmmla z17.s, z21.h, z27.h
bfmmla z0.s, z0.h, z0.h
bfcvt z17.h, p5/m, z21.s
bfcvt z0.h, p0/m, z0.s
bfcvtnt z17.h, p5/m, z21.s
bfcvtnt z0.h, p0/m, z0.s
bfmlalt z17.s, z21.h, z27.h
bfmlalt z0.s, z0.h, z0.h
bfmlalb z17.s, z21.h, z27.h
bfmlalb z0.s, z0.h, z0.h
bfmlalt z17.s, z21.h, z5.h[0]
bfmlalt z0.s, z0.h, z0.h[7]
bfmlalb z17.s, z21.h, z5.h[0]
bfmlalb z0.s, z0.h, z0.h[7]
// SIMD
bfdot v17.2s, v21.4h, v27.4h
bfdot v0.2s, v0.4h, v0.4h
bfdot v17.4s, v21.8h, v27.8h
bfdot v0.4s, v0.8h, v0.8h
bfdot v17.2s, v21.4h, v27.2h[3]
bfdot v0.2s, v0.4h, v0.2h[3]
bfdot v17.4s, v21.8h, v27.2h[3]
bfdot v0.4s, v0.8h, v0.2h[3]
bfdot v17.2s, v21.4h, v27.2h[0]
bfdot v0.2s, v0.4h, v0.2h[0]
bfdot v17.4s, v21.8h, v27.2h[0]
bfdot v0.4s, v0.8h, v0.2h[0]
bfmmla v17.4s, v21.8h, v27.8h
bfmmla v0.4s, v0.8h, v0.8h
bfmlalb v17.4s, v21.8h, v27.8h
bfmlalb v0.4s, v0.8h, v0.8h
bfmlalt v17.4s, v21.8h, v27.8h
bfmlalt v0.4s, v0.8h, v0.8h
bfmlalb v17.4s, v21.8h, v15.h[0]
bfmlalb v0.4s, v0.8h, v0.h[7]
bfmlalt v17.4s, v21.8h, v15.h[0]
bfmlalt v0.4s, v0.8h, v0.h[7]
bfcvtn v17.4h, v21.4s
bfcvtn v0.4h, v0.4s
bfcvtn2 v17.8h, v21.4s
bfcvtn2 v0.8h, v0.4s
bfcvt h17, s21
bfcvt h0, s0
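
The comment at the top of this test explains why each instruction appears twice, once with distinctive register numbers and once with all zeros. A small sketch of that check follows: XOR the two expected encodings and read the register numbers back out of the bits that changed. The helper names field and changed_bits are hypothetical, and the field positions (bits 0-4, 5-9 and 16-20) are an assumption inferred from the expected encodings in bfloat16.d, not taken from the opcode table.

```c
#include <stdint.h>

/* Extract WIDTH bits of WORD starting at bit LSB.  */
static uint32_t
field (uint32_t word, unsigned lsb, unsigned width)
{
  return (word >> lsb) & ((UINT32_C (1) << width) - 1);
}

/* Bits that differ between the encoding with distinctive registers
   and the encoding with every register set to zero.  */
static uint32_t
changed_bits (uint32_t with_regs, uint32_t all_zero)
{
  return with_regs ^ all_zero;
}
```

For the first pair in the test, `bfdot z17.s, z21.h, z27.h` (647b82b1) against `bfdot z0.s, z0.h, z0.h` (64608000), the changed bits are 0x001b02b1, and the three assumed fields read back as 17, 21 and 27.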

@ -0,0 +1,4 @@
#name: Illegal Bfloat16 instructions
#as: -march=armv8.6-a+bf16+sve
#source: illegal-bfloat16.s
#error_output: illegal-bfloat16.l

@ -0,0 +1,95 @@
[^ :]+: Assembler messages:
[^ :]+:[0-9]+: Error: operand mismatch -- `bfdot z0\.s,z1\.h,z2\.s'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfdot z0\.s, z1\.h, z2\.h
[^ :]+:[0-9]+: Error: operand mismatch -- `bfdot z0\.s,z1\.h,z3\.s\[3\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfdot z0\.s, z1\.h, z3\.h\[3\]
[^ :]+:[0-9]+: Error: register element index out of range 0 to 3 at operand 3 -- `bfdot z0\.s,z1\.h,z3\.h\[4\]'
[^ :]+:[0-9]+: Error: z0-z7 expected at operand 3 -- `bfdot z0\.s,z1\.h,z8\.h\[3\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmmla z0\.s,z1\.h,z2\.s'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmmla z0\.s, z1\.h, z2\.h
[^ :]+:[0-9]+: Error: operand mismatch -- `bfcvt z0\.h,p1/z,z2\.s'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfcvt z0\.h, p1/m, z2\.s
[^ :]+:[0-9]+: Error: operand mismatch -- `bfcvt z0\.h,p1/m,z2\.h'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfcvt z0\.h, p1/m, z2\.s
[^ :]+:[0-9]+: Error: operand mismatch -- `bfcvtnt z0\.h,p1/z,z2\.s'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfcvtnt z0\.h, p1/m, z2\.s
[^ :]+:[0-9]+: Error: operand mismatch -- `bfcvtnt z0\.h,p1/m,z2\.h'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfcvtnt z0\.h, p1/m, z2\.s
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalt z0\.s,z0\.h,z0\.s'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalt z0\.s, z0\.h, z0\.h
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalt z32\.s,z0\.h,z0\.h'
[^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `bfmlalt z0\.s,z32\.h,z0\.h'
[^ :]+:[0-9]+: Error: operand 3 must be an indexed SVE vector register -- `bfmlalt z0\.s,z0\.h,z32\.h'
[^ :]+:[0-9]+: Error: register element index out of range 0 to 7 at operand 3 -- `bfmlalt z0\.s,z0\.h,z0\.h\[8\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalt z0\.s,z0\.h,z0\.s\[0\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalt z0\.s, z0\.h, z0\.h\[0\]
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalt z32\.s,z0\.h,z0\.h\[0\]'
[^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `bfmlalt z0\.s,z32\.h,z0\.h\[0\]'
[^ :]+:[0-9]+: Error: z0-z7 expected at operand 3 -- `bfmlalt z0\.s,z0\.h,z8\.h\[0\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalb z0\.s,z0\.h,z0\.s'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalb z0\.s, z0\.h, z0\.h
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalb z32\.s,z0\.h,z0\.h'
[^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `bfmlalb z0\.s,z32\.h,z0\.h'
[^ :]+:[0-9]+: Error: operand 3 must be an indexed SVE vector register -- `bfmlalb z0\.s,z0\.h,z32\.h'
[^ :]+:[0-9]+: Error: register element index out of range 0 to 7 at operand 3 -- `bfmlalb z0\.s,z0\.h,z0\.h\[8\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalb z0\.s,z0\.h,z0\.s\[0\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalb z0\.s, z0\.h, z0\.h\[0\]
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalb z32\.s,z0\.h,z0\.h\[0\]'
[^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `bfmlalb z0\.s,z32\.h,z0\.h\[0\]'
[^ :]+:[0-9]+: Error: z0-z7 expected at operand 3 -- `bfmlalb z0\.s,z0\.h,z8\.h\[0\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfdot v0\.2s,v1\.4h,v2\.2s\[3\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfdot v0\.2s, v1\.4h, v2\.2h\[3\]
[^ :]+:[0-9]+: Info: other valid variant\(s\):
[^ :]+:[0-9]+: Info: bfdot v0\.4s, v1\.8h, v2\.2h\[3\]
[^ :]+:[0-9]+: Error: register element index out of range 0 to 3 at operand 3 -- `bfdot v0\.4s,v1\.8h,v2\.2h\[4\]'
[^ :]+:[0-9]+: Error: invalid element size 8 and vector size combination s at operand 3 -- `bfmmla v0\.4s,v1\.8h,v2\.8s'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmmla v0\.4s,v1\.4h,v2\.8h'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmmla v0\.4s, v1\.8h, v2\.8h
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalb v0\.4s,v0\.4h,v0\.8h'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalb v0\.4s, v0\.8h, v0\.8h
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalb v32\.4s,v0\.8h,v0\.8h'
[^ :]+:[0-9]+: Error: operand 2 must be a SIMD vector register -- `bfmlalb v0\.4s,v32\.8h,v0\.8h'
[^ :]+:[0-9]+: Error: operand 3 must be a SIMD vector register -- `bfmlalb v0\.4s,v0\.8h,v32\.8h'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalt v0\.4s,v0\.8h,v0\.4h'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalt v0\.4s, v0\.8h, v0\.8h
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalt v32\.4s,v0\.8h,v0\.8h'
[^ :]+:[0-9]+: Error: operand 2 must be a SIMD vector register -- `bfmlalt v0\.4s,v32\.8h,v0\.8h'
[^ :]+:[0-9]+: Error: operand 3 must be a SIMD vector register -- `bfmlalt v0\.4s,v0\.8h,v32\.8h'
[^ :]+:[0-9]+: Error: register element index out of range 0 to 7 at operand 3 -- `bfmlalb v0\.4s,v0\.8h,v0\.h\[8\]'
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalb v32\.4s,v0\.8h,v0\.h\[0\]'
[^ :]+:[0-9]+: Error: operand 2 must be a SIMD vector register -- `bfmlalb v0\.4s,v32\.8h,v0\.h\[0\]'
[^ :]+:[0-9]+: Error: register number out of range 0 to 15 at operand 3 -- `bfmlalb v0\.4s,v0\.8h,v16\.h\[0\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalb v0\.4s,v0\.4h,v0\.h\[0\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalb v0\.4s, v0\.8h, v0\.h\[0\]
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalb v0\.4s,v0\.8h,v0\.s\[0\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalb v0\.4s, v0\.8h, v0\.h\[0\]
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalt v0\.4s,v0\.8h,v0\.s\[0\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalt v0\.4s, v0\.8h, v0\.h\[0\]
[^ :]+:[0-9]+: Error: operand mismatch -- `bfmlalt v0\.4s,v0\.4h,v0\.h\[0\]'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfmlalt v0\.4s, v0\.8h, v0\.h\[0\]
[^ :]+:[0-9]+: Error: register element index out of range 0 to 7 at operand 3 -- `bfmlalt v0\.4s,v0\.8h,v0\.h\[8\]'
[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `bfmlalt v32\.4s,v0\.8h,v0\.h\[0\]'
[^ :]+:[0-9]+: Error: operand 2 must be a SIMD vector register -- `bfmlalt v0\.4s,v32\.8h,v0\.h\[0\]'
[^ :]+:[0-9]+: Error: register number out of range 0 to 15 at operand 3 -- `bfmlalt v0\.4s,v0\.8h,v16\.h\[0\]'
[^ :]+:[0-9]+: Error: operand mismatch -- `bfcvt h0,h1'
[^ :]+:[0-9]+: Info: did you mean this\?
[^ :]+:[0-9]+: Info: bfcvt h0, s1

@ -0,0 +1,67 @@
// SVE
bfdot z0.s, z1.h, z2.s // Fails from size types
bfdot z0.s, z1.h, z3.s[3] // Fails from size types
bfdot z0.s, z1.h, z3.h[4] // Fails from index size
bfdot z0.s, z1.h, z8.h[3] // Fails from vector number
bfmmla z0.s, z1.h, z2.s // Fails from size types
bfcvt z0.h, p1/z, z2.s // Fails from merge type
bfcvt z0.h, p1/m, z2.h // Fails from size type
bfcvtnt z0.h, p1/z, z2.s // Fails from merge type
bfcvtnt z0.h, p1/m, z2.h // Fails from size type
bfmlalt z0.s, z0.h, z0.s // Fails from size type
bfmlalt z32.s, z0.h, z0.h
bfmlalt z0.s, z32.h, z0.h
bfmlalt z0.s, z0.h, z32.h
bfmlalt z0.s, z0.h, z0.h[8] // Fails from index size
bfmlalt z0.s, z0.h, z0.s[0] // Fails from size type
bfmlalt z32.s, z0.h, z0.h[0]
bfmlalt z0.s, z32.h, z0.h[0]
bfmlalt z0.s, z0.h, z8.h[0] // Fails from vector index
bfmlalb z0.s, z0.h, z0.s // Fails from size type
bfmlalb z32.s, z0.h, z0.h
bfmlalb z0.s, z32.h, z0.h
bfmlalb z0.s, z0.h, z32.h
bfmlalb z0.s, z0.h, z0.h[8] // Fails from index size
bfmlalb z0.s, z0.h, z0.s[0] // Fails from size type
bfmlalb z32.s, z0.h, z0.h[0]
bfmlalb z0.s, z32.h, z0.h[0]
bfmlalb z0.s, z0.h, z8.h[0] // Fails from vector index
// SIMD
bfdot v0.2s, v1.4h, v2.2s[3] // Fails from size types
bfdot v0.4s, v1.8h, v2.2h[4] // Fails from index size
bfmmla v0.4s, v1.8h, v2.8s // Fails from size types
bfmmla v0.4s, v1.4h, v2.8h // Fails from size types
bfmlalb v0.4s, v0.4h, v0.8h
bfmlalb v32.4s, v0.8h, v0.8h
bfmlalb v0.4s, v32.8h, v0.8h
bfmlalb v0.4s, v0.8h, v32.8h
bfmlalt v0.4s, v0.8h, v0.4h
bfmlalt v32.4s, v0.8h, v0.8h
bfmlalt v0.4s, v32.8h, v0.8h
bfmlalt v0.4s, v0.8h, v32.8h
bfmlalb v0.4s, v0.8h, v0.h[8]
bfmlalb v32.4s, v0.8h, v0.h[0]
bfmlalb v0.4s, v32.8h, v0.h[0]
bfmlalb v0.4s, v0.8h, v16.h[0]
bfmlalb v0.4s, v0.4h, v0.h[0]
bfmlalb v0.4s, v0.8h, v0.s[0]
bfmlalt v0.4s, v0.8h, v0.s[0]
bfmlalt v0.4s, v0.4h, v0.h[0]
bfmlalt v0.4s, v0.8h, v0.h[8]
bfmlalt v32.4s, v0.8h, v0.h[0]
bfmlalt v0.4s, v32.8h, v0.h[0]
bfmlalt v0.4s, v0.8h, v16.h[0]
bfcvt h0, h1 // Fails from size types

@ -0,0 +1,27 @@
#as: -march=armv8.6-a+bf16+sve
#objdump: -dr
.* file format .*
Disassembly of section \.text:
0000000000000000 <\.text>:
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 64638040 bfdot z0\.s, z2\.h, z3\.h
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 64634040 bfdot z0\.s, z2\.h, z3\.h\[0\]
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 6463e440 bfmmla z0\.s, z2\.h, z3\.h
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 64e38040 bfmlalb z0\.s, z2\.h, z3\.h
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 64e38440 bfmlalt z0\.s, z2\.h, z3\.h
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 64e34040 bfmlalb z0\.s, z2\.h, z3\.h\[0\]
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 64e34440 bfmlalt z0\.s, z2\.h, z3\.h\[0\]
*[0-9a-f]+: 0420bc20 movprfx z0, z1
*[0-9a-f]+: 658aa040 bfcvt z0\.h, p0/m, z2\.s
*[0-9a-f]+: 04512020 movprfx z0\.h, p0/m, z1\.h
*[0-9a-f]+: 658aa040 bfcvt z0\.h, p0/m, z2\.s

@ -0,0 +1,31 @@
.text
.arch armv8.2-a+bf16+sve
movprfx z0, z1
bfdot z0.s, z2.h, z3.h
movprfx z0, z1
bfdot z0.s, z2.h, z3.h[0]
movprfx z0, z1
bfmmla z0.s, z2.h, z3.h
movprfx z0, z1
bfmlalb z0.s, z2.h, z3.h
movprfx z0, z1
bfmlalt z0.s, z2.h, z3.h
movprfx z0, z1
bfmlalb z0.s, z2.h, z3.h[0]
movprfx z0, z1
bfmlalt z0.s, z2.h, z3.h[0]
# Unpredicated movprfx + bfcvt
movprfx z0, z1
bfcvt z0.h, p0/m, z2.s
# Predicated movprfx + bfcvt
movprfx z0.h, p0/m, z1.h
bfcvt z0.h, p0/m, z2.s

@ -1,3 +1,14 @@
+2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
+2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
+* opcode/aarch64.h (AARCH64_FEATURE_BFLOAT16): New feature macros.
+(AARCH64_ARCH_V8_6): Include BFloat16 feature macros.
+(enum aarch64_opnd_qualifier): Introduce new operand qualifier
+AARCH64_OPND_QLF_S_2H.
+(enum aarch64_insn_class): Introduce new class "bfloat16".
+(BFLOAT16_SVE_INSNC): New feature set for bfloat16
+instructions to support the movprfx constraint.
 2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
 2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>

@ -64,6 +64,7 @@ typedef uint32_t aarch64_insn;
 #define AARCH64_FEATURE_F16_FML 0x1000000000ULL /* v8.2 FP16FML ins. */
 #define AARCH64_FEATURE_V8_5 0x2000000000ULL /* ARMv8.5 processors. */
 #define AARCH64_FEATURE_V8_6 0x00000002 /* ARMv8.6 processors. */
+#define AARCH64_FEATURE_BFLOAT16 0x00000004 /* Bfloat16 insns. */
 /* Flag Manipulation insns. */
 #define AARCH64_FEATURE_FLAGMANIP 0x4000000000ULL
@ -131,7 +132,8 @@ typedef uint32_t aarch64_insn;
   | AARCH64_FEATURE_ID_PFR2 \
   | AARCH64_FEATURE_SSBS)
 #define AARCH64_ARCH_V8_6 AARCH64_FEATURE (AARCH64_ARCH_V8_5, \
-  AARCH64_FEATURE_V8_6)
+  AARCH64_FEATURE_V8_6 \
+  | AARCH64_FEATURE_BFLOAT16)
 #define AARCH64_ARCH_NONE AARCH64_FEATURE (0, 0)
 #define AARCH64_ANY AARCH64_FEATURE (-1, 0) /* Any basic core. */
@ -462,11 +464,13 @@ enum aarch64_opnd_qualifier
   AARCH64_OPND_QLF_S_S,
   AARCH64_OPND_QLF_S_D,
   AARCH64_OPND_QLF_S_Q,
-  /* This type qualifier has a special meaning in that it means that 4 x 1 byte
-     are selected by the instruction. Other than that it has no difference
-     with AARCH64_OPND_QLF_S_B in encoding. It is here purely for syntactical
-     reasons and is an exception from normal AArch64 disassembly scheme. */
+  /* These type qualifiers have a special meaning in that they mean 4 x 1 byte
+     or 2 x 2 byte are selected by the instruction. Other than that they have
+     no difference with AARCH64_OPND_QLF_S_B in encoding. They are here purely
+     for syntactical reasons and is an exception from normal AArch64
+     disassembly scheme. */
   AARCH64_OPND_QLF_S_4B,
+  AARCH64_OPND_QLF_S_2H,
   /* Qualifying an operand which is a SIMD vector register or a SIMD vector
      register list; indicating register shape.
@ -609,6 +613,7 @@ enum aarch64_insn_class
   cryptosm3,
   cryptosm4,
   dotproduct,
+  bfloat16,
 };
 /* Opcode enumerators. */
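
The AARCH64_ARCH_V8_6 change above means that selecting Armv8.6-A turns the BFloat16 bit on without an explicit `+bf16`. A rough sketch of that kind of bitmask test is shown below. It assumes a single 32-bit feature word and uses hypothetical names (FEATURE_V8_6, FEATURE_BFLOAT16, ARCH_V8_6_BITS, has_features); the real AARCH64_FEATURE macro and feature-set type in binutils are more involved, and only the two bit values are copied from the hunk.

```c
#include <stdint.h>

/* Bit values as they appear in the hunk; everything else is a
   simplified, hypothetical model.  */
#define FEATURE_V8_6     0x00000002u
#define FEATURE_BFLOAT16 0x00000004u

/* Armv8.6-A now implies BFloat16.  */
#define ARCH_V8_6_BITS (FEATURE_V8_6 | FEATURE_BFLOAT16)

/* Nonzero when every bit in WANTED is present in CPU.  */
static int
has_features (uint32_t cpu, uint32_t wanted)
{
  return (cpu & wanted) == wanted;
}
```

Under this model, a CPU mask built from ARCH_V8_6_BITS accepts an instruction that requires FEATURE_BFLOAT16, whereas a mask holding only FEATURE_V8_6 (the definition before this patch) does not.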

@ -1,3 +1,25 @@
+2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
+2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
+* aarch64-asm.c (aarch64_ins_reglane): Use AARCH64_OPND_QLF_S_2H
+in reglane special case.
+* aarch64-dis-2.c (aarch64_opcode_lookup_1,
+aarch64_find_next_opcode): Account for new instructions.
+* aarch64-dis.c (aarch64_ext_reglane): Use AARCH64_OPND_QLF_S_2H
+in reglane special case.
+* aarch64-opc.c (struct operand_qualifier_data): Add data for
+new AARCH64_OPND_QLF_S_2H qualifier.
+* aarch64-tbl.h (QL_BFDOT QL_BFDOT64, QL_BFDOT64I, QL_BFMMLA2,
+QL_BFCVT64, QL_BFCVTN64, QL_BFCVTN2_64): New qualifiers.
+(aarch64_feature_bfloat16, aarch64_feature_bfloat16_sve): New feature
+sets.
+(BFLOAT_SVE, BFLOAT): New feature set macros.
+(BFLOAT_SVE_INSN, BFLOAT_INSN): New macros to define BFloat16
+instructions.
+(aarch64_opcode_table): Define new instructions bfdot,
+bfmmla, bfcvt, bfcvtnt, bfdot, bfdot, bfcvtn, bfmlal[b/t]
+bfcvtn2, bfcvt.
 2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
 2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>

@ -130,6 +130,7 @@ aarch64_ins_reglane (const aarch64_operand *self, const aarch64_opnd_info *info,
   switch (info->qualifier)
     {
     case AARCH64_OPND_QLF_S_4B:
+    case AARCH64_OPND_QLF_S_2H:
       /* L:H */
       assert (reglane_index < 4);
       insert_fields (code, reglane_index, 0, 2, FLD_L, FLD_H);
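
With this change the 2H qualifier shares the 4B path, so a lane index in the range 0 to 3 is split across the L and H fields. The sketch below models that split in isolation. The function names are hypothetical, and the bit positions (H at bit 11, L at bit 21, with H as the high bit of the index) are an assumption that is consistent with the bfdot encodings in bfloat16.d rather than something stated in this hunk.

```c
#include <stdint.h>

/* Assumed field positions for the by-element Advanced SIMD form.  */
#define H_BIT 11
#define L_BIT 21

/* Place a two-bit lane index into CODE as H:L.  */
static uint32_t
insert_lane_index (uint32_t code, unsigned index)
{
  code &= ~((UINT32_C (1) << L_BIT) | (UINT32_C (1) << H_BIT));
  code |= (uint32_t) (index & 0x1) << L_BIT;
  code |= (uint32_t) ((index >> 1) & 0x1) << H_BIT;
  return code;
}

/* Recover the lane index from CODE; inverse of insert_lane_index.  */
static unsigned
extract_lane_index (uint32_t code)
{
  unsigned h = (code >> H_BIT) & 0x1;
  unsigned l = (code >> L_BIT) & 0x1;
  return (h << 1) | l;
}
```

Starting from the expected encoding of `bfdot v0.2s, v0.4h, v0.2h[0]` (0f40f000), inserting index 3 gives 0f60f800, which matches the expected encoding of `bfdot v0.2s, v0.4h, v0.2h[3]` in the test.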

@ -8660,11 +8660,22 @@ aarch64_opcode_lookup_1 (uint32_t word)
 {
   if (((word >> 16) & 0x1) == 0)
     {
-      /* 33222222222211111111110000000000
-         10987654321098765432109876543210
-         011001x0100xxxx0101xxxxxxxxxxxxx
-         fcvtnt. */
-      return 2068;
+      if (((word >> 17) & 0x1) == 0)
+        {
+          /* 33222222222211111111110000000000
+             10987654321098765432109876543210
+             011001x0100xxx00101xxxxxxxxxxxxx
+             fcvtnt. */
+          return 2068;
+        }
+      else
+        {
+          /* 33222222222211111111110000000000
+             10987654321098765432109876543210
+             011001x0100xxx10101xxxxxxxxxxxxx
+             bfcvtnt. */
+          return 2394;
+        }
     }
   else
     {
@ -9118,19 +9129,52 @@ aarch64_opcode_lookup_1 (uint32_t word)
     {
       if (((word >> 23) & 0x1) == 0)
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             x11001x0011xxxxx010xxxxxxxxxxxxx
-             st1b. */
-          return 1868;
+          if (((word >> 31) & 0x1) == 0)
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 011001x0011xxxxx010xxxxxxxxxxxxx
+                 bfdot. */
+              return 2391;
+            }
+          else
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 111001x0011xxxxx010xxxxxxxxxxxxx
+                 st1b. */
+              return 1868;
+            }
         }
       else
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             x11001x0111xxxxx010xxxxxxxxxxxxx
-             st1h. */
-          return 1889;
+          if (((word >> 31) & 0x1) == 0)
+            {
+              if (((word >> 10) & 0x1) == 0)
+                {
+                  /* 33222222222211111111110000000000
+                     10987654321098765432109876543210
+                     011001x0111xxxxx010xx0xxxxxxxxxx
+                     bfmlalb. */
+                  return 2398;
+                }
+              else
+                {
+                  /* 33222222222211111111110000000000
+                     10987654321098765432109876543210
+                     011001x0111xxxxx010xx1xxxxxxxxxx
+                     bfmlalt. */
+                  return 2397;
+                }
+            }
+          else
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 111001x0111xxxxx010xxxxxxxxxxxxx
+                 st1h. */
+              return 1889;
+            }
         }
     }
 }
@ -9169,11 +9213,44 @@ aarch64_opcode_lookup_1 (uint32_t word)
         }
       else
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             x11001x0x11xxxxx1x0xxxxxxxxxxxxx
-             st1h. */
-          return 1890;
+          if (((word >> 23) & 0x1) == 0)
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 x11001x0011xxxxx1x0xxxxxxxxxxxxx
+                 bfdot. */
+              return 2390;
+            }
+          else
+            {
+              if (((word >> 31) & 0x1) == 0)
+                {
+                  if (((word >> 10) & 0x1) == 0)
+                    {
+                      /* 33222222222211111111110000000000
+                         10987654321098765432109876543210
+                         011001x0111xxxxx1x0xx0xxxxxxxxxx
+                         bfmlalb. */
+                      return 2396;
+                    }
+                  else
+                    {
+                      /* 33222222222211111111110000000000
+                         10987654321098765432109876543210
+                         011001x0111xxxxx1x0xx1xxxxxxxxxx
+                         bfmlalt. */
+                      return 2395;
+                    }
+                }
+              else
+                {
+                  /* 33222222222211111111110000000000
+                     10987654321098765432109876543210
+                     111001x0111xxxxx1x0xxxxxxxxxxxxx
+                     st1h. */
+                  return 1890;
+                }
+            }
         }
     }
 }
@ -9529,9 +9606,9 @@ aarch64_opcode_lookup_1 (uint32_t word)
   }
 else
   {
-    if (((word >> 20) & 0x1) == 0)
+    if (((word >> 22) & 0x1) == 0)
       {
-        if (((word >> 22) & 0x1) == 0)
+        if (((word >> 20) & 0x1) == 0)
           {
             if (((word >> 23) & 0x1) == 0)
               {
@ -9551,28 +9628,6 @@ aarch64_opcode_lookup_1 (uint32_t word)
           }
       }
     else
-      {
-        if (((word >> 23) & 0x1) == 0)
-          {
-            /* 33222222222211111111110000000000
-               10987654321098765432109876543210
-               x11001x00110xxxx111xxxxxxxxxxxxx
-               st1b. */
-            return 1874;
-          }
-        else
-          {
-            /* 33222222222211111111110000000000
-               10987654321098765432109876543210
-               x11001x01110xxxx111xxxxxxxxxxxxx
-               st1h. */
-            return 1895;
-          }
-      }
-  }
-else
-  {
-    if (((word >> 22) & 0x1) == 0)
       {
         if (((word >> 23) & 0x1) == 0)
           {
@ -9591,15 +9646,48 @@ aarch64_opcode_lookup_1 (uint32_t word)
             return 1913;
           }
       }
-    else
+  }
+else
+  {
+    if (((word >> 23) & 0x1) == 0)
       {
-        if (((word >> 23) & 0x1) == 0)
+        if (((word >> 31) & 0x1) == 0)
           {
             /* 33222222222211111111110000000000
                10987654321098765432109876543210
-               x11001x00111xxxx111xxxxxxxxxxxxx
-               st4b. */
-            return 1925;
+               011001x0011xxxxx111xxxxxxxxxxxxx
+               bfmmla. */
+            return 2392;
+          }
+        else
+          {
+            if (((word >> 20) & 0x1) == 0)
+              {
+                /* 33222222222211111111110000000000
+                   10987654321098765432109876543210
+                   111001x00110xxxx111xxxxxxxxxxxxx
+                   st1b. */
+                return 1874;
+              }
+            else
+              {
+                /* 33222222222211111111110000000000
+                   10987654321098765432109876543210
+                   111001x00111xxxx111xxxxxxxxxxxxx
+                   st4b. */
+                return 1925;
+              }
+          }
+      }
+    else
+      {
+        if (((word >> 20) & 0x1) == 0)
+          {
+            /* 33222222222211111111110000000000
+               10987654321098765432109876543210
+               x11001x01110xxxx111xxxxxxxxxxxxx
+               st1h. */
+            return 1895;
           }
         else
           {
@ -13993,11 +14081,22 @@ aarch64_opcode_lookup_1 (uint32_t word)
 {
   if (((word >> 22) & 0x1) == 0)
     {
-      /* 33222222222211111111110000000000
-         10987654321098765432109876543210
-         011001x1x0001x10101xxxxxxxxxxxxx
-         fcvtx. */
-      return 2070;
+      if (((word >> 23) & 0x1) == 0)
+        {
+          /* 33222222222211111111110000000000
+             10987654321098765432109876543210
+             011001x100001x10101xxxxxxxxxxxxx
+             fcvtx. */
+          return 2070;
+        }
+      else
+        {
+          /* 33222222222211111111110000000000
+             10987654321098765432109876543210
+             011001x110001x10101xxxxxxxxxxxxx
+             bfcvt. */
+          return 2393;
+        }
     }
   else
     {
@ -16503,11 +16602,55 @@ aarch64_opcode_lookup_1 (uint32_t word)
         }
       else
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             xx101110xx0xxxxx1x1xx1xxxxxxxxxx
-             fcadd. */
-          return 373;
+          if (((word >> 11) & 0x1) == 0)
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 xx101110xx0xxxxx1x1x01xxxxxxxxxx
+                 fcadd. */
+              return 373;
+            }
+          else
+            {
+              if (((word >> 12) & 0x1) == 0)
+                {
+                  /* 33222222222211111111110000000000
+                     10987654321098765432109876543210
+                     xx101110xx0xxxxx1x1011xxxxxxxxxx
+                     bfmmla. */
+                  return 2401;
+                }
+              else
+                {
+                  if (((word >> 23) & 0x1) == 0)
+                    {
+                      /* 33222222222211111111110000000000
+                         10987654321098765432109876543210
+                         xx1011100x0xxxxx1x1111xxxxxxxxxx
+                         bfdot. */
+                      return 2399;
+                    }
+                  else
+                    {
+                      if (((word >> 30) & 0x1) == 0)
+                        {
+                          /* 33222222222211111111110000000000
+                             10987654321098765432109876543210
+                             x01011101x0xxxxx1x1111xxxxxxxxxx
+                             bfmlalb. */
+                          return 2406;
+                        }
+                      else
+                        {
+                          /* 33222222222211111111110000000000
+                             10987654321098765432109876543210
+                             x11011101x0xxxxx1x1111xxxxxxxxxx
+                             bfmlalt. */
+                          return 2405;
+                        }
+                    }
+                }
+            }
         }
     }
 }
@ -17060,21 +17203,43 @@ aarch64_opcode_lookup_1 (uint32_t word)
     }
   else
     {
-      if (((word >> 30) & 0x1) == 0)
+      if (((word >> 23) & 0x1) == 0)
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             00001110xx1xxxx1011010xxxxxxxxxx
-             fcvtn. */
-          return 178;
+          if (((word >> 30) & 0x1) == 0)
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 000011100x1xxxx1011010xxxxxxxxxx
+                 fcvtn. */
+              return 178;
+            }
+          else
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 010011100x1xxxx1011010xxxxxxxxxx
+                 fcvtn2. */
+              return 179;
+            }
         }
       else
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             01001110xx1xxxx1011010xxxxxxxxxx
-             fcvtn2. */
-          return 179;
+          if (((word >> 30) & 0x1) == 0)
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 000011101x1xxxx1011010xxxxxxxxxx
+                 bfcvtn. */
+              return 2402;
+            }
+          else
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 010011101x1xxxx1011010xxxxxxxxxx
+                 bfcvtn2. */
+              return 2403;
+            }
         }
     }
 }
@ -22165,11 +22330,44 @@ aarch64_opcode_lookup_1 (uint32_t word)
         }
       else
         {
-          /* 33222222222211111111110000000000
-             10987654321098765432109876543210
-             xxx01111xxxxxxxx1111x0xxxxxxxxxx
-             sqrdmlsh. */
-          return 130;
+          if (((word >> 29) & 0x1) == 0)
+            {
+              if (((word >> 23) & 0x1) == 0)
+                {
+                  /* 33222222222211111111110000000000
+                     10987654321098765432109876543210
+                     xx0011110xxxxxxx1111x0xxxxxxxxxx
+                     bfdot. */
+                  return 2400;
+                }
+              else
+                {
+                  if (((word >> 30) & 0x1) == 0)
+                    {
+                      /* 33222222222211111111110000000000
+                         10987654321098765432109876543210
+                         x00011111xxxxxxx1111x0xxxxxxxxxx
+                         bfmlalb. */
+                      return 2408;
+                    }
+                  else
+                    {
+                      /* 33222222222211111111110000000000
+                         10987654321098765432109876543210
+                         x10011111xxxxxxx1111x0xxxxxxxxxx
+                         bfmlalt. */
+                      return 2407;
+                    }
+                }
+            }
+          else
+            {
+              /* 33222222222211111111110000000000
+                 10987654321098765432109876543210
+                 xx101111xxxxxxxx1111x0xxxxxxxxxx
+                 sqrdmlsh. */
+              return 130;
+            }
         }
     }
 }
@@ -22787,6 +22985,8 @@ aarch64_find_next_opcode (const aarch64_opcode *opcode)
     case 823: return NULL; /* fsqrt --> NULL. */
     case 831: value = 832; break; /* frintz --> frintz. */
     case 832: return NULL; /* frintz --> NULL. */
+    case 824: value = 2404; break; /* fcvt --> bfcvt. */
+    case 2404: return NULL; /* bfcvt --> NULL. */
     case 833: value = 834; break; /* frinta --> frinta. */
     case 834: return NULL; /* frinta --> NULL. */
     case 835: value = 836; break; /* frintx --> frintx. */
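
Review note: the two new cases link table index 824 (fcvt) to 2404 (bfcvt) and then end the chain. The sketch below models only that link, as an illustration of how a caller could walk such a chain. `next_opcode_index` and `chain_length` are hypothetical names, and -1 stands in for the NULL that the real function returns.

```c
/* Hypothetical stand-in for the generated switch: returns the next table
   index in the chain, or -1 where the real code returns NULL.  */
static int
next_opcode_index (int key)
{
  switch (key)
    {
    case 824: return 2404;	/* fcvt --> bfcvt.  */
    case 2404: return -1;	/* bfcvt --> NULL.  */
    default: return -1;		/* Other chains are not modelled here.  */
    }
}

/* Count how many table entries are visited when starting from START.  */
static int
chain_length (int start)
{
  int n = 0;
  for (int key = start; key != -1; key = next_opcode_index (key))
    n++;
  return n;
}
```

Starting from fcvt, the walk visits two entries (fcvt, then bfcvt), which suggests the disassembler tries the bfcvt entry only after the fcvt entry has been considered.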

@@ -348,6 +348,7 @@ aarch64_ext_reglane (const aarch64_operand *self, aarch64_opnd_info *info,
       switch (info->qualifier)
         {
         case AARCH64_OPND_QLF_S_4B:
+        case AARCH64_OPND_QLF_S_2H:
           /* L:H */
           info->reglane.index = extract_fields (code, 0, 2, FLD_H, FLD_L);
           info->reglane.regno &= 0x1f;

@@ -712,6 +712,7 @@ struct operand_qualifier_data aarch64_opnd_qualifiers[] =
   {8, 1, 0x3, "d", OQK_OPD_VARIANT},
   {16, 1, 0x4, "q", OQK_OPD_VARIANT},
   {4, 1, 0x0, "4b", OQK_OPD_VARIANT},
+  {4, 1, 0x0, "2h", OQK_OPD_VARIANT},
   {1, 4, 0x0, "4b", OQK_OPD_VARIANT},
   {1, 8, 0x0, "8b", OQK_OPD_VARIANT},

@@ -2257,6 +2257,50 @@
 { \
   QLF2(X, NIL), \
 }
+/* e.g. BFDOT <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb> */
+#define QL_BFDOT64 \
+{ \
+  QLF3(V_2S, V_4H, V_4H),\
+  QLF3(V_4S, V_8H, V_8H),\
+}
+/* e.g. BFDOT <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.2H[<index>] */
+#define QL_BFDOT64I \
+{ \
+  QLF3(V_2S, V_4H, S_2H),\
+  QLF3(V_4S, V_8H, S_2H),\
+}
+/* e.g. BFMMLA <Vd>.4s, <Vn>.8h, <Vm>.8h */
+#define QL_BFMMLA \
+{ \
+  QLF3(V_4S, V_8H, V_8H),\
+}
+/* e.g. BFCVT <Hd>, <Sn> */
+#define QL_BFCVT64 \
+{ \
+  QLF2(S_H,S_S), \
+}
+/* e.g. BFCVTN <Vd>.4h, <Vn>.4s */
+#define QL_BFCVTN64 \
+{ \
+  QLF2(V_4H,V_4S), \
+}
+/* e.g. BFCVTN2 <Vd>.8h, <Vn>.4s */
+#define QL_BFCVTN2_64 \
+{ \
+  QLF2(V_8H,V_4S), \
+}
+/* e.g. BFMLALB <Vd>.4s, <Vn>.8h, <Vm>.H[<index>] */
+#define QL_V3BFML4S \
+{ \
+  QLF3(V_4S, V_8H, S_H), \
+}
 /* Opcode table. */
@@ -2331,6 +2375,10 @@ static const aarch64_feature_set aarch64_feature_bti =
   AARCH64_FEATURE (AARCH64_FEATURE_BTI, 0);
 static const aarch64_feature_set aarch64_feature_memtag =
   AARCH64_FEATURE (AARCH64_FEATURE_V8_5 | AARCH64_FEATURE_MEMTAG, 0);
+static const aarch64_feature_set aarch64_feature_bfloat16 =
+  AARCH64_FEATURE (AARCH64_FEATURE_BFLOAT16, 0);
+static const aarch64_feature_set aarch64_feature_bfloat16_sve =
+  AARCH64_FEATURE (AARCH64_FEATURE_BFLOAT16 | AARCH64_FEATURE_SVE, 0);
 static const aarch64_feature_set aarch64_feature_tme =
   AARCH64_FEATURE (AARCH64_FEATURE_TME, 0);
 static const aarch64_feature_set aarch64_feature_sve2 =
@@ -2387,6 +2435,8 @@ static const aarch64_feature_set aarch64_feature_v8_6 =
 #define SVE2_SM4 &aarch64_feature_sve2sm4
 #define SVE2_BITPERM &aarch64_feature_sve2bitperm
 #define ARMV8_6 &aarch64_feature_v8_6
+#define BFLOAT16_SVE &aarch64_feature_bfloat16_sve
+#define BFLOAT16 &aarch64_feature_bfloat16
 #define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
   { NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2477,6 +2527,13 @@ static const aarch64_feature_set aarch64_feature_v8_6 =
 #define SVE2BITPERM_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, SVE2_BITPERM, OPS, QUALS, \
     FLAGS | F_STRICT, 0, TIED, NULL }
+#define BFLOAT16_SVE_INSN(NAME,OPCODE,MASK,CLASS,OPS,QUALS,FLAGS) \
+  { NAME, OPCODE, MASK, CLASS, 0, BFLOAT16_SVE, OPS, QUALS, FLAGS, 0, 0, NULL }
+#define BFLOAT16_SVE_INSNC(NAME,OPCODE,MASK,CLASS,OPS,QUALS,FLAGS, CONSTRAINTS, TIED) \
+  { NAME, OPCODE, MASK, CLASS, 0, BFLOAT16_SVE, OPS, QUALS, FLAGS | F_STRICT, \
+    CONSTRAINTS, TIED, NULL }
+#define BFLOAT16_INSN(NAME,OPCODE,MASK,CLASS,OPS,QUALS,FLAGS) \
+  { NAME, OPCODE, MASK, CLASS, 0, BFLOAT16, OPS, QUALS, FLAGS, 0, 0, NULL }
 struct aarch64_opcode aarch64_opcode_table[] =
 {
@@ -4974,6 +5031,29 @@ struct aarch64_opcode aarch64_opcode_table[] =
   V8_4_INSN ("ldapursw", 0x99800000, 0xffe00c00, ldst_unscaled, OP2 (Rt, ADDR_OFFSET), QL_STLX, 0),
   V8_4_INSN ("stlur", 0xd9000000, 0xffe00c00, ldst_unscaled, OP2 (Rt, ADDR_OFFSET), QL_STLX, 0),
   V8_4_INSN ("ldapur", 0xd9400000, 0xffe00c00, ldst_unscaled, OP2 (Rt, ADDR_OFFSET), QL_STLX, 0),
+  /* BFloat instructions. */
+  BFLOAT16_SVE_INSNC ("bfdot", 0x64608000, 0xffe0fc00, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfdot", 0x64604000, 0xffe0fc00, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_INDEX), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfmmla", 0x6460e400, 0xffe0fc00, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfcvt", 0x658aa000, 0xffffe000, sve_misc, OP3 (SVE_Zd, SVE_Pg3, SVE_Zn), OP_SVE_HMS, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfcvtnt", 0x648aa000, 0xffffe000, sve_misc, OP3 (SVE_Zd, SVE_Pg3, SVE_Zn), OP_SVE_HMS, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfmlalt", 0x64e08400, 0xffe0fc00, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfmlalb", 0x64e08000, 0xffe0fc00, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfmlalt", 0x64e04400, 0xffe0f400, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  BFLOAT16_SVE_INSNC ("bfmlalb", 0x64e04000, 0xffe0f400, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
+  /* BFloat Advanced SIMD instructions. */
+  BFLOAT16_INSN ("bfdot", 0x2e40fc00, 0xbfe0fc00, bfloat16, OP3 (Vd, Vn, Vm), QL_BFDOT64, F_SIZEQ),
+  /* Using dotproduct as iclass to treat instruction similar to udot. */
+  BFLOAT16_INSN ("bfdot", 0x0f40f000, 0xbfc0f400, dotproduct, OP3 (Vd, Vn, Em), QL_BFDOT64I, F_SIZEQ),
+  BFLOAT16_INSN ("bfmmla", 0x6e40ec00, 0xffe0fc00, bfloat16, OP3 (Vd, Vn, Vm), QL_BFMMLA, F_SIZEQ),
+  BFLOAT16_INSN ("bfcvtn", 0x0ea16800, 0xfffffc00, bfloat16, OP2 (Vd, Vn), QL_BFCVTN64, 0),
+  BFLOAT16_INSN ("bfcvtn2", 0x4ea16800, 0xfffffc00, bfloat16, OP2 (Vd, Vn), QL_BFCVTN2_64, 0),
+  BFLOAT16_INSN ("bfcvt", 0x1e634000, 0xfffffc00, bfloat16, OP2 (Fd, Fn), QL_BFCVT64, 0),
+  BFLOAT16_INSN ("bfmlalt", 0x6ec0fc00, 0xffe0fc00, bfloat16, OP3 (Vd, Vn, Vm), QL_BFMMLA, 0),
+  BFLOAT16_INSN ("bfmlalb", 0x2ec0fc00, 0xffe0fc00, bfloat16, OP3 (Vd, Vn, Vm), QL_BFMMLA, 0),
+  BFLOAT16_INSN ("bfmlalt", 0x4fc0f000, 0xffc0f400, bfloat16, OP3 (Vd, Vn, Em16), QL_V3BFML4S, 0),
+  BFLOAT16_INSN ("bfmlalb", 0x0fc0f000, 0xffc0f400, bfloat16, OP3 (Vd, Vn, Em16), QL_V3BFML4S, 0),
   {0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
 };
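
Review note: each table entry pairs an opcode value with a mask, and an instruction word belongs to an entry when `(word & mask) == opcode`. The sketch below checks that rule against five of the Advanced SIMD/FP entries added above. `struct bf16_entry` and `match_bf16` are hypothetical names used only for illustration; the real lookup goes through the generated decision tree and also validates operands and qualifiers, which this sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an opcode table entry.  */
struct bf16_entry
{
  const char *name;
  uint32_t opcode;
  uint32_t mask;
};

/* Opcode and mask values copied from the BFLOAT16_INSN entries above.  */
static const struct bf16_entry bf16_entries[] =
{
  { "bfdot",   0x2e40fc00, 0xbfe0fc00 },
  { "bfmmla",  0x6e40ec00, 0xffe0fc00 },
  { "bfcvtn",  0x0ea16800, 0xfffffc00 },
  { "bfcvtn2", 0x4ea16800, 0xfffffc00 },
  { "bfcvt",   0x1e634000, 0xfffffc00 },
};

/* Return the index of the first entry whose fixed bits match WORD,
   or -1 if none of the listed entries match.  */
static int
match_bf16 (uint32_t word)
{
  size_t n = sizeof (bf16_entries) / sizeof (bf16_entries[0]);
  for (size_t i = 0; i < n; i++)
    if ((word & bf16_entries[i].mask) == bf16_entries[i].opcode)
      return (int) i;
  return -1;
}
```

The bfdot mask 0xbfe0fc00 leaves bit 30 clear, so both the 64-bit and 128-bit forms match the same entry, whereas bfcvtn and bfcvtn2 fix bit 30 and are told apart by it.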