x86: Add -O[2|s] assembler command-line options

On x86, some instructions have alternate shorter encodings:

1. When the upper 32 bits of destination registers of

andq $imm31, %r64
testq $imm31, %r64
xorq %r64, %r64
subq %r64, %r64

known to be zero, we can encode them without the REX_W bit:

andl $imm31, %r32
testl $imm31, %r32
xorl %r32, %r32
subl %r32, %r32

This optimization is enabled with -O, -O2 and -Os.
2. Since 0xb0 mov with 32-bit destination registers zero-extends 32-bit
immediate to 64-bit destination register, we can use it to encode 64-bit
mov with 32-bit immediates.  This optimization is enabled with -O, -O2
and -Os.
3. Since the upper bits of destination registers of VEX128 and EVEX128
instructions are extended to zero, if all bits of destination registers
of AVX256 or AVX512 instructions are zero, we can use VEX128 or EVEX128
encoding to encode AVX256 or AVX512 instructions.  When 2 source
registers are identical, AVX256 and AVX512 andn and xor instructions:

VOP %reg, %reg, %dest_reg

can be encoded with

VOP128 %reg, %reg, %dest_reg

This optimization is enabled with -O2 and -Os.
4. 16-bit, 32-bit and 64-bit register tests with immediate may be
encoded as 8-bit register test with immediate.  This optimization is
enabled with -Os.

This patch does:

1. Add {nooptimize} pseudo prefix to disable instruction size
optimization.
2. Add optimize to i386_opcode_modifier to tell assembler that encoding
of an instruction may be optimized.

gas/

	PR gas/22871
	* NEWS: Mention -O[2|s].
	* config/tc-i386.c (_i386_insn): Add no_optimize.
	(optimize): New.
	(optimize_for_space): Likewise.
	(fits_in_imm7): New function.
	(fits_in_imm31): Likewise.
	(optimize_encoding): Likewise.
	(md_assemble): Call optimize_encoding to optimize encoding.
	(parse_insn): Handle {nooptimize}.
	(md_shortopts): Append "O::".
	(md_parse_option): Handle -On.
	* doc/c-i386.texi: Document -O0, -O, -O1, -O2 and -Os as well
	as {nooptimize}.
	* testsuite/gas/cfi/cfi-x86_64.d: Pass -O0 to assembler.
	* testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d: Likewise.
	* testsuite/gas/i386/i386.exp: Run optimize-1, optimize-2,
	optimize-3, x86-64-optimize-1, x86-64-optimize-2,
	x86-64-optimize-3 and x86-64-optimize-4.
	* testsuite/gas/i386/optimize-1.d: New file.
	* testsuite/gas/i386/optimize-1.s: Likewise.
	* testsuite/gas/i386/optimize-2.d: Likewise.
	* testsuite/gas/i386/optimize-2.s: Likewise.
	* testsuite/gas/i386/optimize-3.d: Likewise.
	* testsuite/gas/i386/optimize-3.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-1.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-1.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-3.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-3.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-4.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-4.s: Likewise.

opcodes/

	PR gas/22871
	* i386-gen.c (opcode_modifiers): Add Optimize.
	* i386-opc.h (Optimize): New enum.
	(i386_opcode_modifier): Add optimize.
	* i386-opc.tbl: Add "Optimize" to "mov $imm, reg",
	"sub reg, reg/mem", "test $imm, acc", "test $imm, reg/mem",
	"and $imm, acc", "and $imm, reg/mem", "xor reg, reg/mem",
	"movq $imm, reg" and AVX256 and AVX512 versions of vandnps,
	vandnpd, vpandn, vpandnd, vpandnq, vxorps, vxorpd, vpxor,
	vpxord and vpxorq.
	* i386-tbl.h: Regenerated.
This commit is contained in:
H.J. Lu
2018-02-27 07:36:33 -08:00
parent bc7c0509f2
commit b6f8c7c452
26 changed files with 6161 additions and 5356 deletions

View File

@ -1,3 +1,39 @@
2018-02-27 H.J. Lu <hongjiu.lu@intel.com>
PR gas/22871
* NEWS: Mention -O[2|s].
* config/tc-i386.c (_i386_insn): Add no_optimize.
(optimize): New.
(optimize_for_space): Likewise.
(fits_in_imm7): New function.
(fits_in_imm31): Likewise.
(optimize_encoding): Likewise.
(md_assemble): Call optimize_encoding to optimize encoding.
(parse_insn): Handle {nooptimize}.
(md_shortopts): Append "O::".
(md_parse_option): Handle -On.
* doc/c-i386.texi: Document -O0, -O, -O1, -O2 and -Os as well
as {nooptimize}.
* testsuite/gas/cfi/cfi-x86_64.d: Pass -O0 to assembler.
* testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d: Likewise.
* testsuite/gas/i386/i386.exp: Run optimize-1, optimize-2,
optimize-3, x86-64-optimize-1, x86-64-optimize-2,
x86-64-optimize-3 and x86-64-optimize-4.
* testsuite/gas/i386/optimize-1.d: New file.
* testsuite/gas/i386/optimize-1.s: Likewise.
* testsuite/gas/i386/optimize-2.d: Likewise.
* testsuite/gas/i386/optimize-2.s: Likewise.
* testsuite/gas/i386/optimize-3.d: Likewise.
* testsuite/gas/i386/optimize-3.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-1.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-1.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-3.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-3.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-4.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-4.s: Likewise.
2018-02-27 Nick Clifton <nickc@redhat.com>
* po/ru.po: Updated Russian translation.

View File

@ -1,5 +1,8 @@
-*- text -*-
* Add -O[2|s] command-line options to x86 assembler to enable alternate
shorter instruction encoding.
* Add support for .nop directive. It is currently supported only for
x86 targets.

View File

@ -372,6 +372,9 @@ struct _i386_insn
/* Prefer the REX byte in encoding. */
bfd_boolean rex_encoding;
/* Disable instruction size optimization. */
bfd_boolean no_optimize;
/* How to encode vector instructions. */
enum
{
@ -600,6 +603,22 @@ static enum check_kind
}
sse_check, operand_check = check_warning;
/* Optimization:
1. Clear the REX_W bit with register operand if possible.
2. Above plus use 128bit vector instruction to clear the full vector
register.
*/
static int optimize = 0;
/* Optimization:
1. Clear the REX_W bit with register operand if possible.
2. Above plus use 128bit vector instruction to clear the full vector
register.
3. Above plus optimize "test{q,l,w} $imm8,%r{64,32,16}" to
"testb $imm7,%r8".
*/
static int optimize_for_space = 0;
/* Register prefix used for error message. */
static const char *register_prefix = "%";
@ -2185,6 +2204,18 @@ fits_in_imm4 (offsetT num)
return (num & 0xf) == num;
}
static INLINE int
fits_in_imm7 (offsetT num)
{
return (num & 0x7f) == num;
}
static INLINE int
fits_in_imm31 (offsetT num)
{
return (num & 0x7fffffff) == num;
}
static i386_operand_type
smallest_imm_type (offsetT num)
{
@ -3712,6 +3743,179 @@ check_hle (void)
}
}
/* Try the shortest encoding by shortening operand size. */
static void
optimize_encoding (void)
{
int j;
if (optimize_for_space
&& i.reg_operands == 1
&& i.imm_operands == 1
&& !i.types[1].bitfield.byte
&& i.op[0].imms->X_op == O_constant
&& fits_in_imm7 (i.op[0].imms->X_add_number)
&& ((i.tm.base_opcode == 0xa8
&& i.tm.extension_opcode == None)
|| (i.tm.base_opcode == 0xf6
&& i.tm.extension_opcode == 0x0)))
{
/* Optimize: -Os:
test $imm7, %r64/%r32/%r16 -> test $imm7, %r8
*/
unsigned int base_regnum = i.op[1].regs->reg_num;
if (flag_code == CODE_64BIT || base_regnum < 4)
{
i.types[1].bitfield.byte = 1;
/* Ignore the suffix. */
i.suffix = 0;
if (base_regnum >= 4
&& !(i.op[1].regs->reg_flags & RegRex))
{
/* Handle SP, BP, SI and DI registers. */
if (i.types[1].bitfield.word)
j = 16;
else if (i.types[1].bitfield.dword)
j = 32;
else
j = 48;
i.op[1].regs -= j;
}
}
}
else if (flag_code == CODE_64BIT
&& ((i.reg_operands == 1
&& i.imm_operands == 1
&& i.op[0].imms->X_op == O_constant
&& ((i.tm.base_opcode == 0xb0
&& i.tm.extension_opcode == None
&& fits_in_unsigned_long (i.op[0].imms->X_add_number))
|| (fits_in_imm31 (i.op[0].imms->X_add_number)
&& (((i.tm.base_opcode == 0x24
|| i.tm.base_opcode == 0xa8)
&& i.tm.extension_opcode == None)
|| (i.tm.base_opcode == 0x80
&& i.tm.extension_opcode == 0x4)
|| ((i.tm.base_opcode == 0xf6
|| i.tm.base_opcode == 0xc6)
&& i.tm.extension_opcode == 0x0)))))
|| (i.reg_operands == 2
&& i.op[0].regs == i.op[1].regs
&& ((i.tm.base_opcode == 0x30
|| i.tm.base_opcode == 0x28)
&& i.tm.extension_opcode == None)))
&& i.types[1].bitfield.qword)
{
/* Optimize: -O:
andq $imm31, %r64 -> andl $imm31, %r32
testq $imm31, %r64 -> testl $imm31, %r32
xorq %r64, %r64 -> xorl %r32, %r32
subq %r64, %r64 -> subl %r32, %r32
movq $imm31, %r64 -> movl $imm31, %r32
movq $imm32, %r64 -> movl $imm32, %r32
*/
i.tm.opcode_modifier.norex64 = 1;
if (i.tm.base_opcode == 0xb0 || i.tm.base_opcode == 0xc6)
{
/* Handle
movq $imm31, %r64 -> movl $imm31, %r32
movq $imm32, %r64 -> movl $imm32, %r32
*/
i.tm.operand_types[0].bitfield.imm32 = 1;
i.tm.operand_types[0].bitfield.imm32s = 0;
i.tm.operand_types[0].bitfield.imm64 = 0;
i.types[0].bitfield.imm32 = 1;
i.types[0].bitfield.imm32s = 0;
i.types[0].bitfield.imm64 = 0;
i.types[1].bitfield.dword = 1;
i.types[1].bitfield.qword = 0;
if (i.tm.base_opcode == 0xc6)
{
/* Handle
movq $imm31, %r64 -> movl $imm31, %r32
*/
i.tm.base_opcode = 0xb0;
i.tm.extension_opcode = None;
i.tm.opcode_modifier.shortform = 1;
i.tm.opcode_modifier.modrm = 0;
}
}
}
else if (optimize > 1
&& i.reg_operands == 3
&& i.op[0].regs == i.op[1].regs
&& !i.types[2].bitfield.xmmword
&& (i.tm.opcode_modifier.vex
|| (!i.mask
&& !i.rounding
&& i.tm.opcode_modifier.evex
&& cpu_arch_flags.bitfield.cpuavx512vl))
&& ((i.tm.base_opcode == 0x55
|| i.tm.base_opcode == 0x6655
|| i.tm.base_opcode == 0x66df
|| i.tm.base_opcode == 0x57
|| i.tm.base_opcode == 0x6657
|| i.tm.base_opcode == 0x66ef)
&& i.tm.extension_opcode == None))
{
/* Optimize: -O2:
VOP, one of vandnps, vandnpd, vxorps and vxorpd:
EVEX VOP %zmmM, %zmmM, %zmmN
-> VEX VOP %xmmM, %xmmM, %xmmN (M and N < 16)
-> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
EVEX VOP %ymmM, %ymmM, %ymmN
-> VEX VOP %xmmM, %xmmM, %xmmN (M and N < 16)
-> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
VEX VOP %ymmM, %ymmM, %ymmN
-> VEX VOP %xmmM, %xmmM, %xmmN
VOP, one of vpandn and vpxor:
VEX VOP %ymmM, %ymmM, %ymmN
-> VEX VOP %xmmM, %xmmM, %xmmN
VOP, one of vpandnd and vpandnq:
EVEX VOP %zmmM, %zmmM, %zmmN
-> VEX vpandn %xmmM, %xmmM, %xmmN (M and N < 16)
-> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
EVEX VOP %ymmM, %ymmM, %ymmN
-> VEX vpandn %xmmM, %xmmM, %xmmN (M and N < 16)
-> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
VOP, one of vpxord and vpxorq:
EVEX VOP %zmmM, %zmmM, %zmmN
-> VEX vpxor %xmmM, %xmmM, %xmmN (M and N < 16)
-> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
EVEX VOP %ymmM, %ymmM, %ymmN
-> VEX vpxor %xmmM, %xmmM, %xmmN (M and N < 16)
-> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
*/
if (i.tm.opcode_modifier.evex)
{
/* If only lower 16 vector registers are used, we can use
VEX encoding. */
for (j = 0; j < 3; j++)
if (register_number (i.op[j].regs) > 15)
break;
if (j < 3)
i.tm.opcode_modifier.evex = EVEX128;
else
{
i.tm.opcode_modifier.vex = VEX128;
i.tm.opcode_modifier.vexw = VEXW0;
i.tm.opcode_modifier.evex = 0;
}
}
else
i.tm.opcode_modifier.vex = VEX128;
if (i.tm.opcode_modifier.vex)
for (j = 0; j < 3; j++)
{
i.types[j].bitfield.xmmword = 1;
i.types[j].bitfield.ymmword = 0;
}
}
}
/* This is the guts of the machine-dependent assembler. LINE points to a
machine dependent instruction. This function is supposed to emit
the frags/bytes it assembles to. */
@ -3877,6 +4081,9 @@ md_assemble (char *line)
i.disp_operands = 0;
}
if (optimize && !i.no_optimize && i.tm.opcode_modifier.optimize)
optimize_encoding ();
if (!process_suffix ())
return;
@ -4131,6 +4338,10 @@ parse_insn (char *line, char *mnemonic)
/* {rex} */
i.rex_encoding = TRUE;
break;
case 0x8:
/* {nooptimize} */
i.no_optimize = TRUE;
break;
default:
abort ();
}
@ -10074,9 +10285,9 @@ md_operand (expressionS *e)
#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
const char *md_shortopts = "kVQ:sqn";
const char *md_shortopts = "kVQ:sqnO::";
#else
const char *md_shortopts = "qn";
const char *md_shortopts = "qnO::";
#endif
#define OPTION_32 (OPTION_MD_BASE + 0)
@ -10513,6 +10724,27 @@ md_parse_option (int c, const char *arg)
intel64 = 1;
break;
case 'O':
if (arg == NULL)
{
optimize = 1;
/* Turn off -Os. */
optimize_for_space = 0;
}
else if (*arg == 's')
{
optimize_for_space = 1;
/* Turn on all encoding optimizations. */
optimize = -1;
}
else
{
optimize = atoi (arg);
/* Turn off -Os. */
optimize_for_space = 0;
}
break;
default:
return 0;
}

View File

@ -411,6 +411,28 @@ with 01, 10 and 11 RC bits, respectively.
This option specifies that the assembler should accept only AMD64 or
Intel64 ISA in 64-bit mode. The default is to accept both.
@cindex @samp{-O0} option, i386
@cindex @samp{-O0} option, x86-64
@cindex @samp{-O} option, i386
@cindex @samp{-O} option, x86-64
@cindex @samp{-O1} option, i386
@cindex @samp{-O1} option, x86-64
@cindex @samp{-O2} option, i386
@cindex @samp{-O2} option, x86-64
@cindex @samp{-Os} option, i386
@cindex @samp{-Os} option, x86-64
@item -O0 | -O | -O1 | -O2 | -Os
Optimize instruction encoding with smaller instruction size. @samp{-O}
and @samp{-O1} encode 64-bit register load instructions with 64-bit
immediate as 32-bit register load instructions with 31-bit or 32-bits
immediates and encode 64-bit register clearing instructions with 32-bit
register clearing instructions. @samp{-O2} includes @samp{-O1}
optimization plus encodes 256-bit and 512-bit vector register clearing
instructions with 128-bit vector register clearing instructions.
@samp{-Os} includes @samp{-O2} optimization plus encodes 16-bit, 32-bit
and 64-bit register tests with immediate as 8-bit register test with
immediate. @samp{-O0} turns off this optimization.
@end table
@c man end
@ -647,6 +669,9 @@ Different encoding options can be specified via pseudo prefixes:
@samp{@{rex@}} -- prefer REX prefix for integer and legacy vector
instructions (x86-64 only). Note that this differs from the @samp{rex}
prefix which generates REX prefix unconditionally.
@item
@samp{@{nooptimize@}} -- disable instruction size optimization.
@end itemize
@cindex conversion instructions, i386

View File

@ -1,3 +1,4 @@
#as: -O0
#objdump: -Wf
#name: CFI on x86-64
#...

View File

@ -433,6 +433,9 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_32_check]]
run_list_test "inval-pseudo" "-al"
run_dump_test "nop-1"
run_dump_test "nop-2"
run_dump_test "optimize-1"
run_dump_test "optimize-2"
run_dump_test "optimize-3"
# These tests require support for 8 and 16 bit relocs,
# so we only run them for ELF and COFF targets.
@ -913,6 +916,10 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
run_dump_test "x86-64-movd-intel"
run_dump_test "x86-64-nop-1"
run_dump_test "x86-64-nop-2"
run_dump_test "x86-64-optimize-1"
run_dump_test "x86-64-optimize-2"
run_dump_test "x86-64-optimize-3"
run_dump_test "x86-64-optimize-4"
if { ![istarget "*-*-aix*"]
&& ![istarget "*-*-beos*"]

View File

@ -1,4 +1,5 @@
#source: ../../../cfi/cfi-x86_64.s
#as: -O0
#readelf: -wf
#name: CFI on x86-64
Contents of the .eh_frame section:

View File

@ -0,0 +1,45 @@
#as: -O2
#objdump: -drw
#name: optimized encoding 1 with -O2
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: 62 f1 f5 4f 55 e9 vandnpd %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 f5 af 55 e9 vandnpd %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f1 55 e9 vandnpd %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 55 e9 vandnpd %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 74 4f 55 e9 vandnps %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 74 af 55 e9 vandnps %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f0 55 e9 vandnps %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f0 55 e9 vandnps %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 df e9 vpandn %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 75 4f df e9 vpandnd %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 75 af df e9 vpandnd %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f1 df e9 vpandn %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 df e9 vpandn %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 f5 4f df e9 vpandnq %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 f5 af df e9 vpandnq %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f1 df e9 vpandn %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 df e9 vpandn %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 f5 4f 57 e9 vxorpd %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 f5 af 57 e9 vxorpd %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f1 57 e9 vxorpd %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 57 e9 vxorpd %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 74 4f 57 e9 vxorps %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 74 af 57 e9 vxorps %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f0 57 e9 vxorps %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f0 57 e9 vxorps %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 ef e9 vpxor %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 75 4f ef e9 vpxord %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 75 af ef e9 vpxord %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f1 ef e9 vpxor %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 ef e9 vpxor %xmm1,%xmm1,%xmm5
+[a-f0-9]+: 62 f1 f5 4f ef e9 vpxorq %zmm1,%zmm1,%zmm5\{%k7\}
+[a-f0-9]+: 62 f1 f5 af ef e9 vpxorq %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+[a-f0-9]+: c5 f1 ef e9 vpxor %xmm1,%xmm1,%xmm5
+[a-f0-9]+: c5 f1 ef e9 vpxor %xmm1,%xmm1,%xmm5
#pass

View File

@ -0,0 +1,48 @@
# Check instructions with optimized encoding
.allow_index_reg
.text
_start:
vandnpd %zmm1, %zmm1, %zmm5{%k7}
vandnpd %ymm1, %ymm1, %ymm5{z}{%k7}
vandnpd %zmm1, %zmm1, %zmm5
vandnpd %ymm1, %ymm1, %ymm5
vandnps %zmm1, %zmm1, %zmm5{%k7}
vandnps %ymm1, %ymm1, %ymm5{z}{%k7}
vandnps %zmm1, %zmm1, %zmm5
vandnps %ymm1, %ymm1, %ymm5
vpandn %ymm1, %ymm1, %ymm5
vpandnd %zmm1, %zmm1, %zmm5{%k7}
vpandnd %ymm1, %ymm1, %ymm5{z}{%k7}
vpandnd %zmm1, %zmm1, %zmm5
vpandnd %ymm1, %ymm1, %ymm5
vpandnq %zmm1, %zmm1, %zmm5{%k7}
vpandnq %ymm1, %ymm1, %ymm5{z}{%k7}
vpandnq %zmm1, %zmm1, %zmm5
vpandnq %ymm1, %ymm1, %ymm5
vxorpd %zmm1, %zmm1, %zmm5{%k7}
vxorpd %ymm1, %ymm1, %ymm5{z}{%k7}
vxorpd %zmm1, %zmm1, %zmm5
vxorpd %ymm1, %ymm1, %ymm5
vxorps %zmm1, %zmm1, %zmm5{%k7}
vxorps %ymm1, %ymm1, %ymm5{z}{%k7}
vxorps %zmm1, %zmm1, %zmm5
vxorps %ymm1, %ymm1, %ymm5
vpxor %ymm1, %ymm1, %ymm5
vpxord %zmm1, %zmm1, %zmm5{%k7}
vpxord %ymm1, %ymm1, %ymm5{z}{%k7}
vpxord %zmm1, %zmm1, %zmm5
vpxord %ymm1, %ymm1, %ymm5
vpxorq %zmm1, %zmm1, %zmm5{%k7}
vpxorq %ymm1, %ymm1, %ymm5{z}{%k7}
vpxorq %zmm1, %zmm1, %zmm5
vpxorq %ymm1, %ymm1, %ymm5

View File

@ -0,0 +1,19 @@
#as: -Os
#objdump: -drw
#name: optimized encoding 2 with -Os
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: f7 c7 7f 00 00 00 test \$0x7f,%edi
+[a-f0-9]+: 66 f7 c7 7f 00 test \$0x7f,%di
#pass

View File

@ -0,0 +1,13 @@
# Check instructions with optimized encoding
.allow_index_reg
.text
_start:
testl $0x7f, %eax
testw $0x7f, %ax
testb $0x7f, %al
test $0x7f, %ebx
test $0x7f, %bx
test $0x7f, %bl
test $0x7f, %edi
test $0x7f, %di

View File

@ -0,0 +1,12 @@
#as: -Os
#objdump: -drw
#name: optimized encoding 3 with -Os
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: a9 7f 00 00 00 test \$0x7f,%eax
#pass

View File

@ -0,0 +1,6 @@
# Check instructions with optimized encoding
.allow_index_reg
.text
_start:
{nooptimize} testl $0x7f, %eax

View File

@ -0,0 +1,53 @@
#as: -O
#objdump: -drw
#name: x86-64 optimized encoding 1 with -O
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: 48 25 00 00 00 00 and \$0x0,%rax 2: R_X86_64_32S foo
+[a-f0-9]+: 25 ff ff ff 7f and \$0x7fffffff,%eax
+[a-f0-9]+: 81 e3 ff ff ff 7f and \$0x7fffffff,%ebx
+[a-f0-9]+: 41 81 e6 ff ff ff 7f and \$0x7fffffff,%r14d
+[a-f0-9]+: 48 25 00 00 00 80 and \$0xffffffff80000000,%rax
+[a-f0-9]+: 48 81 e3 00 00 00 80 and \$0xffffffff80000000,%rbx
+[a-f0-9]+: 49 81 e6 00 00 00 80 and \$0xffffffff80000000,%r14
+[a-f0-9]+: a9 ff ff ff 7f test \$0x7fffffff,%eax
+[a-f0-9]+: f7 c3 ff ff ff 7f test \$0x7fffffff,%ebx
+[a-f0-9]+: 41 f7 c6 ff ff ff 7f test \$0x7fffffff,%r14d
+[a-f0-9]+: 48 a9 00 00 00 80 test \$0xffffffff80000000,%rax
+[a-f0-9]+: 48 f7 c3 00 00 00 80 test \$0xffffffff80000000,%rbx
+[a-f0-9]+: 49 f7 c6 00 00 00 80 test \$0xffffffff80000000,%r14
+[a-f0-9]+: 48 33 06 xor \(%rsi\),%rax
+[a-f0-9]+: 31 c0 xor %eax,%eax
+[a-f0-9]+: 31 db xor %ebx,%ebx
+[a-f0-9]+: 45 31 f6 xor %r14d,%r14d
+[a-f0-9]+: 48 31 d0 xor %rdx,%rax
+[a-f0-9]+: 48 31 d3 xor %rdx,%rbx
+[a-f0-9]+: 49 31 d6 xor %rdx,%r14
+[a-f0-9]+: 29 c0 sub %eax,%eax
+[a-f0-9]+: 29 db sub %ebx,%ebx
+[a-f0-9]+: 45 29 f6 sub %r14d,%r14d
+[a-f0-9]+: 48 29 d0 sub %rdx,%rax
+[a-f0-9]+: 48 29 d3 sub %rdx,%rbx
+[a-f0-9]+: 49 29 d6 sub %rdx,%r14
+[a-f0-9]+: 48 81 20 ff ff ff 7f andq \$0x7fffffff,\(%rax\)
+[a-f0-9]+: 48 81 20 00 00 00 80 andq \$0xffffffff80000000,\(%rax\)
+[a-f0-9]+: 48 f7 00 ff ff ff 7f testq \$0x7fffffff,\(%rax\)
+[a-f0-9]+: 48 f7 00 00 00 00 80 testq \$0xffffffff80000000,\(%rax\)
+[a-f0-9]+: b8 ff ff ff 7f mov \$0x7fffffff,%eax
+[a-f0-9]+: b8 ff ff ff 7f mov \$0x7fffffff,%eax
+[a-f0-9]+: 41 b8 ff ff ff 7f mov \$0x7fffffff,%r8d
+[a-f0-9]+: 41 b8 ff ff ff 7f mov \$0x7fffffff,%r8d
+[a-f0-9]+: b8 ff ff ff ff mov \$0xffffffff,%eax
+[a-f0-9]+: b8 ff ff ff ff mov \$0xffffffff,%eax
+[a-f0-9]+: 41 b8 ff ff ff ff mov \$0xffffffff,%r8d
+[a-f0-9]+: 41 b8 ff ff ff ff mov \$0xffffffff,%r8d
+[a-f0-9]+: b8 ff 03 00 00 mov \$0x3ff,%eax
+[a-f0-9]+: b8 ff 03 00 00 mov \$0x3ff,%eax
+[a-f0-9]+: 48 b8 00 00 00 00 01 00 00 00 movabs \$0x100000000,%rax
+[a-f0-9]+: 48 b8 00 00 00 00 01 00 00 00 movabs \$0x100000000,%rax
#pass

View File

@ -0,0 +1,47 @@
# Check 64bit instructions with optimized encoding
.allow_index_reg
.text
_start:
andq $foo, %rax
andq $((1<<31) - 1), %rax
andq $((1<<31) - 1), %rbx
andq $((1<<31) - 1), %r14
andq $-((1<<31)), %rax
andq $-((1<<31)), %rbx
andq $-((1<<31)), %r14
testq $((1<<31) - 1), %rax
testq $((1<<31) - 1), %rbx
testq $((1<<31) - 1), %r14
testq $-((1<<31)), %rax
testq $-((1<<31)), %rbx
testq $-((1<<31)), %r14
xorq (%rsi), %rax
xorq %rax, %rax
xorq %rbx, %rbx
xorq %r14, %r14
xorq %rdx, %rax
xorq %rdx, %rbx
xorq %rdx, %r14
subq %rax, %rax
subq %rbx, %rbx
subq %r14, %r14
subq %rdx, %rax
subq %rdx, %rbx
subq %rdx, %r14
andq $((1<<31) - 1), (%rax)
andq $-((1<<31)), (%rax)
testq $((1<<31) - 1), (%rax)
testq $-((1<<31)), (%rax)
mov $((1<<31) - 1),%rax
movq $((1<<31) - 1),%rax
mov $((1<<31) - 1),%r8
movq $((1<<31) - 1),%r8
mov $0xffffffff,%rax
movq $0xffffffff,%rax
mov $0xffffffff,%r8
movq $0xffffffff,%r8
mov $1023,%rax
movq $1023,%rax
mov $0x100000000,%rax
movq $0x100000000,%rax

View File

@ -0,0 +1,77 @@
#as: -O2
#objdump: -drw
#name: x86-64 optimized encoding 2 with -O2
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: 62 71 f5 4f 55 f9 vandnpd %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 f5 af 55 f9 vandnpd %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 f5 08 55 c1 vandnpd %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 f5 08 55 c1 vandnpd %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 f5 00 55 c9 vandnpd %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 f5 00 55 c9 vandnpd %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 71 74 4f 55 f9 vandnps %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 74 af 55 f9 vandnps %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 74 08 55 c1 vandnps %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 74 08 55 c1 vandnps %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 74 00 55 c9 vandnps %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 74 00 55 c9 vandnps %xmm17,%xmm17,%xmm1
+[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 71 75 4f df f9 vpandnd %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 75 af df f9 vpandnd %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 75 08 df c1 vpandnd %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 75 08 df c1 vpandnd %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 75 00 df c9 vpandnd %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 75 00 df c9 vpandnd %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 71 f5 4f df f9 vpandnq %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 f5 af df f9 vpandnq %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 f5 08 df c1 vpandnq %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 f5 08 df c1 vpandnq %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 f5 00 df c9 vpandnq %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 f5 00 df c9 vpandnq %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 71 f5 4f 57 f9 vxorpd %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 f5 af 57 f9 vxorpd %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 f5 08 57 c1 vxorpd %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 f5 08 57 c1 vxorpd %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 f5 00 57 c9 vxorpd %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 f5 00 57 c9 vxorpd %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 71 74 4f 57 f9 vxorps %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 74 af 57 f9 vxorps %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 74 08 57 c1 vxorps %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 74 08 57 c1 vxorps %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 74 00 57 c9 vxorps %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 74 00 57 c9 vxorps %xmm17,%xmm17,%xmm1
+[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 71 75 4f ef f9 vpxord %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 75 af ef f9 vpxord %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 75 08 ef c1 vpxord %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 75 08 ef c1 vpxord %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 75 00 ef c9 vpxord %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 75 00 ef c9 vpxord %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 71 f5 4f ef f9 vpxorq %zmm1,%zmm1,%zmm15\{%k7\}
+[a-f0-9]+: 62 71 f5 af ef f9 vpxorq %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15
+[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15
+[a-f0-9]+: 62 e1 f5 08 ef c1 vpxorq %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 e1 f5 08 ef c1 vpxorq %xmm1,%xmm1,%xmm16
+[a-f0-9]+: 62 b1 f5 00 ef c9 vpxorq %xmm17,%xmm17,%xmm1
+[a-f0-9]+: 62 b1 f5 00 ef c9 vpxorq %xmm17,%xmm17,%xmm1
#pass

View File

@ -0,0 +1,80 @@
# Check 64bit instructions with optimized encoding
.allow_index_reg
.text
_start:
vandnpd %zmm1, %zmm1, %zmm15{%k7}
vandnpd %ymm1, %ymm1, %ymm15{z}{%k7}
vandnpd %zmm1, %zmm1, %zmm15
vandnpd %ymm1, %ymm1, %ymm15
vandnpd %zmm1, %zmm1, %zmm16
vandnpd %ymm1, %ymm1, %ymm16
vandnpd %zmm17, %zmm17, %zmm1
vandnpd %ymm17, %ymm17, %ymm1
vandnps %zmm1, %zmm1, %zmm15{%k7}
vandnps %ymm1, %ymm1, %ymm15{z}{%k7}
vandnps %zmm1, %zmm1, %zmm15
vandnps %ymm1, %ymm1, %ymm15
vandnps %zmm1, %zmm1, %zmm16
vandnps %ymm1, %ymm1, %ymm16
vandnps %zmm17, %zmm17, %zmm1
vandnps %ymm17, %ymm17, %ymm1
vpandn %ymm1, %ymm1, %ymm15
vpandnd %zmm1, %zmm1, %zmm15{%k7}
vpandnd %ymm1, %ymm1, %ymm15{z}{%k7}
vpandnd %zmm1, %zmm1, %zmm15
vpandnd %ymm1, %ymm1, %ymm15
vpandnd %zmm1, %zmm1, %zmm16
vpandnd %ymm1, %ymm1, %ymm16
vpandnd %zmm17, %zmm17, %zmm1
vpandnd %ymm17, %ymm17, %ymm1
vpandnq %zmm1, %zmm1, %zmm15{%k7}
vpandnq %ymm1, %ymm1, %ymm15{z}{%k7}
vpandnq %zmm1, %zmm1, %zmm15
vpandnq %ymm1, %ymm1, %ymm15
vpandnq %zmm1, %zmm1, %zmm16
vpandnq %ymm1, %ymm1, %ymm16
vpandnq %zmm17, %zmm17, %zmm1
vpandnq %ymm17, %ymm17, %ymm1
vxorpd %zmm1, %zmm1, %zmm15{%k7}
vxorpd %ymm1, %ymm1, %ymm15{z}{%k7}
vxorpd %zmm1, %zmm1, %zmm15
vxorpd %ymm1, %ymm1, %ymm15
vxorpd %zmm1, %zmm1, %zmm16
vxorpd %ymm1, %ymm1, %ymm16
vxorpd %zmm17, %zmm17, %zmm1
vxorpd %ymm17, %ymm17, %ymm1
vxorps %zmm1, %zmm1, %zmm15{%k7}
vxorps %ymm1, %ymm1, %ymm15{z}{%k7}
vxorps %zmm1, %zmm1, %zmm15
vxorps %ymm1, %ymm1, %ymm15
vxorps %zmm1, %zmm1, %zmm16
vxorps %ymm1, %ymm1, %ymm16
vxorps %zmm17, %zmm17, %zmm1
vxorps %ymm17, %ymm17, %ymm1
vpxor %ymm1, %ymm1, %ymm15
vpxord %zmm1, %zmm1, %zmm15{%k7}
vpxord %ymm1, %ymm1, %ymm15{z}{%k7}
vpxord %zmm1, %zmm1, %zmm15
vpxord %ymm1, %ymm1, %ymm15
vpxord %zmm1, %zmm1, %zmm16
vpxord %ymm1, %ymm1, %ymm16
vpxord %zmm17, %zmm17, %zmm1
vpxord %ymm17, %ymm17, %ymm1
vpxorq %zmm1, %zmm1, %zmm15{%k7}
vpxorq %ymm1, %ymm1, %ymm15{z}{%k7}
vpxorq %zmm1, %zmm1, %zmm15
vpxorq %ymm1, %ymm1, %ymm15
vpxorq %zmm1, %zmm1, %zmm16
vpxorq %ymm1, %ymm1, %ymm16
vpxorq %zmm17, %zmm17, %zmm1
vpxorq %ymm17, %ymm17, %ymm1

View File

@ -0,0 +1,27 @@
#as: -Os
#objdump: -drw
#name: x86-64 optimized encoding 3 with -Os
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: a8 7f test \$0x7f,%al
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: f6 c3 7f test \$0x7f,%bl
+[a-f0-9]+: 40 f6 c7 7f test \$0x7f,%dil
+[a-f0-9]+: 40 f6 c7 7f test \$0x7f,%dil
+[a-f0-9]+: 40 f6 c7 7f test \$0x7f,%dil
+[a-f0-9]+: 40 f6 c7 7f test \$0x7f,%dil
+[a-f0-9]+: 41 f6 c1 7f test \$0x7f,%r9b
+[a-f0-9]+: 41 f6 c1 7f test \$0x7f,%r9b
+[a-f0-9]+: 41 f6 c1 7f test \$0x7f,%r9b
+[a-f0-9]+: 41 f6 c1 7f test \$0x7f,%r9b
#pass

View File

@ -0,0 +1,21 @@
# Check 64bit instructions with optimized encoding
.allow_index_reg
.text
_start:
testq $0x7f, %rax
testl $0x7f, %eax
testw $0x7f, %ax
testb $0x7f, %al
test $0x7f, %rbx
test $0x7f, %ebx
test $0x7f, %bx
test $0x7f, %bl
test $0x7f, %rdi
test $0x7f, %edi
test $0x7f, %di
test $0x7f, %dil
test $0x7f, %r9
test $0x7f, %r9d
test $0x7f, %r9w
test $0x7f, %r9b

View File

@ -0,0 +1,12 @@
#as: -Os
#objdump: -drw
#name: x86-64 optimized encoding 4 with -Os
.*: +file format .*
Disassembly of section .text:
0+ <_start>:
+[a-f0-9]+: a9 7f 00 00 00 test \$0x7f,%eax
#pass

View File

@ -0,0 +1,6 @@
# Check 64bit instructions with optimized encoding
.allow_index_reg
.text
_start:
{nooptimize} testl $0x7f, %eax

View File

@ -1,3 +1,17 @@
2018-02-27 H.J. Lu <hongjiu.lu@intel.com>
PR gas/22871
* i386-gen.c (opcode_modifiers): Add Optimize.
* i386-opc.h (Optimize): New enum.
(i386_opcode_modifier): Add optimize.
* i386-opc.tbl: Add "Optimize" to "mov $imm, reg",
"sub reg, reg/mem", "test $imm, acc", "test $imm, reg/mem",
"and $imm, acc", "and $imm, reg/mem", "xor reg, reg/mem",
"movq $imm, reg" and AVX256 and AVX512 versions of vandnps,
vandnpd, vpandn, vpandnd, vpandnq, vxorps, vxorpd, vpxor,
vpxord and vpxorq.
* i386-tbl.h: Regenerated.
2018-02-26 Alan Modra <amodra@gmail.com>
* crx-dis.c (getregliststring): Allocate a large enough buffer

View File

@ -646,6 +646,7 @@ static bitfield opcode_modifiers[] =
BITFIELD (Disp8MemShift),
BITFIELD (NoDefMask),
BITFIELD (ImplicitQuadGroup),
BITFIELD (Optimize),
BITFIELD (OldGcc),
BITFIELD (ATTMnemonic),
BITFIELD (ATTSyntax),

View File

@ -601,6 +601,9 @@ enum
*/
ImplicitQuadGroup,
/* Support encoding optimization. */
Optimize,
/* Compatible with old (<= 2.8.1) versions of gcc */
OldGcc,
/* AT&T mnemonic. */
@ -678,6 +681,7 @@ typedef struct i386_opcode_modifier
unsigned int disp8memshift:3;
unsigned int nodefmask:1;
unsigned int implicitquadgroup:1;
unsigned int optimize:1;
unsigned int oldgcc:1;
unsigned int attmnemonic:1;
unsigned int attsyntax:1;

View File

@ -27,8 +27,8 @@ mov, 2, 0x88, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|HLEPrefixOk=3,
// In the 64bit mode the short form mov immediate is redefined to have
// 64bit value.
mov, 2, 0xb0, None, 1, 0, W|ShortForm|No_sSuf|No_qSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32 }
mov, 2, 0xc6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|HLEPrefixOk=3, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
mov, 2, 0xb0, None, 1, Cpu64, W|ShortForm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Imm64, Reg64 }
mov, 2, 0xc6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|HLEPrefixOk=3|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
mov, 2, 0xb0, None, 1, Cpu64, W|ShortForm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
// The segment register moves accept WordReg so that a segment register
// can be copied to a 32 bit register, and vice versa, without using a
// size prefix. When moving to a 32 bit register, the upper 16 bits
@ -166,7 +166,7 @@ add, 2, 0x80, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8
inc, 1, 0x40, None, 1, CpuNo64, ShortForm|No_bSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Reg32 }
inc, 1, 0xfe, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
sub, 2, 0x28, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
sub, 2, 0x28, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
sub, 2, 0x83, 0x5, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
sub, 2, 0x2c, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
sub, 2, 0x80, 0x5, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
@ -186,20 +186,20 @@ cmp, 2, 0x80, 0x7, 1, 0, W|Modrm|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Re
test, 2, 0x84, None, 1, 0, W|CheckRegSize|Modrm|No_sSuf|No_ldSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|Byte|Word|Dword|Qword|BaseIndex }
test, 2, 0x84, None, 1, 0, W|Modrm|No_sSuf|No_ldSuf, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
test, 2, 0xa8, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
test, 2, 0xf6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
test, 2, 0xa8, None, 1, 0, W|No_sSuf|No_ldSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
test, 2, 0xf6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
and, 2, 0x20, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
and, 2, 0x83, 0x4, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
and, 2, 0x24, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
and, 2, 0x80, 0x4, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
and, 2, 0x24, None, 1, 0, W|No_sSuf|No_ldSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
and, 2, 0x80, 0x4, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
or, 2, 0x8, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
or, 2, 0x83, 0x1, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
or, 2, 0xc, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
or, 2, 0x80, 0x1, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
xor, 2, 0x30, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
xor, 2, 0x30, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
xor, 2, 0x83, 0x6, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
xor, 2, 0x34, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
xor, 2, 0x80, 0x6, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
@ -830,6 +830,7 @@ rex.wrxb, 0, 0x4f, None, 1, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ld
{vex3}, 0, 0x5, None, 0, 0, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
{evex}, 0, 0x6, None, 0, 0, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
{rex}, 0, 0x7, None, 0, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
{nooptimize}, 0, 0x8, None, 0, 0, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
// 486 extensions.
@ -964,8 +965,8 @@ movd, 2, 0xf7e, None, 2, CpuMMX|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|
// we only mark constants larger than 32bit as Disp64.
movq, 2, 0xa0, None, 1, Cpu64, D|W|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Disp64|Unspecified|Qword, Acc|Qword }
movq, 2, 0x88, None, 1, Cpu64, D|W|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3, { Reg64, Reg64|Unspecified|Qword|BaseIndex }
movq, 2, 0xc6, 0x0, 1, Cpu64, W|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
movq, 2, 0xb0, None, 1, Cpu64, W|ShortForm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm64, Reg64 }
movq, 2, 0xc6, 0x0, 1, Cpu64, W|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3|Optimize, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
movq, 2, 0xb0, None, 1, Cpu64, W|ShortForm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
movq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
movq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
movq, 2, 0x666e, None, 1, CpuAVX|Cpu64, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Qword|Unspecified|BaseIndex, RegXMM }
@ -1843,8 +1844,8 @@ vaddsd, 3, 0xf258, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|Ign
vaddss, 3, 0xf358, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
vaddsubpd, 3, 0x66d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vaddsubps, 3, 0xf2d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vandpd, 3, 0x6654, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vandps, 3, 0x54, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vblendpd, 4, 0x660d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
@ -2275,8 +2276,8 @@ vunpckhpd, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|Ch
vunpckhps, 3, 0x15, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vunpcklpd, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vunpcklps, 3, 0x14, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
vzeroall, 0, 0x77, None, 1, CpuAVX, Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
vzeroupper, 0, 0x77, None, 1, CpuAVX, Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
@ -2301,7 +2302,7 @@ vpaddusb, 3, 0x66dc, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|
vpaddusw, 3, 0x66dd, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpalignr, 4, 0x660f, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpand, 3, 0x66db, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpandn, 3, 0x66df, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpandn, 3, 0x66df, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpavgb, 3, 0x66e0, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpavgw, 3, 0x66e3, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpblendvb, 4, 0x664c, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexSources=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
@ -2397,7 +2398,7 @@ vpunpcklbw, 3, 0x6660, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=
vpunpckldq, 3, 0x6662, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpunpcklqdq, 3, 0x666c, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpunpcklwd, 3, 0x6661, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpxor, 3, 0x66ef, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
vpxor, 3, 0x66ef, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
// New AVX2 instructions.
@ -3935,22 +3936,22 @@ vrsqrt14pd, 2, 0x664E, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=1|V
vpaddd, 3, 0x66FE, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpandd, 3, 0x66DB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpandnd, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpandnd, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpord, 3, 0x66EB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpsubd, 3, 0x66FA, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpunpckhdq, 3, 0x666A, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpunpckldq, 3, 0x6662, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpxord, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpxord, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpaddq, 3, 0x66D4, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpandnq, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpandnq, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpandq, 3, 0x66DB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpmuludq, 3, 0x66F4, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vporq, 3, 0x66EB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpsubq, 3, 0x66FB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpunpckhqdq, 3, 0x666D, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpunpcklqdq, 3, 0x666C, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpxorq, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vpxorq, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vunpckhpd, 3, 0x6615, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vunpcklpd, 3, 0x6614, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
@ -4897,7 +4898,7 @@ vpaddd, 3, 0x66FE, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOp
vpandd, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpandd, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpandnd, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpandnd, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpandnd, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpord, 3, 0x66EB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpord, 3, 0x66EB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vprold, 3, 0x6672, 1, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, RegXMM }
@ -4929,12 +4930,12 @@ vpunpckhdq, 3, 0x666A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|V
vpunpckldq, 3, 0x6662, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpunpckldq, 3, 0x6662, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpxord, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpxord, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpxord, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpaddq, 3, 0x66D4, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpaddq, 3, 0x66D4, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpandnq, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpandnq, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpandnq, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpandq, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpandq, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpmuludq, 3, 0x66F4, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
@ -4968,7 +4969,7 @@ vpunpckhqdq, 3, 0x666D, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|
vpunpcklqdq, 3, 0x666C, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpunpcklqdq, 3, 0x666C, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpxorq, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vpxorq, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vpxorq, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vshufpd, 4, 0x66C6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vshufpd, 4, 0x66C6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vunpckhpd, 3, 0x6615, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
@ -5670,31 +5671,31 @@ ktestw, 2, 0x99, None, 1, CpuAVX512DQ, Modrm|Vex=1|VexOpcode=0|VexW=1|IgnoreSize
kshiftlb, 3, 0x6632, None, 1, CpuAVX512DQ, Modrm|Vex=1|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegMask, RegMask }
kshiftrb, 3, 0x6630, None, 1, CpuAVX512DQ, Modrm|Vex=1|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegMask, RegMask }
vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vandpd, 3, 0x6654, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vandpd, 3, 0x6654, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vandpd, 3, 0x6654, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vorpd, 3, 0x6656, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vorpd, 3, 0x6656, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vorpd, 3, 0x6656, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vandnps, 3, 0x55, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vandnps, 3, 0x55, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vandnps, 3, 0x55, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vandnps, 3, 0x55, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vandnps, 3, 0x55, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vandps, 3, 0x54, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vandps, 3, 0x54, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vandps, 3, 0x54, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vorps, 3, 0x56, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vorps, 3, 0x56, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vorps, 3, 0x56, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vxorps, 3, 0x57, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vxorps, 3, 0x57, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
vxorps, 3, 0x57, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
vxorps, 3, 0x57, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vxorps, 3, 0x57, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
vbroadcastf32x2, 2, 0x6619, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=1|VexW=1|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegZMM }
vbroadcastf32x2, 2, 0x6619, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=1|VexW=1|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM }

File diff suppressed because it is too large Load Diff