Chapter 3 Instruction Set Mapping
This chapter provides a general mapping between the Solaris x86 assembly language
mnemonics and the Intel or Advanced Micro Devices (AMD) mnemonics.
Instruction Overview
It is beyond the scope of this manual to document the x86 architecture instruction
set. This chapter provides a general mapping between the Solaris x86 assembly language
mnemonics and the Intel or AMD mnemonics to enable you to refer to your vendor's documentation
for detailed information about a specific instruction. Instructions are grouped by
functionality in tables with the following columns:
- Solaris mnemonic
- Intel/AMD mnemonic
- Description (short)
- Notes
For certain Solaris mnemonics, the allowed data type suffixes for that mnemonic
are indicated in braces ({}) following the mnemonic. For example, bswap{lq} indicates that the following mnemonics are valid: bswap, bswapl (which is the default and equivalent to bswap),
and bswapq. See Instructions for information on data type suffixes.
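The brace notation can be read mechanically. The following Python sketch is an illustration only and is not part of the assembler: expand_mnemonic is a hypothetical helper that expands a table entry into the mnemonics it stands for, assuming (as stated above for bswap) that the unsuffixed form is also accepted.

```python
import re

def expand_mnemonic(entry):
    """Expand a table entry such as 'bswap{lq}' into concrete mnemonics."""
    match = re.search(r"\{([a-z]+)\}", entry)
    if match is None:
        return [entry]                      # no suffix group: already concrete
    head, tail = entry[:match.start()], entry[match.end():]
    # Unsuffixed form first, then one mnemonic per allowed suffix letter.
    return [head + tail] + [head + suffix + tail for suffix in match.group(1)]

print(expand_mnemonic("bswap{lq}"))
print(expand_mnemonic("cmpxchg{bwlq}"))
```

Whether a particular expansion is accepted on a given target still depends on the Notes column of the corresponding table.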
To locate a specific Solaris x86 mnemonic, look up the mnemonic in the index.
General-Purpose Instructions
The general-purpose instructions perform basic data movement, memory addressing,
arithmetic and logical operations, program flow control, input/output, and string
operations on integer, pointer, and BCD data types.
Data Transfer Instructions
The data transfer instructions move data between memory and
the general-purpose and segment registers, and perform operations such as conditional
moves, stack access, and data conversion.
Table 3–1 Data Transfer Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
bswap{lq}
|
BSWAP
|
byte swap
|
bswapq valid only under -xarch=amd64
|
|
cbtw
|
CBW
|
convert byte to word
|
|
|
cltd
|
CDQ
|
convert doubleword to quadword
|
%eax -> %edx:%eax
|
|
cltq
|
CDQE
|
convert doubleword to quadword
|
%eax -> %rax
cltq valid only under -xarch=amd64
|
|
cmova{wlq}, cmov{wlq}.a
|
CMOVA
|
conditional move if above
|
cmovaq valid only under -xarch=amd64
|
|
cmovae{wlq}, cmov{wlq}.ae
|
CMOVAE
|
conditional move if above or equal
|
cmovaeq valid only under -xarch=amd64
|
|
cmovb{wlq}, cmov{wlq}.b
|
CMOVB
|
conditional move if below
|
cmovbq valid only under -xarch=amd64
|
|
cmovbe{wlq}, cmov{wlq}.be
|
CMOVBE
|
conditional move if below or equal
|
cmovbeq valid only under -xarch=amd64
|
|
cmovc{wlq}, cmov{wlq}.c
|
CMOVC
|
conditional move if carry
|
cmovcq valid only under -xarch=amd64
|
|
cmove{wlq}, cmov{wlq}.e
|
CMOVE
|
conditional move if equal
|
cmoveq valid only under -xarch=amd64
|
|
cmovg{wlq}, cmov{wlq}.g
|
CMOVG
|
conditional move if greater
|
cmovgq valid only under -xarch=amd64
|
|
cmovge{wlq}, cmov{wlq}.ge
|
CMOVGE
|
conditional move if greater or equal
|
cmovgeq valid only under -xarch=amd64
|
|
cmovl{wlq}, cmov{wlq}.l
|
CMOVL
|
conditional move if less
|
cmovlq valid only under -xarch=amd64
|
|
cmovle{wlq}, cmov{wlq}.le
|
CMOVLE
|
conditional move if less or equal
|
cmovleq valid only under -xarch=amd64
|
|
cmovna{wlq}, cmov{wlq}.na
|
CMOVNA
|
conditional move if not above
|
cmovnaq valid only under -xarch=amd64
|
|
cmovnae{wlq}, cmov{wlq}.nae
|
CMOVNAE
|
conditional move if not above or equal
|
cmovnaeq valid only under -xarch=amd64
|
|
cmovnb{wlq}, cmov{wlq}.nb
|
CMOVNB
|
conditional move if not below
|
cmovnbq valid only under -xarch=amd64
|
|
cmovnbe{wlq}, cmov{wlq}.nbe
|
CMOVNBE
|
conditional move if not below or equal
|
cmovnbeq valid only under -xarch=amd64
|
|
cmovnc{wlq}, cmov{wlq}.nc
|
CMOVNC
|
conditional move if not carry
|
cmovncq valid only under -xarch=amd64
|
|
cmovne{wlq}, cmov{wlq}.ne
|
CMOVNE
|
conditional move if not equal
|
cmovneq valid only under -xarch=amd64
|
|
cmovng{wlq}, cmov{wlq}.ng
|
CMOVNG
|
conditional move if not greater
|
cmovngq valid only under -xarch=amd64
|
|
cmovnge{wlq}, cmov{wlq}.nge
|
CMOVNGE
|
conditional move if not greater or equal
|
cmovngeq valid only under -xarch=amd64
|
|
cmovnl{wlq}, cmov{wlq}.nl
|
CMOVNL
|
conditional move if not less
|
cmovnlq valid only under -xarch=amd64
|
|
cmovnle{wlq}, cmov{wlq}.nle
|
CMOVNLE
|
conditional move if not less or equal
|
cmovnleq valid only under -xarch=amd64
|
|
cmovno{wlq}, cmov{wlq}.no
|
CMOVNO
|
conditional move if not overflow
|
cmovnoq valid only under -xarch=amd64
|
|
cmovnp{wlq}, cmov{wlq}.np
|
CMOVNP
|
conditional move if not parity
|
cmovnpq valid only under -xarch=amd64
|
|
cmovns{wlq}, cmov{wlq}.ns
|
CMOVNS
|
conditional move if not sign (non-negative)
|
cmovnsq valid only under -xarch=amd64
|
|
cmovnz{wlq}, cmov{wlq}.nz
|
CMOVNZ
|
conditional move if not zero
|
cmovnzq valid only under -xarch=amd64
|
|
cmovo{wlq}, cmov{wlq}.o
|
CMOVO
|
conditional move if overflow
|
cmovoq valid only under -xarch=amd64
|
|
cmovp{wlq}, cmov{wlq}.p
|
CMOVP
|
conditional move if parity
|
cmovpq valid only under -xarch=amd64
|
|
cmovpe{wlq}, cmov{wlq}.pe
|
CMOVPE
|
conditional move if parity even
|
cmovpeq valid only under -xarch=amd64
|
|
cmovpo{wlq}, cmov{wlq}.po
|
CMOVPO
|
conditional move if parity odd
|
cmovpoq valid only under -xarch=amd64
|
|
cmovs{wlq}, cmov{wlq}.s
|
CMOVS
|
conditional move if sign (negative)
|
cmovsq valid only under -xarch=amd64
|
|
cmovz{wlq}, cmov{wlq}.z
|
CMOVZ
|
conditional move if zero
|
cmovzq valid only under -xarch=amd64
|
|
cmpxchg{bwlq}
|
CMPXCHG
|
compare and exchange
|
cmpxchgq valid only under -xarch=amd64
|
|
cmpxchg8b
|
CMPXCHG8B
|
compare and exchange 8 bytes
|
|
|
cqtd
|
CQO
|
convert quadword to octword
|
%rax -> %rdx:%rax
cqtd valid only under -xarch=amd64
|
|
cqto
|
CQO
|
convert quadword to octword
|
%rax -> %rdx:%rax
cqto valid only under -xarch=amd64
|
|
cwtd
|
CWD
|
convert word to doubleword
|
%ax -> %dx:%ax
|
|
cwtl
|
CWDE
|
convert word to doubleword in %eax register
|
%ax -> %eax
|
|
mov{bwlq}
|
MOV
|
move data between immediate values, general-purpose registers, segment registers,
and memory
|
movq valid only under -xarch=amd64
|
|
movabs{bwlq}
|
MOVABS
|
move immediate value to register
|
movabs valid only under -xarch=amd64
|
|
movabs{bwlq}A
|
MOVABS
|
move immediate value to register {AL, AX, EAX, RAX}
|
movabs valid only under -xarch=amd64
|
|
movsb{wlq}, movsw{lq}
|
MOVSX
|
move and sign extend
|
movsbq and movswq valid only under -xarch=amd64
|
|
movzb{wlq}, movzw{lq}
|
MOVZX
|
move and zero extend
|
movzbq and movzwq valid only under -xarch=amd64
|
|
pop{wlq}
|
POP
|
pop stack
|
popq valid only under -xarch=amd64
|
|
popaw
|
POPA
|
pop general-purpose registers from stack
|
popaw invalid under -xarch=amd64
|
|
popal, popa
|
POPAD
|
pop general-purpose registers from stack
|
invalid under -xarch=amd64
|
|
push{wlq}
|
PUSH
|
push onto stack
|
pushq valid only under -xarch=amd64
|
|
pushaw
|
PUSHA
|
push general-purpose registers onto stack
|
pushaw invalid under -xarch=amd64
|
|
pushal, pusha
|
PUSHAD
|
push general-purpose registers onto stack
|
invalid under -xarch=amd64
|
|
xadd{bwlq}
|
XADD
|
exchange and add
|
xaddq valid only under -xarch=amd64
|
|
xchg{bwlq}
|
XCHG
|
exchange
|
xchgq valid only under -xarch=amd64
|
|
xchg{bwlq}A
|
XCHG
|
exchange
|
xchgqA valid only under -xarch=amd64
|
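The conversion and byte-swap entries in Table 3–1 can be modeled in a few lines. The Python sketch below is illustrative only: the function names mirror the Solaris mnemonics, registers are passed and returned as plain unsigned integers, and sign_extend is a hypothetical helper introduced for this example.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to a to_bits-wide unsigned integer."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):      # sign bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

def cwtl(ax):
    """cwtl (CWDE): %ax -> %eax."""
    return sign_extend(ax, 16, 32)

def cltd(eax):
    """cltd (CDQ): %eax -> %edx:%eax, returned as (edx, eax)."""
    wide = sign_extend(eax, 32, 64)
    return wide >> 32, wide & 0xFFFFFFFF

def movzbl(byte):
    """movzbl (MOVZX): zero-extend a byte to 32 bits."""
    return byte & 0xFF

def bswapl(value):
    """bswapl (BSWAP): reverse the byte order of a 32-bit value."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")
```

For instance, cltd(0x80000000) returns (0xFFFFFFFF, 0x80000000), because the sign bit of %eax is copied into every bit of %edx.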
Binary Arithmetic Instructions
The binary arithmetic instructions perform basic integer computations
on operands in memory or the general-purpose registers.
Table 3–2 Binary Arithmetic Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
adc{bwlq}
|
ADC
|
add with carry
|
adcq valid only under -xarch=amd64
|
|
add{bwlq}
|
ADD
|
integer add
|
addq valid only under -xarch=amd64
|
|
cmp{bwlq}
|
CMP
|
compare
|
cmpq valid only under -xarch=amd64
|
|
dec{bwlq}
|
DEC
|
decrement
|
decq valid only under -xarch=amd64
|
|
div{bwlq}
|
DIV
|
divide (unsigned)
|
divq valid only under -xarch=amd64
|
|
idiv{bwlq}
|
IDIV
|
divide (signed)
|
idivq valid only under -xarch=amd64
|
|
imul{bwlq}
|
IMUL
|
multiply (signed)
|
imulq valid only under -xarch=amd64
|
|
inc{bwlq}
|
INC
|
increment
|
incq valid only under -xarch=amd64
|
|
mul{bwlq}
|
MUL
|
multiply (unsigned)
|
mulq valid only under -xarch=amd64
|
|
neg{bwlq}
|
NEG
|
negate
|
negq valid only under -xarch=amd64
|
|
sbb{bwlq}
|
SBB
|
subtract with borrow
|
sbbq valid only under -xarch=amd64
|
|
sub{bwlq}
|
SUB
|
subtract
|
subq valid only under -xarch=amd64
|
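A common use of adc and sbb is arithmetic on values wider than one register: the low halves are combined with add or sub, and the carry or borrow is then folded into the high halves. The Python sketch below models that pattern for 64-bit values held as two 32-bit halves. The helper names are hypothetical and the model tracks only the carry flag.

```python
MASK32 = 0xFFFFFFFF

def add32(a, b, carry_in=0):
    """Model addl (carry_in=0) or adcl (carry_in=CF); returns (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def sub32(a, b, borrow_in=0):
    """Model subl (borrow_in=0) or sbbl (borrow_in=CF); returns (result, borrow_out)."""
    diff = (a & MASK32) - (b & MASK32) - borrow_in
    return diff & MASK32, int(diff < 0)

def add64(a_lo, a_hi, b_lo, b_hi):
    lo, carry = add32(a_lo, b_lo)            # addl on the low halves
    hi, carry = add32(a_hi, b_hi, carry)     # adcl on the high halves
    return lo, hi, carry

def sub64(a_lo, a_hi, b_lo, b_hi):
    lo, borrow = sub32(a_lo, b_lo)           # subl on the low halves
    hi, borrow = sub32(a_hi, b_hi, borrow)   # sbbl on the high halves
    return lo, hi, borrow
```

Under -xarch=amd64 the same values fit in one register and a single addq or subq suffices.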
Decimal Arithmetic Instructions
The decimal arithmetic instructions perform decimal arithmetic
on binary coded decimal (BCD) data.
Table 3–3 Decimal Arithmetic Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
aaa
|
AAA
|
ASCII adjust after addition
|
invalid under -xarch=amd64
|
|
aad
|
AAD
|
ASCII adjust before division
|
invalid under -xarch=amd64
|
|
aam
|
AAM
|
ASCII adjust after multiplication
|
invalid under -xarch=amd64
|
|
aas
|
AAS
|
ASCII adjust after subtraction
|
invalid under -xarch=amd64
|
|
daa
|
DAA
|
decimal adjust after addition
|
invalid under -xarch=amd64
|
|
das
|
DAS
|
decimal adjust after subtraction
|
invalid under -xarch=amd64
|
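As an illustration of what "decimal adjust" means, the Python sketch below models an 8-bit add followed by daa on two packed BCD bytes (two decimal digits per byte). It follows the adjustment rule documented by Intel and AMD for DAA. The function name add_packed_bcd is hypothetical, and only the %al result and the carry flag are modeled.

```python
def add_packed_bcd(a, b):
    """Model `addb` followed by `daa`; returns (al, carry_flag)."""
    raw = (a & 0xFF) + (b & 0xFF)
    al = raw & 0xFF
    cf = raw > 0xFF                                   # carry out of bit 7
    af = ((a & 0x0F) + (b & 0x0F)) > 0x0F             # carry out of bit 3
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:                         # low digit needs correction
        al = (al + 0x06) & 0xFF
    if old_al > 0x99 or old_cf:                       # high digit needs correction
        al = (al + 0x60) & 0xFF
        cf = True
    else:
        cf = False
    return al, int(cf)
```

Adding 0x19 and 0x28 this way gives 0x47 with the carry clear, which is the BCD encoding of 19 + 28 = 47.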
Logical Instructions
The logical
instructions perform basic logical operations on their operands.
Table 3–4 Logical Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
and{bwlq}
|
AND
|
bitwise logical AND
|
andq valid only under -xarch=amd64
|
|
not{bwlq}
|
NOT
|
bitwise logical NOT
|
notq valid only under -xarch=amd64
|
|
or{bwlq}
|
OR
|
bitwise logical OR
|
orq valid only under -xarch=amd64
|
|
xor{bwlq}
|
XOR
|
bitwise logical exclusive OR
|
xorq valid only under -xarch=amd64
|
Shift and Rotate Instructions
The shift
and rotate instructions shift and rotate the bits in their operands.
Table 3–5 Shift and Rotate Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
rcl{bwlq}
|
RCL
|
rotate through carry left
|
rclq valid only under -xarch=amd64
|
|
rcr{bwlq}
|
RCR
|
rotate through carry right
|
rcrq valid only under -xarch=amd64
|
|
rol{bwlq}
|
ROL
|
rotate left
|
rolq valid only under -xarch=amd64
|
|
ror{bwlq}
|
ROR
|
rotate right
|
rorq valid only under -xarch=amd64
|
|
sal{bwlq}
|
SAL
|
shift arithmetic left
|
salq valid only under -xarch=amd64
|
|
sar{bwlq}
|
SAR
|
shift arithmetic right
|
sarq valid only under -xarch=amd64
|
|
shl{bwlq}
|
SHL
|
shift logical left
|
shlq valid only under -xarch=amd64
|
|
shld{bwlq}
|
SHLD
|
shift left double
|
shldq valid only under -xarch=amd64
|
|
shr{bwlq}
|
SHR
|
shift logical right
|
shrq valid only under -xarch=amd64
|
|
shrd{bwlq}
|
SHRD
|
shift right double
|
shrdq valid only under -xarch=amd64
|
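The differences between the logical shifts, the arithmetic shift, and the rotates are easiest to see on concrete bit patterns. The Python sketch below is a simplified model, assuming a shift count smaller than the operand width and ignoring all flags except the carry used by rcl. The function names mirror the mnemonics without their size suffix.

```python
def _mask(bits):
    return (1 << bits) - 1

def shr(value, count, bits=32):
    """Shift logical right: vacated high bits are filled with zeros."""
    return (value & _mask(bits)) >> count

def sar(value, count, bits=32):
    """Shift arithmetic right: vacated high bits are filled with the sign bit."""
    value &= _mask(bits)
    if value & (1 << (bits - 1)):
        value -= 1 << bits                  # reinterpret as a negative number
    return (value >> count) & _mask(bits)

def rol(value, count, bits=32):
    """Rotate left: bits shifted out at the top re-enter at the bottom."""
    count %= bits
    value &= _mask(bits)
    return ((value << count) | (value >> (bits - count))) & _mask(bits)

def ror(value, count, bits=32):
    """Rotate right, expressed as a left rotation by the complementary count."""
    return rol(value, -count % bits, bits)

def rcl(value, carry, count, bits=32):
    """Rotate through carry left; returns (result, carry_flag)."""
    value &= _mask(bits)
    for _ in range(count):
        carry_out = value >> (bits - 1)
        value = ((value << 1) | carry) & _mask(bits)
        carry = carry_out
    return value, carry
```

With the operand 0x80000000 and a count of 4, shr yields 0x08000000 while sar yields 0xF8000000.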
Bit and Byte Instructions
The bit
instructions test and modify individual bits in operands. The byte instructions set
the value of a byte operand to indicate the status of flags in the %eflags register.
Table 3–6 Bit and Byte Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
bsf{wlq}
|
BSF
|
bit scan forward
|
bsfq valid only under -xarch=amd64
|
|
bsr{wlq}
|
BSR
|
bit scan reverse
|
bsrq valid only under -xarch=amd64
|
|
bt{wlq}
|
BT
|
bit test
|
btq valid only under -xarch=amd64
|
|
btc{wlq}
|
BTC
|
bit test and complement
|
btcq valid only under -xarch=amd64
|
|
btr{wlq}
|
BTR
|
bit test and reset
|
btrq valid only under -xarch=amd64
|
|
bts{wlq}
|
BTS
|
bit test and set
|
btsq valid only under -xarch=amd64
|
|
seta
|
SETA
|
set byte if above
|
|
|
setae
|
SETAE
|
set byte if above or equal
|
|
|
setb
|
SETB
|
set byte if below
|
|
|
setbe
|
SETBE
|
set byte if below or equal
|
|
|
setc
|
SETC
|
set byte if carry
|
|
|
sete
|
SETE
|
set byte if equal
|
|
|
setg
|
SETG
|
set byte if greater
|
|
|
setge
|
SETGE
|
set byte if greater or equal
|
|
|
setl
|
SETL
|
set byte if less
|
|
|
setle
|
SETLE
|
set byte if less or equal
|
|
|
setna
|
SETNA
|
set byte if not above
|
|
|
setnae
|
SETNAE
|
set byte if not above or equal
|
|
|
setnb
|
SETNB
|
set byte if not below
|
|
|
setnbe
|
SETNBE
|
set byte if not below or equal
|
|
|
setnc
|
SETNC
|
set byte if not carry
|
|
|
setne
|
SETNE
|
set byte if not equal
|
|
|
setng
|
SETNG
|
set byte if not greater
|
|
|
setnge
|
SETNGE
|
set byte if not greater or equal
|
|
|
setnl
|
SETNL
|
set byte if not less
|
|
|
setnle
|
SETNLE
|
set byte if not less or equal
|
|
|
setno
|
SETNO
|
set byte if not overflow
|
|
|
setnp
|
SETNP
|
set byte if not parity
|
|
|
setns
|
SETNS
|
set byte if not sign (non-negative)
|
|
|
setnz
|
SETNZ
|
set byte if not zero
|
|
|
seto
|
SETO
|
set byte if overflow
|
|
|
setp
|
SETP
|
set byte if parity
|
|
|
setpe
|
SETPE
|
set byte if parity even
|
|
|
setpo
|
SETPO
|
set byte if parity odd
|
|
|
sets
|
SETS
|
set byte if sign (negative)
|
|
|
setz
|
SETZ
|
set byte if zero
|
|
|
test{bwlq}
|
TEST
|
logical compare
|
testq valid only under -xarch=amd64
|
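The bit scan and bit test entries can be modeled as follows. This Python sketch is illustrative only: it treats operands as unsigned integers, returns the tested bit in place of the carry flag, and raises an error for a zero source to bsf or bsr, where the hardware leaves the destination undefined.

```python
def bsf(value):
    """Bit scan forward: index of the least significant set bit."""
    if value == 0:
        raise ValueError("destination is undefined when the source is zero")
    return (value & -value).bit_length() - 1

def bsr(value):
    """Bit scan reverse: index of the most significant set bit."""
    if value == 0:
        raise ValueError("destination is undefined when the source is zero")
    return value.bit_length() - 1

def bt(value, index):
    """Bit test: returns the selected bit (the value copied to the carry flag)."""
    return (value >> index) & 1

def bts(value, index):
    """Bit test and set: returns (old_bit, new_value)."""
    return bt(value, index), value | (1 << index)

def btr(value, index):
    """Bit test and reset: returns (old_bit, new_value)."""
    return bt(value, index), value & ~(1 << index)

def btc(value, index):
    """Bit test and complement: returns (old_bit, new_value)."""
    return bt(value, index), value ^ (1 << index)
```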
Control Transfer Instructions
The control transfer instructions control the flow of program
execution.
Table 3–7 Control Transfer Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
bound{wl}
|
BOUND
|
detect value out of range
|
boundw invalid under -xarch=amd64
|
|
call
|
CALL
|
call procedure
|
|
|
enter
|
ENTER
|
high-level procedure entry
|
|
|
int
|
INT
|
software interrupt
|
|
|
into
|
INTO
|
interrupt on overflow
|
invalid under -xarch=amd64
|
|
iret
|
IRET
|
return from interrupt
|
|
|
ja
|
JA
|
jump if above
|
|
|
jae
|
JAE
|
jump if above or equal
|
|
|
jb
|
JB
|
jump if below
|
|
|
jbe
|
JBE
|
jump if below or equal
|
|
|
jc
|
JC
|
jump if carry
|
|
|
jcxz
|
JCXZ
|
jump if register %cx is zero
|
|
|
je
|
JE
|
jump if equal
|
|
|
jecxz
|
JECXZ
|
jump if register %ecx is zero
|
invalid under -xarch=amd64
|
|
jg
|
JG
|
jump if greater
|
|
|
jge
|
JGE
|
jump if greater or equal
|
|
|
jl
|
JL
|
jump if less
|
|
|
jle
|
JLE
|
jump if less or equal
|
|
|
jmp
|
JMP
|
jump
|
|
|
jnae
|
JNAE
|
jump if not above or equal
|
|
|
jnb
|
JNB
|
jump if not below
|
|
|
jnbe
|
JNBE
|
jump if not below or equal
|
|
|
jnc
|
JNC
|
jump if not carry
|
|
|
jne
|
JNE
|
jump if not equal
|
|
|
jng
|
JNG
|
jump if not greater
|
|
|
jnge
|
JNGE
|
jump if not greater or equal
|
|
|
jnl
|
JNL
|
jump if not less
|
|
|
jnle
|
JNLE
|
jump if not less or equal
|
|
|
jno
|
JNO
|
jump if not overflow
|
|
|
jnp
|
JNP
|
jump if not parity
|
|
|
jns
|
JNS
|
jump if not sign (non-negative)
|
|
|
jnz
|
JNZ
|
jump if not zero
|
|
|
jo
|
JO
|
jump if overflow
|
|
|
jp
|
JP
|
jump if parity
|
|
|
jpe
|
JPE
|
jump if parity even
|
|
|
jpo
|
JPO
|
jump if parity odd
|
|
|
js
|
JS
|
jump if sign (negative)
|
|
|
jz
|
JZ
|
jump if zero
|
|
|
lcall
|
CALL
|
call far procedure
|
valid as indirect only for -xarch=amd64
|
|
leave
|
LEAVE
|
high-level procedure exit
|
|
|
loop
|
LOOP
|
loop with %ecx counter
|
|
|
loope
|
LOOPE
|
loop with %ecx and equal
|
|
|
loopne
|
LOOPNE
|
loop with %ecx and not equal
|
|
|
loopnz
|
LOOPNZ
|
loop with %ecx and not zero
|
|
|
loopz
|
LOOPZ
|
loop with %ecx and zero
|
|
|
lret
|
RET
|
return from far procedure
|
valid as indirect only for -xarch=amd64
|
|
ret
|
RET
|
return
|
|
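The condition suffixes used by the conditional jumps (and by the cmov and set families in the earlier tables) test the status flags. "Above" and "below" refer to unsigned comparisons, "greater" and "less" to signed ones. The Python sketch below models the flags left by a compare and a subset of the conditions, with synonyms mapped onto one canonical name. The names cmp_flags and taken are hypothetical.

```python
def cmp_flags(dst, src, bits=32):
    """Flags left by `cmp src, dst`, which computes dst - src and discards the result."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    dst &= mask
    src &= mask
    result = (dst - src) & mask
    return {
        "CF": int(dst < src),                                  # unsigned borrow
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        "OF": int(bool((dst ^ src) & (dst ^ result) & sign)),  # signed overflow
    }

CONDITIONS = {
    "a":  lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "ae": lambda f: f["CF"] == 0,
    "b":  lambda f: f["CF"] == 1,
    "be": lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "e":  lambda f: f["ZF"] == 1,
    "ne": lambda f: f["ZF"] == 0,
    "g":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "ge": lambda f: f["SF"] == f["OF"],
    "l":  lambda f: f["SF"] != f["OF"],
    "le": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
}

# Alternative spellings that test exactly the same flags.
SYNONYMS = {
    "nbe": "a", "nb": "ae", "nc": "ae", "c": "b", "nae": "b", "na": "be",
    "z": "e", "nz": "ne", "nle": "g", "nl": "ge", "nge": "l", "ng": "le",
}

def taken(condition, flags):
    """Return True if a jump with this condition suffix would be taken."""
    return bool(CONDITIONS[SYNONYMS.get(condition, condition)](flags))
```

Comparing 0xFFFFFFFF with 1 shows the distinction: the "above" condition holds because the unsigned value is larger, while the "less" condition also holds because the same bit pattern is -1 when read as signed.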
String Instructions
The string
instructions operate on strings of bytes. Operations include storing strings in memory,
loading strings from memory, comparing strings, and scanning strings for substrings.
Note –
The Solaris mnemonics for certain instructions differ slightly from the
Intel/AMD mnemonics. Alphabetization of the table below is by the Solaris mnemonic.
All string operations default to long (doubleword).
Table 3–8 String Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cmps{q}
|
CMPS
|
compare string
|
cmpsq valid only under -xarch=amd64
|
|
cmpsb
|
CMPSB
|
compare byte string
|
|
|
cmpsl
|
CMPSD
|
compare doubleword string
|
|
|
cmpsw
|
CMPSW
|
compare word string
|
|
|
lods{q}
|
LODS
|
load string
|
lodsq valid only under -xarch=amd64
|
|
lodsb
|
LODSB
|
load byte string
|
|
|
lodsl
|
LODSD
|
load doubleword string
|
|
|
lodsw
|
LODSW
|
load word string
|
|
|
movs{q}
|
MOVS
|
move string
|
movsq valid only under -xarch=amd64
|
|
movsb
|
MOVSB
|
move byte string
|
movsb is not movsb{wlq}. See Table 3–1
|
|
movsl, smovl
|
MOVSD
|
move doubleword string
|
|
|
movsw, smovw
|
MOVSW
|
move word string
|
movsw is not movsw{lq}. See Table 3–1
|
|
rep
|
REP
|
repeat while %ecx not zero
|
|
|
repnz
|
REPNE
|
repeat while not equal
|
|
|
repnz
|
REPNZ
|
repeat while not zero
|
|
|
repz
|
REPE
|
repeat while equal
|
|
|
repz
|
REPZ
|
repeat while zero
|
|
|
scas{q}
|
SCAS
|
scan string
|
scasq valid only under -xarch=amd64
|
|
scasb
|
SCASB
|
scan byte string
|
|
|
scasl
|
SCASD
|
scan doubleword string
|
|
|
scasw
|
SCASW
|
scan word string
|
|
|
stos{q}
|
STOS
|
store string
|
stosq valid only under -xarch=amd64
|
|
stosb
|
STOSB
|
store byte string
|
|
|
stosl
|
STOSD
|
store doubleword string
|
|
|
stosw
|
STOSW
|
store word string
|
|
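A string instruction combined with the rep prefix behaves like a counted loop over memory. The Python sketch below models rep with movsb, movsw, or movsl on a bytearray, assuming a flat address space in which the source index, destination index, and count stand in for %esi, %edi, and %ecx. The function name rep_movs and its parameters are hypothetical.

```python
def rep_movs(memory, src, dst, count, width=1, direction_flag=0):
    """Copy `count` elements of `width` bytes; returns the final (src, dst, count)."""
    step = -width if direction_flag else width    # std reverses the direction
    while count != 0:
        memory[dst:dst + width] = memory[src:src + width]
        src += step
        dst += step
        count -= 1
    return src, dst, count

buffer = bytearray(b"abcd____")
print(rep_movs(buffer, src=0, dst=4, count=4), buffer)
```

Because elements are copied one at a time, an overlapping forward copy replicates a pattern: starting from b"ab______" with src=0, dst=2, and count=6, the buffer ends up as b"abababab".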
I/O Instructions
The input/output instructions transfer data between the processor's
I/O ports, registers, and memory.
Table 3–9 I/O Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
in
|
IN
|
read from a port
|
|
|
ins
|
INS
|
input string from a port
|
|
|
insb
|
INSB
|
input byte string from port
|
|
|
insl
|
INSD
|
input doubleword string from port
|
|
|
insw
|
INSW
|
input word string from port
|
|
|
out
|
OUT
|
write to a port
|
|
|
outs
|
OUTS
|
output string to port
|
|
|
outsb
|
OUTSB
|
output byte string to port
|
|
|
outsl
|
OUTSD
|
output doubleword string to port
|
|
|
outsw
|
OUTSW
|
output word string to port
|
|
Flag Control (EFLAG) Instructions
The status flag control instructions operate on the bits in
the %eflags register.
Table 3–10 Flag Control Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
clc
|
CLC
|
clear carry flag
|
|
|
cld
|
CLD
|
clear direction flag
|
|
|
cli
|
CLI
|
clear interrupt flag
|
|
|
cmc
|
CMC
|
complement carry flag
|
|
|
lahf
|
LAHF
|
load flags into %ah register
|
|
|
popfw
|
POPF
|
pop %eflags from stack
|
|
|
popf{lq}
|
POPFD, POPFQ
|
pop %eflags from stack
|
popfq valid only under -xarch=amd64
|
|
pushfw
|
PUSHF
|
push %eflags onto stack
|
|
|
pushf{lq}
|
PUSHFD, PUSHFQ
|
push %eflags onto stack
|
pushfq valid only under -xarch=amd64
|
|
sahf
|
SAHF
|
store %ah register into flags
|
|
|
stc
|
STC
|
set carry flag
|
|
|
std
|
STD
|
set direction flag
|
|
|
sti
|
STI
|
set interrupt flag
|
|
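The lahf and sahf instructions exchange the low byte of %eflags with %ah. The Python sketch below shows the bit layout involved (SF in bit 7, ZF in bit 6, AF in bit 4, PF in bit 2, CF in bit 0, with bit 1 always read as 1). The function signatures are hypothetical and take or return the individual flags as 0 or 1.

```python
def lahf(cf, pf, af, zf, sf):
    """Pack the five status flags into the byte that lahf loads into %ah."""
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | (1 << 1) | cf

def sahf(ah):
    """Unpack the byte in %ah into the five status flags that sahf stores."""
    return {
        "CF": ah & 1,
        "PF": (ah >> 2) & 1,
        "AF": (ah >> 4) & 1,
        "ZF": (ah >> 6) & 1,
        "SF": (ah >> 7) & 1,
    }
```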
Segment Register Instructions
The segment register instructions load far pointers (segment
addresses) into the segment registers.
Table 3–11 Segment Register Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
lds{wl}
|
LDS
|
load far pointer using %ds
|
ldsl and ldsw invalid under -xarch=amd64
|
|
les{wl}
|
LES
|
load far pointer using %es
|
lesl and lesw invalid under -xarch=amd64
|
|
lfs{wl}
|
LFS
|
load far pointer using %fs
|
|
|
lgs{wl}
|
LGS
|
load far pointer using %gs
|
|
|
lss{wl}
|
LSS
|
load far pointer using %ss
|
|
Miscellaneous Instructions
The instructions documented in this section
provide a number of useful functions.
Table 3–12 Miscellaneous Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cpuid
|
CPUID
|
processor identification
|
|
|
lea{wlq}
|
LEA
|
load effective address
|
leaq valid only under -xarch=amd64
|
|
nop
|
NOP
|
no operation
|
|
|
ud2
|
UD2
|
undefined instruction
|
|
|
xlat
|
XLAT
|
table lookup translation
|
|
|
xlatb
|
XLATB
|
table lookup translation
|
|
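Of the instructions in Table 3–12, lea is the one whose result is least obvious from its name: it computes the address described by a memory operand without accessing memory. The Python sketch below models that computation for an operand of the form disp(base,index,scale), for example leal 8(%ebx,%ecx,4), %eax. The function signature is hypothetical.

```python
def lea(disp=0, base=0, index=0, scale=1, bits=32):
    """Return disp + base + index * scale, truncated to the destination width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (disp + base + index * scale) & ((1 << bits) - 1)

# leal 8(%ebx,%ecx,4), %eax with %ebx = 0x1000 and %ecx = 3
print(hex(lea(disp=8, base=0x1000, index=3, scale=4)))
```

Because no memory access takes place, lea is also commonly used for plain arithmetic such as multiplying a register by 3, 5, or 9.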
Floating-Point Instructions
The floating-point instructions operate on floating-point, integer, and binary
coded decimal (BCD) operands.
Data Transfer Instructions (Floating-Point)
The data transfer instructions
move floating-point, integer, and BCD values between memory and the floating-point
registers.
Table 3–13 Data Transfer Instructions (Floating-Point)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
fbld
|
FBLD
|
load BCD
|
|
|
fbstp
|
FBSTP
|
store BCD and pop
|
|
|
fcmovb
|
FCMOVB
|
floating-point conditional move if below
|
|
|
fcmovbe
|
FCMOVBE
|
floating-point conditional move if below or equal
|
|
|
fcmove
|
FCMOVE
|
floating-point conditional move if equal
|
|
|
fcmovnb
|
FCMOVNB
|
floating-point conditional move if not below
|
|
|
fcmovnbe
|
FCMOVNBE
|
floating-point conditional move if not below or equal
|
|
|
fcmovne
|
FCMOVNE
|
floating-point conditional move if not equal
|
|
|
fcmovnu
|
FCMOVNU
|
floating-point conditional move if not unordered
|
|
|
fcmovu
|
FCMOVU
|
floating-point conditional move if unordered
|
|
|
fild
|
FILD
|
load integer
|
|
|
fist
|
FIST
|
store integer
|
|
|
fistp
|
FISTP
|
store integer and pop
|
|
|
fld
|
FLD
|
load floating-point value
|
|
|
fst
|
FST
|
store floating-point value
|
|
|
fstp
|
FSTP
|
store floating-point value and pop
|
|
|
fxch
|
FXCH
|
exchange registers
|
|
Basic Arithmetic Instructions (Floating-Point)
The basic arithmetic instructions
perform basic arithmetic operations on floating-point and integer operands.
Table 3–14 Basic Arithmetic Instructions (Floating-Point)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
fabs
|
FABS
|
absolute value
|
|
|
fadd
|
FADD
|
add floating-point
|
|
|
faddp
|
FADDP
|
add floating-point and pop
|
|
|
fchs
|
FCHS
|
change sign
|
|
|
fdiv
|
FDIV
|
divide floating-point
|
|
|
fdivp
|
FDIVP
|
divide floating-point and pop
|
|
|
fdivr
|
FDIVR
|
divide floating-point reverse
|
|
|
fdivrp
|
FDIVRP
|
divide floating-point reverse and pop
|
|
|
fiadd
|
FIADD
|
add integer
|
|
|
fidiv
|
FIDIV
|
divide integer
|
|
|
fidivr
|
FIDIVR
|
divide integer reverse
|
|
|
fimul
|
FIMUL
|
multiply integer
|
|
|
fisub
|
FISUB
|
subtract integer
|
|
|
fisubr
|
FISUBR
|
subtract integer reverse
|
|
|
fmul
|
FMUL
|
multiply floating-point
|
|
|
fmulp
|
FMULP
|
multiply floating-point and pop
|
|
|
fprem
|
FPREM
|
partial remainder
|
|
|
fprem1
|
FPREM1
|
IEEE partial remainder
|
|
|
frndint
|
FRNDINT
|
round to integer
|
|
|
fscale
|
FSCALE
|
scale by power of two
|
|
|
fsqrt
|
FSQRT
|
square root
|
|
|
fsub
|
FSUB
|
subtract floating-point
|
|
|
fsubp
|
FSUBP
|
subtract floating-point and pop
|
|
|
fsubr
|
FSUBR
|
subtract floating-point reverse
|
|
|
fsubrp
|
FSUBRP
|
subtract floating-point reverse and pop
|
|
|
fxtract
|
FXTRACT
|
extract exponent and significand
|
|
Comparison Instructions (Floating-Point)
The floating-point comparison
instructions operate on floating-point or integer operands.
Table 3–15 Comparison Instructions (Floating-Point)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
fcom
|
FCOM
|
compare floating-point
|
|
|
fcomi
|
FCOMI
|
compare floating-point and set %eflags
|
|
|
fcomip
|
FCOMIP
|
compare floating-point, set %eflags, and pop
|
|
|
fcomp
|
FCOMP
|
compare floating-point and pop
|
|
|
fcompp
|
FCOMPP
|
compare floating-point and pop twice
|
|
|
ficom
|
FICOM
|
compare integer
|
|
|
ficomp
|
FICOMP
|
compare integer and pop
|
|
|
ftst
|
FTST
|
test floating-point (compare with 0.0)
|
|
|
fucom
|
FUCOM
|
unordered compare floating-point
|
|
|
fucomi
|
FUCOMI
|
unordered compare floating-point and set %eflags
|
|
|
fucomip
|
FUCOMIP
|
unordered compare floating-point, set %eflags, and pop
|
|
|
fucomp
|
FUCOMP
|
unordered compare floating-point and pop
|
|
|
fucompp
|
FUCOMPP
|
unordered compare floating-point and pop twice
|
|
|
fxam
|
FXAM
|
examine floating-point
|
|
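The comparisons that set %eflags report one of four outcomes, including "unordered" when either operand is a NaN. The Python sketch below models the flag encoding that Intel and AMD document for FCOMI and FUCOMI. It is an illustration only and does not model the difference in exception behavior between the ordered and unordered forms.

```python
import math

def fucomi(st0, src):
    """Return the ZF, PF, and CF values set by comparing st0 with src."""
    if math.isnan(st0) or math.isnan(src):
        return {"ZF": 1, "PF": 1, "CF": 1}      # unordered
    if st0 > src:
        return {"ZF": 0, "PF": 0, "CF": 0}
    if st0 < src:
        return {"ZF": 0, "PF": 0, "CF": 1}
    return {"ZF": 1, "PF": 0, "CF": 0}          # equal
```

Since these are the flags tested by the unsigned conditions, a comparison of this kind is typically followed by ja, jb, or je, with jp used to detect the unordered case.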
Transcendental Instructions (Floating-Point)
The transcendental instructions perform trigonometric and logarithmic
operations on floating-point operands.
Table 3–16 Transcendental Instructions (Floating-Point)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
f2xm1
|
F2XM1
|
computes 2^x - 1
|
|
|
fcos
|
FCOS
|
cosine
|
|
|
fpatan
|
FPATAN
|
partial arctangent
|
|
|
fptan
|
FPTAN
|
partial tangent
|
|
|
fsin
|
FSIN
|
sine
|
|
|
fsincos
|
FSINCOS
|
sine and cosine
|
|
|
fyl2x
|
FYL2X
|
computes y * log2(x)
|
|
|
fyl2xp1
|
FYL2XP1
|
computes y * log2(x+1)
|
|
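The exponential and logarithmic entries in Table 3–16 correspond to the following formulas, shown here as a Python sketch in double precision. The hardware works in extended precision on the register stack, so results can differ in the last bits. The range check in f2xm1 reflects the documented source range of -1.0 to +1.0; raising an error outside it is a choice made for this example.

```python
import math

def f2xm1(x):
    """f2xm1: compute 2**x - 1 for -1.0 <= x <= 1.0."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("source operand outside the documented range")
    return 2.0 ** x - 1.0

def fyl2x(x, y):
    """fyl2x: compute y * log2(x)."""
    return y * math.log2(x)

def fyl2xp1(x, y):
    """fyl2xp1: compute y * log2(x + 1)."""
    return y * math.log2(x + 1.0)
```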
Load Constants (Floating-Point) Instructions
The load constants instructions
load common constants, such as π, into the floating-point registers.
Table 3–17 Load Constants Instructions (Floating-Point)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
fld1
|
FLD1
|
load +1.0
|
|
|
fldl2e
|
FLDL2E
|
load log2(e)
|
|
|
fldl2t
|
FLDL2T
|
load log2(10)
|
|
|
fldlg2
|
FLDLG2
|
load log10(2)
|
|
|
fldln2
|
FLDLN2
|
load loge(2)
|
|
|
fldpi
|
FLDPI
|
load π
|
|
|
fldz
|
FLDZ
|
load +0.0
|
|
Control Instructions (Floating-Point)
The floating-point control instructions operate
on the floating-point register stack and save and restore the floating-point state.
Table 3–18 Control Instructions (Floating-Point)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
fclex
|
FCLEX
|
clear floating-point exception flags after checking for error conditions
|
|
|
fdecstp
|
FDECSTP
|
decrement floating-point register stack pointer
|
|
|
ffree
|
FFREE
|
free floating-point register
|
|
|
fincstp
|
FINCSTP
|
increment floating-point register stack pointer
|
|
|
finit
|
FINIT
|
initialize floating-point unit after checking error conditions
|
|
|
fldcw
|
FLDCW
|
load floating-point unit control word
|
|
|
fldenv
|
FLDENV
|
load floating-point unit environment
|
|
|
fnclex
|
FNCLEX
|
clear floating-point exception flags without checking for error conditions
|
|
|
fninit
|
FNINIT
|
initialize floating-point unit without checking error conditions
|
|
|
fnop
|
FNOP
|
floating-point no operation
|
|
|
fnsave
|
FNSAVE
|
save floating-point unit state without checking error conditions
|
|
|
fnstcw
|
FNSTCW
|
store floating-point unit control word without checking error conditions
|
|
|
fnstenv
|
FNSTENV
|
store floating-point unit environment without checking error conditions
|
|
|
fnstsw
|
FNSTSW
|
store floating-point unit status word without checking error conditions
|
|
|
frstor
|
FRSTOR
|
restore floating-point unit state
|
|
|
fsave
|
FSAVE
|
save floating-point unit state after checking error conditions
|
|
|
fstcw
|
FSTCW
|
store floating-point unit control word after checking error conditions
|
|
|
fstenv
|
FSTENV
|
store floating-point unit environment after checking error conditions
|
|
|
fstsw
|
FSTSW
|
store floating-point unit status word after checking error conditions
|
|
|
fwait
|
FWAIT
|
wait for floating-point unit
|
|
|
wait
|
WAIT
|
wait for floating-point unit
|
|
SIMD State Management Instructions
The fxsave and fxrstor instructions save and restore the state of the floating-point unit
and the MMX, XMM, and MXCSR registers.
Table 3–19 SIMD State Management Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
fxrstor
|
FXRSTOR
|
restore floating-point unit and SIMD state
|
|
|
fxsave
|
FXSAVE
|
save floating-point unit and SIMD state
|
|
MMX Instructions
The MMX instructions enable x86 processors to perform single-instruction, multiple-data (SIMD) operations on packed byte, word, doubleword, or quadword integer operands contained in memory, in MMX registers, or in general-purpose registers.
Data Transfer Instructions (MMX)
The data transfer instructions move doubleword
and quadword operands between MMX registers and between MMX registers and memory.
Table 3–20 Data Transfer Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
movd
|
MOVD
|
move doubleword
|
movdq valid only under -xarch=amd64
|
|
movq
|
MOVQ
|
move quadword
|
valid only under -xarch=amd64
|
Conversion Instructions (MMX)
The conversion instructions pack and unpack
bytes, words, and doublewords.
Table 3–21 Conversion Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
packssdw
|
PACKSSDW
|
pack doublewords into words with signed saturation
|
|
|
packsswb
|
PACKSSWB
|
pack words into bytes with signed saturation
|
|
|
packuswb
|
PACKUSWB
|
pack words into bytes with unsigned saturation
|
|
|
punpckhbw
|
PUNPCKHBW
|
unpack high-order bytes
|
|
|
punpckhdq
|
PUNPCKHDQ
|
unpack high-order doublewords
|
|
|
punpckhwd
|
PUNPCKHWD
|
unpack high-order words
|
|
|
punpcklbw
|
PUNPCKLBW
|
unpack low-order bytes
|
|
|
punpckldq
|
PUNPCKLDQ
|
unpack low-order doublewords
|
|
|
punpcklwd
|
PUNPCKLWD
|
unpack low-order words
|
|
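Packing narrows elements with saturation, and unpacking interleaves elements from two operands. The Python sketch below models both on 64-bit MMX operands represented as Python lists, with element 0 as the least significant element. The list representation and the helper saturate are assumptions of this example. In the two-operand forms, dst is the destination operand and src the source operand.

```python
def saturate(value, low, high):
    """Clamp value to the range [low, high]."""
    return max(low, min(high, value))

def packsswb(dst_words, src_words):
    """Pack 4 + 4 signed words into 8 signed bytes with signed saturation."""
    return [saturate(w, -128, 127) for w in dst_words + src_words]

def packuswb(dst_words, src_words):
    """Pack 4 + 4 signed words into 8 unsigned bytes with unsigned saturation."""
    return [saturate(w, 0, 255) for w in dst_words + src_words]

def punpcklbw(dst_bytes, src_bytes):
    """Interleave the four low-order bytes of dst and src."""
    result = []
    for d, s in zip(dst_bytes[:4], src_bytes[:4]):
        result += [d, s]
    return result

def punpckhbw(dst_bytes, src_bytes):
    """Interleave the four high-order bytes of dst and src."""
    result = []
    for d, s in zip(dst_bytes[4:8], src_bytes[4:8]):
        result += [d, s]
    return result
```

Unpacking against an all-zero source is a common way to zero-extend bytes to words.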
Packed Arithmetic Instructions (MMX)
The packed arithmetic instructions perform packed
integer arithmetic on packed byte, word, and doubleword integers.
Table 3–22 Packed Arithmetic Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
paddb
|
PADDB
|
add packed byte integers
|
|
|
paddd
|
PADDD
|
add packed doubleword integers
|
|
|
paddsb
|
PADDSB
|
add packed signed byte integers with signed saturation
|
|
|
paddsw
|
PADDSW
|
add packed signed word integers with signed saturation
|
|
|
paddusb
|
PADDUSB
|
add packed unsigned byte integers with unsigned saturation
|
|
|
paddusw
|
PADDUSW
|
add packed unsigned word integers with unsigned saturation
|
|
|
paddw
|
PADDW
|
add packed word integers
|
|
|
pmaddwd
|
PMADDWD
|
multiply and add packed word integers
|
|
|
pmulhw
|
PMULHW
|
multiply packed signed word integers and store high result
|
|
|
pmullw
|
PMULLW
|
multiply packed signed word integers and store low result
|
|
|
psubb
|
PSUBB
|
subtract packed byte integers
|
|
|
psubd
|
PSUBD
|
subtract packed doubleword integers
|
|
|
psubsb
|
PSUBSB
|
subtract packed signed byte integers with signed saturation
|
|
|
psubsw
|
PSUBSW
|
subtract packed signed word integers with signed saturation
|
|
|
psubusb
|
PSUBUSB
|
subtract packed unsigned byte integers with unsigned saturation
|
|
|
psubusw
|
PSUBUSW
|
subtract packed unsigned word integers with unsigned saturation
|
|
|
psubw
|
PSUBW
|
subtract packed word integers
|
|
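The entries in Table 3–22 differ mainly in how they treat results that do not fit in an element: the plain forms wrap around, while the saturating forms clamp to the largest or smallest representable value. The Python sketch below models a few of them on lists of elements. The list representation is an assumption of this example, and signed forms take and return signed Python integers.

```python
def paddb(a, b):
    """Add packed bytes with wraparound (elements as unsigned 0..255)."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]

def paddusb(a, b):
    """Add packed unsigned bytes with unsigned saturation."""
    return [min(x + y, 255) for x, y in zip(a, b)]

def paddsb(a, b):
    """Add packed signed bytes (-128..127) with signed saturation."""
    return [max(-128, min(127, x + y)) for x, y in zip(a, b)]

def psubusb(a, b):
    """Subtract packed unsigned bytes with unsigned saturation."""
    return [max(x - y, 0) for x, y in zip(a, b)]

def pmulhw(a, b):
    """Multiply packed signed words; keep the high 16 bits of each product."""
    return [((x * y) >> 16) & 0xFFFF for x, y in zip(a, b)]

def pmullw(a, b):
    """Multiply packed signed words; keep the low 16 bits of each product."""
    return [(x * y) & 0xFFFF for x, y in zip(a, b)]
```

Adding the bytes 250 and 10 gives 4 with paddb but 255 with paddusb.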
Comparison Instructions (MMX)
The compare instructions compare packed bytes,
words, or doublewords.
Table 3–23 Comparison Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
pcmpeqb
|
PCMPEQB
|
compare packed bytes for equal
|
|
|
pcmpeqd
|
PCMPEQD
|
compare packed doublewords for equal
|
|
|
pcmpeqw
|
PCMPEQW
|
compare packed words for equal
|
|
|
pcmpgtb
|
PCMPGTB
|
compare packed signed byte integers for greater than
|
|
|
pcmpgtd
|
PCMPGTD
|
compare packed signed doubleword integers for greater than
|
|
|
pcmpgtw
|
PCMPGTW
|
compare packed signed word integers for greater than
|
|
Logical Instructions (MMX)
The logical instructions perform logical operations
on quadword operands.
Table 3–24 Logical Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
pand
|
PAND
|
bitwise logical AND
|
|
|
pandn
|
PANDN
|
bitwise logical AND NOT
|
|
|
por
|
POR
|
bitwise logical OR
|
|
|
pxor
|
PXOR
|
bitwise logical XOR
|
|
Shift and Rotate Instructions (MMX)
The shift and rotate instructions
operate on packed bytes, words, doublewords, or quadwords in 64–bit operands.
Table 3–25 Shift and Rotate Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
pslld
|
PSLLD
|
shift packed doublewords left logical
|
|
|
psllq
|
PSLLQ
|
shift packed quadword left logical
|
|
|
psllw
|
PSLLW
|
shift packed words left logical
|
|
|
psrad
|
PSRAD
|
shift packed doublewords right arithmetic
|
|
|
psraw
|
PSRAW
|
shift packed words right arithmetic
|
|
|
psrld
|
PSRLD
|
shift packed doublewords right logical
|
|
|
psrlq
|
PSRLQ
|
shift packed quadword right logical
|
|
|
psrlw
|
PSRLW
|
shift packed words right logical
|
|
State Management Instructions (MMX)
The emms (EMMS)
instruction clears the MMX state from the MMX registers.
Table 3–26 State Management Instructions (MMX)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
emms
|
EMMS
|
empty MMX state
|
|
SSE Instructions
SSE instructions are an extension of the SIMD execution model introduced with the MMX technology. SSE instructions are divided into four subgroups:
-
SIMD single-precision floating-point instructions that operate on the XMM registers
-
MXCSR state management instructions
-
64–bit SIMD integer instructions that operate on the MMX registers
-
Instructions that provide cache control, prefetch, and instruction ordering functionality
SIMD Single-Precision Floating-Point Instructions (SSE)
The SSE SIMD instructions operate on packed and scalar single-precision floating-point values located in the XMM registers or memory.
Data Transfer Instructions (SSE)
The SSE data transfer instructions move packed
and scalar single-precision floating-point operands between XMM registers and between
XMM registers and memory.
Table 3–27 Data Transfer Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
movaps
|
MOVAPS
|
move four aligned packed single-precision floating-point values between XMM
registers or memory
|
|
|
movhlps
|
MOVHLPS
|
move two packed single-precision floating-point values from the high quadword
of an XMM register to the low quadword of another XMM register
|
|
|
movhps
|
MOVHPS
|
move two packed single-precision floating-point values to or from the high quadword
of an XMM register or memory
|
|
|
movlhps
|
MOVLHPS
|
move two packed single-precision floating-point values from the low quadword
of an XMM register to the high quadword of another XMM register
|
|
|
movlps
|
MOVLPS
|
move two packed single-precision floating-point values to or from the low quadword
of an XMM register or memory
|
|
|
movmskps
|
MOVMSKPS
|
extract sign mask from four packed single-precision floating-point values
|
|
|
movss
|
MOVSS
|
move scalar single-precision floating-point value between XMM registers or memory
|
|
|
movups
|
MOVUPS
|
move four unaligned packed single-precision floating-point values between XMM
registers or memory
|
|
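Most entries in Table 3–27 are plain moves, but movmskps computes a value: it gathers the sign bit of each of the four single-precision elements into the low four bits of a general-purpose register. The following Python sketch models that behavior. It is illustrative only; float_bits is a hypothetical helper, and lanes[0] is assumed to be the lowest-order element of the XMM register.

```python
import struct


def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def movmskps(lanes):
    """Model of movmskps: bit i of the result is the sign bit of lanes[i]."""
    if len(lanes) != 4:
        raise ValueError("an XMM register holds four single-precision values")
    mask = 0
    for i, x in enumerate(lanes):
        mask |= (float_bits(x) >> 31) << i
    return mask
```

Because the sign bit is read directly from the encoding, negative zero contributes a 1 bit in this model even though it compares equal to positive zero.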
Packed Arithmetic Instructions (SSE)
SSE packed arithmetic instructions perform packed
and scalar arithmetic operations on packed and scalar single-precision floating-point
operands.
Table 3–28 Packed Arithmetic Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
addps
|
ADDPS
|
add packed single-precision floating-point values
|
|
|
addss
|
ADDSS
|
add scalar single-precision floating-point values
|
|
|
divps
|
DIVPS
|
divide packed single-precision floating-point values
|
|
|
divss
|
DIVSS
|
divide scalar single-precision floating-point values
|
|
|
maxps
|
MAXPS
|
return maximum packed single-precision floating-point values
|
|
|
maxss
|
MAXSS
|
return maximum scalar single-precision floating-point values
|
|
|
minps
|
MINPS
|
return minimum packed single-precision floating-point values
|
|
|
minss
|
MINSS
|
return minimum scalar single-precision floating-point values
|
|
|
mulps
|
MULPS
|
multiply packed single-precision floating-point values
|
|
|
mulss
|
MULSS
|
multiply scalar single-precision floating-point values
|
|
|
rcpps
|
RCPPS
|
compute reciprocals of packed single-precision floating-point values
|
|
|
rcpss
|
RCPSS
|
compute reciprocal of scalar single-precision floating-point values
|
|
|
rsqrtps
|
RSQRTPS
|
compute reciprocals of square roots of packed single-precision floating-point
values
|
|
|
rsqrtss
|
RSQRTSS
|
compute reciprocal of square root of scalar single-precision floating-point
values
|
|
|
sqrtps
|
SQRTPS
|
compute square roots of packed single-precision floating-point values
|
|
|
sqrtss
|
SQRTSS
|
compute square root of scalar single-precision floating-point values
|
|
|
subps
|
SUBPS
|
subtract packed single-precision floating-point values
|
|
|
subss
|
SUBSS
|
subtract scalar single-precision floating-point values
|
|
Comparison Instructions (SSE)
The SSE compare instructions compare packed
and scalar single-precision floating-point operands.
Table 3–29 Comparison Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cmpps
|
CMPPS
|
compare packed single-precision floating-point values
|
|
|
cmpss
|
CMPSS
|
compare scalar single-precision floating-point values
|
|
|
comiss
|
COMISS
|
perform ordered comparison of scalar single-precision floating-point values
and set flags in EFLAGS register
|
|
|
ucomiss
|
UCOMISS
|
perform unordered comparison of scalar single-precision floating-point values
and set flags in EFLAGS register
|
|
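comiss and ucomiss report their result in the same three EFLAGS bits; they differ in when they signal an invalid-operation exception (comiss for any NaN operand, ucomiss only for a signaling NaN). The following Python sketch models the flag settings only and ignores exceptions. The function name compare_flags is hypothetical, and a stands for the operand that the Intel documentation lists first.

```python
import math


def compare_flags(a, b):
    """Return (ZF, PF, CF) as set by comiss or ucomiss when comparing a with b."""
    if math.isnan(a) or math.isnan(b):
        return (1, 1, 1)  # unordered: at least one operand is a NaN
    if a > b:
        return (0, 0, 0)
    if a < b:
        return (0, 0, 1)
    return (1, 0, 0)      # equal
```

Since the unordered result sets all three flags, code that branches on the carry or zero flag alone treats a NaN operand as "below" or "equal" unless the parity flag is checked first.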
Logical Instructions (SSE)
The SSE logical instructions perform bitwise
AND, AND NOT, OR, and XOR operations on packed single-precision floating-point operands.
Table 3–30 Logical Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
andnps
|
ANDNPS
|
perform bitwise logical AND NOT of packed single-precision floating-point values
|
|
|
andps
|
ANDPS
|
perform bitwise logical AND of packed single-precision floating-point values
|
|
|
orps
|
ORPS
|
perform bitwise logical OR of packed single-precision floating-point values
|
|
|
xorps
|
XORPS
|
perform bitwise logical XOR of packed single-precision floating-point values
|
|
Shuffle and Unpack Instructions (SSE)
The SSE shuffle and unpack
instructions shuffle or interleave single-precision floating-point values in packed
single-precision floating-point operands.
Table 3–31 Shuffle and Unpack Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
shufps
|
SHUFPS
|
shuffles values in packed single-precision floating-point operands
|
|
|
unpckhps
|
UNPCKHPS
|
unpacks and interleaves the two high-order values from two single-precision
floating-point operands
|
|
|
unpcklps
|
UNPCKLPS
|
unpacks and interleaves the two low-order values from two single-precision floating-point
operands
|
|
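The element selection performed by the instructions in Table 3–31 can be modeled in a few lines. The following Python sketch is an illustration only: each register is represented as a list of four values with index 0 as the lowest-order element, and dest and src follow the Intel documentation's naming of the destination and source operands.

```python
def shufps(dest, src, imm8):
    """Model of shufps: the low two results come from dest, the high two from src.

    Each 2-bit field of imm8 selects one of the four elements of its operand.
    """
    return [
        dest[imm8 & 3],
        dest[(imm8 >> 2) & 3],
        src[(imm8 >> 4) & 3],
        src[(imm8 >> 6) & 3],
    ]


def unpcklps(dest, src):
    """Model of unpcklps: interleave the two low-order elements of each operand."""
    return [dest[0], src[0], dest[1], src[1]]


def unpckhps(dest, src):
    """Model of unpckhps: interleave the two high-order elements of each operand."""
    return [dest[2], src[2], dest[3], src[3]]
```

In this model, passing the same register as both operands with an immediate of 0x1B reverses the four elements.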
Conversion Instructions (SSE)
The SSE conversion instructions convert packed
and individual doubleword integers into packed and scalar single-precision floating-point
values (and vice versa).
Table 3–32 Conversion Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cvtpi2ps
|
CVTPI2PS
|
convert packed doubleword integers to packed single-precision floating-point
values
|
|
|
cvtps2pi
|
CVTPS2PI
|
convert packed single-precision floating-point values to packed doubleword integers
|
|
|
cvtsi2ss
|
CVTSI2SS
|
convert doubleword integer to scalar single-precision floating-point value
|
|
|
cvtss2si
|
CVTSS2SI
|
convert scalar single-precision floating-point value to a doubleword integer
|
|
|
cvttps2pi
|
CVTTPS2PI
|
convert with truncation packed single-precision floating-point values to packed
doubleword integers
|
|
|
cvttss2si
|
CVTTSS2SI
|
convert with truncation scalar single-precision floating-point value to scalar
doubleword integer
|
|
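The "with truncation" variants in Table 3–32 always round toward zero, while the plain variants round according to the rounding mode in the MXCSR register. The following Python sketch contrasts cvtss2si and cvttss2si under the assumption that MXCSR holds its default round-to-nearest-even mode and that exceptions are masked. It is an illustrative model; the names INTEGER_INDEFINITE and _to_int32 are hypothetical.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
INTEGER_INDEFINITE = -2**31  # bit pattern 0x80000000, returned on invalid input


def _to_int32(value):
    """Return value if it fits in a signed doubleword, else the indefinite value."""
    if INT32_MIN <= value <= INT32_MAX:
        return value
    return INTEGER_INDEFINITE


def cvtss2si(x):
    """Model of cvtss2si, assuming the default round-to-nearest-even mode."""
    if math.isnan(x) or math.isinf(x):
        return INTEGER_INDEFINITE
    return _to_int32(round(x))  # Python's round() rounds halfway cases to even


def cvttss2si(x):
    """Model of cvttss2si: always truncate toward zero."""
    if math.isnan(x) or math.isinf(x):
        return INTEGER_INDEFINITE
    return _to_int32(int(x))    # int() discards the fractional part
```

The two models disagree whenever the fractional part is nonzero and rounding would move away from zero, for example for -2.7.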
MXCSR State Management Instructions (SSE)
The MXCSR state management
instructions save and restore the state of the MXCSR control and status register.
Table 3–33 MXCSR State Management Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
ldmxcsr
|
LDMXCSR
|
load %mxcsr register
|
|
|
stmxcsr
|
STMXCSR
|
save %mxcsr register state
|
|
64–Bit SIMD Integer Instructions (SSE)
The SSE 64–bit SIMD
integer instructions perform operations on packed bytes, words, or doublewords in
MMX registers.
Table 3–34 64–Bit SIMD Integer Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
pavgb
|
PAVGB
|
compute average of packed unsigned byte integers
|
|
|
pavgw
|
PAVGW
|
compute average of packed unsigned word integers
|
|
|
pextrw
|
PEXTRW
|
extract word
|
|
|
pinsrw
|
PINSRW
|
insert word
|
|
|
pmaxsw
|
PMAXSW
|
maximum of packed signed word integers
|
|
|
pmaxub
|
PMAXUB
|
maximum of packed unsigned byte integers
|
|
|
pminsw
|
PMINSW
|
minimum of packed signed word integers
|
|
|
pminub
|
PMINUB
|
minimum of packed unsigned byte integers
|
|
|
pmovmskb
|
PMOVMSKB
|
move byte mask
|
|
|
pmulhuw
|
PMULHUW
|
multiply packed unsigned integers and store high result
|
|
|
psadbw
|
PSADBW
|
compute sum of absolute differences
|
|
|
pshufw
|
PSHUFW
|
shuffle packed integer word in MMX register
|
|
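Several entries in Table 3–34 have terse descriptions that hide a detail of the arithmetic. The following Python sketch models three of them on lists of unsigned lane values: pavgb rounds its average up, psadbw reduces eight byte differences to a single sum, and pmulhuw keeps only the upper 16 bits of each 32-bit product. It is an illustration only, with lists standing in for MMX registers.

```python
def pavgb(a, b):
    """Model of pavgb: rounded average of eight unsigned bytes, (x + y + 1) >> 1."""
    return [(x + y + 1) >> 1 for x, y in zip(a, b)]


def psadbw(a, b):
    """Model of psadbw: sum of absolute differences of eight unsigned bytes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pmulhuw(a, b):
    """Model of pmulhuw: high 16 bits of each unsigned 16-bit by 16-bit product."""
    return [(x * y) >> 16 for x, y in zip(a, b)]
```

pavgw follows the same rounding rule as pavgb but operates on four unsigned words instead of eight unsigned bytes.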
Miscellaneous Instructions (SSE)
The following instructions control caching,
prefetching, and instruction ordering.
Table 3–35 Miscellaneous Instructions (SSE)
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
maskmovq
|
MASKMOVQ
|
non-temporal store of selected bytes from an MMX register into memory
|
|
|
movntps
|
MOVNTPS
|
non-temporal store of four packed single-precision floating-point values from
an XMM register into memory
|
|
|
movntq
|
MOVNTQ
|
non-temporal store of quadword from an MMX register into memory
|
|
|
prefetchnta
|
PREFETCHNTA
|
prefetch data into non-temporal cache structure and into a location close to
the processor
|
|
|
prefetcht0
|
PREFETCHT0
|
prefetch data into all levels of the cache hierarchy
|
|
|
prefetcht1
|
PREFETCHT1
|
prefetch data into level 2 cache and higher
|
|
|
prefetcht2
|
PREFETCHT2
|
prefetch data into level 2 cache and higher
|
|
|
sfence
|
SFENCE
|
serialize store operations
|
|
SSE2 Instructions
SSE2 instructions are an extension of the SIMD execution model introduced with the MMX technology and the SSE extensions. SSE2 instructions are divided into four subgroups:
-
Packed and scalar double-precision floating-point instructions
-
Packed single-precision floating-point conversion instructions
-
128–bit SIMD integer instructions
-
Instructions that provide cache control and instruction ordering functionality
SSE2 Packed and Scalar Double-Precision Floating-Point Instructions
The SSE2 packed and scalar double-precision floating-point instructions operate on double-precision floating-point operands.
SSE2 Data Movement Instructions
The SSE2 data movement instructions
move double-precision floating-point data between XMM registers and memory.
Table 3–36 SSE2 Data Movement Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
movapd
|
MOVAPD
|
move two aligned packed double-precision floating-point values between XMM registers
and memory
|
|
|
movhpd
|
MOVHPD
|
move high packed double-precision floating-point value to or from the high quadword
of an XMM register and memory
|
|
|
movlpd
|
MOVLPD
|
move low packed double-precision floating-point value to or from the low quadword
of an XMM register and memory
|
|
|
movmskpd
|
MOVMSKPD
|
extract sign mask from two packed double-precision floating-point values
|
|
|
movsd
|
MOVSD
|
move scalar double-precision floating-point value between XMM registers and
memory
|
|
|
movupd
|
MOVUPD
|
move two unaligned packed double-precision floating-point values between XMM
registers and memory
|
|
SSE2 Packed Arithmetic Instructions
The SSE2 arithmetic instructions operate on packed
and scalar double-precision floating-point operands.
Table 3–37 SSE2 Packed Arithmetic Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
addpd
|
ADDPD
|
add packed double-precision floating-point values
|
|
|
addsd
|
ADDSD
|
add scalar double-precision floating-point values
|
|
|
divpd
|
DIVPD
|
divide packed double-precision floating-point values
|
|
|
divsd
|
DIVSD
|
divide scalar double-precision floating-point values
|
|
|
maxpd
|
MAXPD
|
return maximum packed double-precision floating-point values
|
|
|
maxsd
|
MAXSD
|
return maximum scalar double-precision floating-point value
|
|
|
minpd
|
MINPD
|
return minimum packed double-precision floating-point values
|
|
|
minsd
|
MINSD
|
return minimum scalar double-precision floating-point value
|
|
|
mulpd
|
MULPD
|
multiply packed double-precision floating-point values
|
|
|
mulsd
|
MULSD
|
multiply scalar double-precision floating-point values
|
|
|
sqrtpd
|
SQRTPD
|
compute packed square roots of packed double-precision floating-point values
|
|
|
sqrtsd
|
SQRTSD
|
compute scalar square root of scalar double-precision floating-point value
|
|
|
subpd
|
SUBPD
|
subtract packed double-precision floating-point values
|
|
|
subsd
|
SUBSD
|
subtract scalar double-precision floating-point values
|
|
SSE2 Logical Instructions
The SSE2 logical instructions operate on packed
double-precision floating-point values.
Table 3–38 SSE2 Logical Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
andnpd
|
ANDNPD
|
perform bitwise logical AND NOT of packed double-precision floating-point values
|
|
|
andpd
|
ANDPD
|
perform bitwise logical AND of packed double-precision floating-point values
|
|
|
orpd
|
ORPD
|
perform bitwise logical OR of packed double-precision floating-point values
|
|
|
xorpd
|
XORPD
|
perform bitwise logical XOR of packed double-precision floating-point values
|
|
SSE2 Compare Instructions
The SSE2 compare instructions compare packed
and scalar double-precision floating-point values and return the results of the comparison
to either the destination operand or to the EFLAGS register.
Table 3–39 SSE2 Compare Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cmppd
|
CMPPD
|
compare packed double-precision floating-point values
|
|
|
cmpsd
|
CMPSD
|
compare scalar double-precision floating-point values
|
|
|
comisd
|
COMISD
|
perform ordered comparison of scalar double-precision floating-point values
and set flags in EFLAGS register
|
|
|
ucomisd
|
UCOMISD
|
perform unordered comparison of scalar double-precision floating-point values
and set flags in EFLAGS register
|
|
SSE2 Shuffle and Unpack Instructions
The SSE2 shuffle and unpack
instructions operate on packed double-precision floating-point operands.
Table 3–40 SSE2 Shuffle and Unpack Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
shufpd
|
SHUFPD
|
shuffle values in packed double-precision floating-point operands
|
|
|
unpckhpd
|
UNPCKHPD
|
unpack and interleave the high values from two packed double-precision floating-point
operands
|
|
|
unpcklpd
|
UNPCKLPD
|
unpack and interleave the low values from two packed double-precision floating-point
operands
|
|
SSE2 Conversion Instructions
The SSE2 conversion instructions convert
packed and individual doubleword integers into packed and scalar double-precision
floating-point values (and vice versa). These instructions also convert between packed
and scalar single-precision and double-precision floating-point values.
Table 3–41 SSE2 Conversion Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cvtdq2pd
|
CVTDQ2PD
|
convert packed doubleword integers to packed double-precision floating-point
values
|
|
|
cvtpd2dq
|
CVTPD2DQ
|
convert packed double-precision floating-point values to packed doubleword integers
|
|
|
cvtpd2pi
|
CVTPD2PI
|
convert packed double-precision floating-point values to packed doubleword integers
|
|
|
cvtpd2ps
|
CVTPD2PS
|
convert packed double-precision floating-point values to packed single-precision
floating-point values
|
|
|
cvtpi2pd
|
CVTPI2PD
|
convert packed doubleword integers to packed double-precision floating-point
values
|
|
|
cvtps2pd
|
CVTPS2PD
|
convert packed single-precision floating-point values to packed double-precision
floating-point values
|
|
|
cvtsd2si
|
CVTSD2SI
|
convert scalar double-precision floating-point values to a doubleword integer
|
|
|
cvtsd2ss
|
CVTSD2SS
|
convert scalar double-precision floating-point values to scalar single-precision
floating-point values
|
|
|
cvtsi2sd
|
CVTSI2SD
|
convert doubleword integer to scalar double-precision floating-point value
|
|
|
cvtss2sd
|
CVTSS2SD
|
convert scalar single-precision floating-point values to scalar double-precision
floating-point values
|
|
|
cvttpd2dq
|
CVTTPD2DQ
|
convert with truncation packed double-precision floating-point values to packed
doubleword integers
|
|
|
cvttpd2pi
|
CVTTPD2PI
|
convert with truncation packed double-precision floating-point values to packed
doubleword integers
|
|
|
cvttsd2si
|
CVTTSD2SI
|
convert with truncation scalar double-precision floating-point values to scalar
doubleword integers
|
|
SSE2 Packed Single-Precision Floating-Point Instructions
The SSE2 packed single-precision
floating-point instructions operate on single-precision floating-point and integer
operands.
Table 3–42 SSE2 Packed Single-Precision Floating-Point Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
cvtdq2ps
|
CVTDQ2PS
|
convert packed doubleword integers to packed single-precision floating-point
values
|
|
|
cvtps2dq
|
CVTPS2DQ
|
convert packed single-precision floating-point values to packed doubleword integers
|
|
|
cvttps2dq
|
CVTTPS2DQ
|
convert with truncation packed single-precision floating-point values to packed
doubleword integers
|
|
SSE2 128–Bit SIMD Integer Instructions
The SSE2 SIMD integer
instructions operate on packed words, doublewords, and quadwords contained in XMM
and MMX registers.
Table 3–43 SSE2 128–Bit SIMD Integer Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
movdq2q
|
MOVDQ2Q
|
move quadword integer from XMM to MMX registers
|
|
|
movdqa
|
MOVDQA
|
move aligned double quadword
|
|
|
movdqu
|
MOVDQU
|
move unaligned double quadword
|
|
|
movq2dq
|
MOVQ2DQ
|
move quadword integer from MMX to XMM registers
|
|
|
paddq
|
PADDQ
|
add packed quadword integers
|
|
|
pmuludq
|
PMULUDQ
|
multiply packed unsigned doubleword integers
|
|
|
pshufd
|
PSHUFD
|
shuffle packed doublewords
|
|
|
pshufhw
|
PSHUFHW
|
shuffle packed high words
|
|
|
pshuflw
|
PSHUFLW
|
shuffle packed low words
|
|
|
pslldq
|
PSLLDQ
|
shift double quadword left logical
|
|
|
psrldq
|
PSRLDQ
|
shift double quadword right logical
|
|
|
psubq
|
PSUBQ
|
subtract packed quadword integers
|
|
|
punpckhqdq
|
PUNPCKHQDQ
|
unpack high quadwords
|
|
|
punpcklqdq
|
PUNPCKLQDQ
|
unpack low quadwords
|
|
SSE2 Miscellaneous Instructions
The SSE2 instructions described below provide
additional functionality for caching non-temporal data when storing data from XMM
registers to memory, and provide additional control of instruction ordering on store
operations.
Table 3–44 SSE2 Miscellaneous Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
clflush
|
CLFLUSH
|
flushes and invalidates a memory operand and its associated cache line from
all levels of the processor's cache hierarchy
|
|
|
lfence
|
LFENCE
|
serializes load operations
|
|
|
maskmovdqu
|
MASKMOVDQU
|
non-temporal store of selected bytes from an XMM register into memory
|
|
|
mfence
|
MFENCE
|
serializes load and store operations
|
|
|
movntdq
|
MOVNTDQ
|
non-temporal store of double quadword from an XMM register into memory
|
|
|
movnti
|
MOVNTI
|
non-temporal store of a doubleword from a general-purpose register into memory
|
movntiq valid only under -xarch=amd64
|
|
movntpd
|
MOVNTPD
|
non-temporal store of two packed double-precision floating-point values from
an XMM register into memory
|
|
|
pause
|
PAUSE
|
improves the performance of spin-wait loops
|
|
Operating System Support Instructions
The operating system support instructions provide
functionality for process management, performance monitoring, debugging, and other
systems tasks.
Table 3–45 Operating System Support Instructions
|
Solaris Mnemonic
|
Intel/AMD Mnemonic
|
Description
|
Notes
|
|
arpl
|
ARPL
|
adjust requested privilege level
|
|
|
clts
|
CLTS
|
clear the task-switched flag
|
|
|
hlt
|
HLT
|
halt processor
|
|
|
invd
|
INVD
|
invalidate cache, no writeback
|
|
|
invlpg
|
INVLPG
|
invalidate TLB entry
|
|
|
lar
|
LAR
|
load access rights
|
larq valid only under -xarch=amd64
|
|
lgdt
|
LGDT
|
load global descriptor table (GDT) register
|
|
|
lidt
|
LIDT
|
load interrupt descriptor table (IDT) register
|
|
|
lldt
|
LLDT
|
load local descriptor table (LDT) register
|
|
|
lmsw
|
LMSW
|
load machine status word
|
|
|
lock
|
LOCK
|
lock bus
|
|
|
lsl
|
LSL
|
load segment limit
|
lslq valid only under -xarch=amd64
|
|
ltr
|
LTR
|
load task register
|
|
|
rdmsr
|
RDMSR
|
read model-specific register
|
|
|
rdpmc
|
RDPMC
|
read performance monitoring counters
|
|
|
rdtsc
|
RDTSC
|
read time stamp counter
|
|
|
rsm
|
RSM
|
return from system management mode (SMM)
|
|
|
sgdt
|
SGDT
|
store global descriptor table (GDT) register
|
|
|
sidt
|
SIDT
|
store interrupt descriptor table (IDT) register
|
|
|
sldt
|
SLDT
|
store local descriptor table (LDT) register
|
sldtq valid only under -xarch=amd64
|
|
smsw
|
SMSW
|
store machine status word
|
smswq valid only under -xarch=amd64
|
|
str
|
STR
|
store task register
|
strq valid only under -xarch=amd64
|
|
sysenter
|
SYSENTER
|
fast system call, transfers to a flat protected mode kernel at CPL=0
|
|
|
sysexit
|
SYSEXIT
|
fast return from system call, transfers to flat protected mode user code at CPL=3
|
|
|
verr
|
VERR
|
verify segment for reading
|
|
|
verw
|
VERW
|
verify segment for writing
|
|
|
wbinvd
|
WBINVD
|
invalidate cache, with writeback
|
|
|
wrmsr
|
WRMSR
|
write model-specific register
|
|
64–Bit AMD Opteron Considerations
To assemble code for the AMD Opteron CPU, invoke the assembler with the -xarch=amd64 command line option. See the as(1) man page for additional information.
The following Solaris mnemonics are only valid when the -xarch=amd64 command line option is specified:
-
adcq
-
addq
-
andq
-
bsfq
-
bsrq
-
bswapq
-
btcq
-
btq
-
btrq
-
btsq
-
cltq
-
cmovaeq
-
cmovaq
-
cmovbeq
-
cmovbq
-
cmovcq
-
cmoveq
-
cmovgeq
-
cmovgq
-
cmovleq
-
cmovlq
-
cmovnaeq
-
cmovnaq
-
cmovnbeq
-
cmovnbq
-
cmovncq
-
cmovneq
-
cmovngeq
-
cmovngq
-
cmovnleq
-
cmovnlq
-
cmovnoq
-
cmovnpq
-
cmovnsq
-
cmovnzq
-
cmovoq
-
cmovpeq
-
cmovpoq
-
cmovpq
-
cmovsq
-
cmovzq
-
cmpq
-
cmpsq
-
cmpxchgq
-
cqtd
-
cqto
-
decq
-
divq
-
idivq
-
imulq
-
incq
-
larq
-
leaq
-
lodsq
-
lslq
-
movabs
-
movdq
-
movntiq
-
movq
-
movsq
-
movswq
-
movzwq
-
mulq
-
negq
-
notq
-
orq
-
popfq
-
popq
-
pushfq
-
pushq
-
rclq
-
rcrq
-
rolq
-
rorq
-
salq
-
sarq
-
sbbq
-
scasq
-
shldq
-
shlq
-
shrdq
-
shrq
-
sldtq
-
smswq
-
stosq
-
strq
-
subq
-
testq
-
xaddq
-
xchgq
-
xchgqA
-
xorq
The following Solaris mnemonics are not valid when the -xarch=amd64 command line option is specified:
-
aaa
-
aad
-
aam
-
aas
-
boundw
-
daa
-
das
-
into
-
jecxz
-
ldsw
-
lesw
-
popa
-
popaw
-
pusha
-
pushaw