|
Appendix E SPARC-V9 Instruction Set
This appendix describes the changes made to the SPARC instruction
set for the SPARC-V9 architecture. Application software for the 32-bit
SPARC-V8 (Version 8) architecture can execute, unchanged, on SPARC-V9 systems.
This appendix is organized into the following sections:
E.1 SPARC-V9 Changes
The SPARC-V9 architecture differs from SPARC-V8 architecture in the following
areas, expanded below: registers, alternate space access, byte order, and
instruction set.
E.1.1 Registers
These registers have been deleted:
Table E–1
|
PSR
|
Processor State Register
|
|
TBR
|
Trap Base Register
|
|
WIM
|
Window Invalid Mask
|
These registers have been widened from 32 to 64 bits:
Table E–2
|
Integer registers
|
|
|
All state registers
|
FSR, PC, nPC, and Y
|
Note –
The Floating-Point State Register (FSR) is widened to 64 bits, and
three floating-point condition code fields (fcc1, fcc2, and fcc3) are
added.
These SPARC-V9 registers are within a SPARC-V8 register field:
Table E–3
|
CCR
|
Condition Codes Register
|
|
CWP
|
Current Window Pointer
|
|
PIL
|
Processor Interrupt Level
|
|
TBA
|
Trap Base Address
|
|
TT[MAXTL]
|
Trap Type
|
|
VER
|
Version
|
These registers have been added:
Table E–4
|
ASI
|
Address Space Identifier
|
|
CANRESTORE
|
Restorable Windows
|
|
CANSAVE
|
Savable windows
|
|
CLEANWIN
|
Clean Windows
|
|
FPRS
|
Floating-point Register State
|
|
OTHERWIN
|
Other Windows
|
|
PSTATE
|
Processor State
|
|
TICK
|
Hardware clock tick-counter
|
|
TL
|
Trap Level
|
|
TNPC[MAXTL]
|
Trap Next Program Counter
|
|
TPC[MAXTL]
|
Trap Program Counter
|
|
TSTATE[MAXTL]
|
Trap State
|
|
WSTATE
|
Windows State
|
Also, there are sixteen additional double-precision floating-point registers,
f[32] .. f[62]. These registers overlap (and are aliased with) eight additional
quad-precision floating-point registers, f[32] .. f[60].
The SPARC-V9 CWP register is decremented during a RESTORE instruction
and incremented during a SAVE instruction. This is the opposite of PSR.CWP's
behavior in SPARC-V8. This change has no effect on nonprivileged instructions.
E.1.2 Alternate Space Access
Load- and store-alternate instructions
to one-half of the alternate spaces can now be included in user code. In SPARC-V9,
loads and stores to ASIs 0x00 .. 0x7F
are privileged; those to ASIs 0x80 .. 0xFF are nonprivileged. In SPARC-V8, all access to alternate address
spaces is privileged.
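The privilege split described above can be sketched as a simple range check. This is an illustrative Python model only; the function name `asi_is_privileged` is hypothetical and not part of any assembler or hardware interface.

```python
def asi_is_privileged(asi: int) -> bool:
    """Model of the SPARC-V9 rule: ASIs 0x00..0x7F are privileged,
    ASIs 0x80..0xFF are nonprivileged."""
    if not 0x00 <= asi <= 0xFF:
        raise ValueError("an ASI is an 8-bit value")
    return asi <= 0x7F


if __name__ == "__main__":
    for asi in (0x00, 0x7F, 0x80, 0xFF):
        kind = "privileged" if asi_is_privileged(asi) else "nonprivileged"
        print(f"ASI 0x{asi:02X}: {kind}")
```

Under SPARC-V8 the equivalent check would report every ASI as privileged.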
E.1.3 Byte Order
SPARC-V9 supports both little- and big-endian
byte orders for data accesses only; instruction accesses are always performed
using big-endian byte order. In SPARC-V8, all data and instruction accesses
are performed in big-endian byte order.
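To illustrate the two data byte orders, the following Python sketch shows how the same 64-bit extended word would be laid out in memory under each order. The helper names `store_extended` and `load_extended` are hypothetical; they model memory layout only, not the instructions themselves.

```python
def store_extended(value: int, little_endian: bool = False) -> bytes:
    """Return the 8 bytes a 64-bit store would write, lowest address first."""
    order = "little" if little_endian else "big"
    return (value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, order)


def load_extended(raw: bytes, little_endian: bool = False) -> int:
    """Reassemble a 64-bit value from 8 bytes of memory."""
    order = "little" if little_endian else "big"
    return int.from_bytes(raw, order)


if __name__ == "__main__":
    word = 0x0102030405060708
    print(store_extended(word).hex())                      # big-endian layout
    print(store_extended(word, little_endian=True).hex())  # little-endian layout
```

Reading data back with the wrong byte order yields a byte-reversed value, which is why the order must be selected consistently for a given data access.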
E.2 SPARC-V9 Instruction Set Changes
Application software
written for the SPARC-V8 processor runs unchanged on a SPARC-V9 processor.
E.2.1 Extended Instruction Definitions to Support the 64-bit Model
Table E–5
|
FCMP, FCMPE
|
Floating-Point Compare—can set any of
the four floating-point condition codes.
|
|
LDFSR, STFSR
|
Load/Store FSR: affect only the low-order 32 bits
of the FSR
|
|
LDUW, LDUWA
|
Same as LD, LDA in SPARC-V8
|
|
RDASR/WRASR
|
Read/Write State Registers - access additional
registers
|
|
SAVE/RESTORE
|
|
|
SETHI
|
|
|
SRA, SRL, SLL (shifts)
|
Split into 32-bit and 64-bit versions
|
|
Tcc
|
(was Ticc) Operates with either the 32-bit integer condition
codes (icc), or the 64-bit integer condition codes (xcc)
|
All other arithmetic operations operate on 64-bit operands and produce
64-bit results.
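Because arithmetic produces 64-bit results while both the 32-bit (icc) and 64-bit (xcc) condition codes are maintained, the two sets of flags can disagree for the same operation. The Python sketch below models this for an add that sets condition codes. It is an illustration under the usual two's-complement definitions of N, Z, V, and C; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def _flags(a: int, b: int, result: int, bits: int) -> dict:
    """Compute N, Z, V, C for an addition, viewed at the given width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a, b, r = a & mask, b & mask, result & mask
    return {
        "N": bool(r & sign),                      # result negative at this width
        "Z": r == 0,                              # result zero at this width
        "V": bool(~(a ^ b) & (a ^ r) & sign),     # signed overflow
        "C": (a + b) > mask,                      # carry out of the top bit
    }


def add_with_cc(a: int, b: int):
    """Return (64-bit result, icc flags, xcc flags) for a + b."""
    result = (a + b) & MASK64
    return result, _flags(a, b, result, 32), _flags(a, b, result, 64)


if __name__ == "__main__":
    result, icc, xcc = add_with_cc(0xFFFFFFFF, 1)
    print(hex(result), "icc:", icc, "xcc:", xcc)
```

A branch or trap that tests %icc sees the 32-bit view of the result; one that tests %xcc sees the 64-bit view.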
E.2.2 Added Instructions to Support 64 bits
Table E–6
|
F[sdq]TOx
|
Convert floating point to 64-bit word
|
|
FxTO[sdq]
|
Convert 64-bit word to floating point
|
|
FMOV[dq]
|
Floating-Point Move, double and quad
|
|
FNEG[dq]
|
Floating-point Negate, double and quad
|
|
FABS[dq]
|
Floating-point Absolute Value, double and
quad
|
|
LDDFA, STDFA, LDFA, STFA
|
Alternate address space forms of LDDF,
STDF, LDF, and STF
|
|
LDSW
|
Load a signed word
|
|
LDSWA
|
Load a signed word from an alternate space
|
|
LDX
|
Load an extended word
|
|
LDXA
|
Load an extended word from an alternate space
|
|
LDXFSR
|
Load all 64 bits of the FSR register
|
|
STX
|
Store an extended word
|
|
STXA
|
Store an extended word into an alternate space
|
|
STXFSR
|
Store all 64 bits of the FSR register
|
E.2.3 Added Instructions to Support High-Performance System Implementation
Table E–7
|
BPcc
|
Branch on integer condition code with prediction
|
|
BPr
|
Branch on integer register contents with prediction
|
|
CASA, CASXA
|
Compare and Swap from an alternate space
|
|
FBPfcc
|
Branch on floating-point condition code with prediction
|
|
FLUSHW
|
Flush windows
|
|
FMOVcc
|
Move floating-point register if condition code is satisfied
|
|
FMOVr
|
Move floating-point register if integer register satisfies
condition
|
|
LDQF(A), STQF(A)
|
Load/Store Quad Floating-point (in
an alternate space)
|
|
MOVcc
|
Move integer register if condition code is satisfied
|
|
MOVr
|
Move integer register if register contents satisfy condition
|
|
MULX
|
Generic 64-bit multiply
|
|
POPC
|
Population count
|
|
PREFETCH, PREFETCHA
|
Prefetch Data
|
|
SDIVX, UDIVX
|
Signed and Unsigned 64-bit divide
|
E.2.4 Deleted Instructions
Table E–8
|
Coprocessor loads and stores
|
|
|
RDTBR and WRTBR
|
TBR no longer exists. It is replaced
by TBA, which can be read/written with RDPR/WRPR instructions
|
|
RDWIM and WRWIM
|
WIM no longer exists. WIM has been
replaced by several register-window registers
|
|
RDPSR and WRPSR
|
PSR no longer exists. It has been replaced
by several separate registers that are read/written with other instructions
|
|
RETT
|
Return from trap (replaced by DONE/RETRY)
|
|
STDFQ
|
Store Double from Floating-point Queue (replaced by the
RDPR FQ instruction)
|
E.2.5 Miscellaneous Instruction Changes
Table E–9
|
IMPDEPn
|
(Changed) Implementation-dependent instructions (replace
SPARC-V8 CPop instructions)
|
|
MEMBAR
|
(Added) Memory barrier (memory synchronization support)
|
E.3 SPARC-V9 Instruction Set Mapping
Table E–10
|
Opcode
|
Mnemonic
|
Argument List
|
Operation
|
Comments
|
|
BPA
|
ba{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
(Branch on cc with prediction)
Branch always
|
1
|
|
BPN
|
bn{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch never
|
0
|
|
BPNE
|
bne{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on not equal
|
not Z
|
|
BPE
|
be{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on equal
|
Z
|
|
BPG
|
bg{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on greater
|
not (Z or (N xor V))
|
|
BPLE
|
ble{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on less or equal
|
Z or (N xor V)
|
|
BPGE
|
bge{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on greater or equal
|
not (N xor V)
|
|
BPL
|
bl{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on less
|
N xor V
|
|
BPGU
|
bgu{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on greater unsigned
|
not (C or Z)
|
|
BPLEU
|
bleu{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on less or equal unsigned
|
C or Z
|
|
BPCC
|
bcc{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on carry clear (greater than or equal, unsigned)
|
not C
|
|
BPCS
|
bcs{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on carry set (less than, unsigned)
|
C
|
|
BPPOS
|
bpos{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on positive
|
not N
|
|
BPNEG
|
bneg{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on negative
|
N
|
|
BPVC
|
bvc{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on overflow clear
|
not V
|
|
BPVS
|
bvs{,a}
{,pt|,pn}
|
%icc or %xcc, label
|
Branch on overflow set
|
V
|
|
BRZ
|
brz{,a}
{,pt|,pn}
|
regrs1, label
|
Branch on register zero
|
Z
|
|
BRLEZ
|
brlez{,a}
{,pt|,pn}
|
regrs1, label
|
Branch on register less than or equal to zero
|
N or Z
|
|
BRLZ
|
brlz{,a}
{,pt|,pn}
|
regrs1, label
|
Branch on register less than zero
|
N
|
|
BRNZ
|
brnz{,a}
{,pt|,pn}
|
regrs1, label
|
Branch on register not zero
|
not Z
|
|
BRGZ
|
brgz{,a}
{,pt|,pn}
|
regrs1, label
|
Branch on register greater than zero
|
not (N or Z)
|
|
BRGEZ
|
brgez{,a}
{,pt|,pn}
|
regrs1, label
|
Branch on register greater than or equal to zero
|
not N
|
|
CASA
|
casa
casa
|
[regrs1]imm_asi,regrs2,regrd
[regrs1]%asi,regrs2,regrd
|
Compare and swap word from alternate space
|
|
|
CASXA
|
casxa
casxa
|
[regrs1]imm_asi,regrs2,regrd
[regrs1]%asi,regrs2,regrd
|
Compare
and swap extended from alternate space
|
|
|
FBPA
|
fba{,a}
{,pt|,pn}
|
%fccn, label
|
(Branch on cc with prediction)
Branch always
|
1
|
|
FBPN
|
fbn{,a}
{,pt|,pn}
|
%fccn, label
|
Branch never
|
0
|
|
FBPU
|
fbu{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on unordered
|
U
|
|
FBPG
|
fbg{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on greater
|
G
|
|
FBPUG
|
fbug{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on unordered or greater
|
G or U
|
|
FBPL
|
fbl{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on less
|
L
|
|
FBPUL
|
fbul{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on unordered or less
|
L or U
|
|
FBPLG
|
fblg{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on less or greater
|
L or G
|
|
FBPNE
|
fbne{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on not equal
|
L or G or U
|
|
FBPE
|
fbe{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on equal
|
E
|
|
FBPUE
|
fbue{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on unordered or equal
|
E or U
|
|
FBPGE
|
fbge{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on greater or equal
|
E or G
|
|
FBPUGE
|
fbuge{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on unordered or greater
or equal
|
E or G or U
|
|
FBPLE
|
fble{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on less or equal
|
E or L
|
|
FBPULE
|
fbule{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on unordered or less
or equal
|
E or L or U
|
|
FBPO
|
fbo{,a}
{,pt|,pn}
|
%fccn, label
|
Branch on ordered
|
E or L or G
|
|
FLUSHW
|
flushw
|
|
Flush register windows
|
|
|
FMOVA
|
fmov
{s,d,q}a
|
%icc or %xcc, fregrs2, fregrd
|
(Move on integer cc)
Move always
|
1
|
|
FMOVN
|
fmov
{s,d,q}n
|
%icc or %xcc, fregrs2, fregrd
|
Move never
|
0
|
|
FMOVNE
|
fmov
{s,d,q}ne
|
%icc or %xcc, fregrs2, fregrd
|
Move if not equal
|
not Z
|
|
FMOVE
|
fmov
{s,d,q}e
|
%icc or %xcc, fregrs2, fregrd
|
Move if equal
|
Z
|
|
FMOVG
|
fmov
{s,d,q}g
|
%icc or %xcc, fregrs2, fregrd
|
Move if greater
|
not (Z or (N xor V))
|
|
FMOVLE
|
fmov
{s,d,q}le
|
%icc or %xcc, fregrs2, fregrd
|
Move if less or equal
|
Z or (N xor V)
|
|
FMOVGE
|
fmov
{s,d,q}ge
|
%icc or %xcc, fregrs2, fregrd
|
Move if greater or equal
|
not (N xor V)
|
|
FMOVL
|
fmov
{s,d,q}l
|
%icc or %xcc, fregrs2, fregrd
|
Move if less
|
N xor V
|
|
FMOVGU
|
fmov
{s,d,q}gu
|
%icc or %xcc, fregrs2, fregrd
|
Move if greater unsigned
|
not (C or Z)
|
|
FMOVLEU
|
fmov
{s,d,q}leu
|
%icc or %xcc, fregrs2, fregrd
|
Move if less or equal unsigned
|
C or Z
|
|
FMOVCC
|
fmov
{s,d,q}cc
|
%icc or %xcc, fregrs2, fregrd
|
Move if carry clear (greater or equal, unsigned)
|
not C
|
|
FMOVCS
|
fmov
{s,d,q}cs
|
%icc or %xcc, fregrs2, fregrd
|
Move if carry set (less than, unsigned)
|
C
|
|
FMOVPOS
|
fmov
{s,d,q}pos
|
%icc or %xcc, fregrs2, fregrd
|
Move if positive
|
not N
|
|
FMOVNEG
|
fmov
{s,d,q}neg
|
%icc or %xcc, fregrs2, fregrd
|
Move if negative
|
N
|
|
FMOVVC
|
fmov
{s,d,q}vc
|
%icc or %xcc, fregrs2, fregrd
|
Move if overflow clear
|
not V
|
|
FMOVVS
|
fmov
{s,d,q}vs
|
%icc or %xcc, fregrs2, fregrd
|
Move if overflow set
|
V
|
|
FMOVRZ
|
fmovr
{s,d,q}e
|
regrs1, fregrs2, fregrd
|
(Move f-p register on cc)
Move if register zero
|
|
|
FMOVRLEZ
|
fmovr
{s,d,q}lez
|
regrs1, fregrs2, fregrd
|
Move if register less than or equal zero
| |
|
FMOVRLZ
|
fmovr
{s,d,q}lz
|
regrs1, fregrs2, fregrd
|
Move if register less than zero
| |
|
FMOVRNZ
FMOVRGZ
FMOVRGEZ
|
fmovr
{s,d,q}ne
fmovr
{s,d,q}gz
fmovr
{s,d,q}gez
|
regrs1, fregrs2, fregrd
regrs1, fregrs2, fregrd
regrs1, fregrs2, fregrd
|
Move if register not zero
Move if register greater than zero
Move if register greater than or equal to zero
|
|
|
FMOVFA
FMOVFN
FMOVFU
FMOVFG
FMOVFUG
FMOVFL
FMOVFUL
FMOVFLG
FMOVFNE
FMOVFE
FMOVFUE
FMOVFGE
FMOVFUGE
FMOVFLE
FMOVFULE
FMOVFO
|
fmov{s,d,q}a
fmov{s,d,q}n
fmov{s,d,q}u
fmov{s,d,q}g
fmov{s,d,q}ug
fmov{s,d,q}l
fmov{s,d,q}ul
fmov{s,d,q}lg
fmov{s,d,q}ne
fmov{s,d,q}e
fmov{s,d,q}ue
fmov{s,d,q}ge
fmov{s,d,q}uge
fmov{s,d,q}le
fmov{s,d,q}ule
fmov{s,d,q}o
|
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
%fccn,fregrs2,fregrd
|
(Move on floating-point cc)
Move always
Move never
Move if unordered
Move if greater
Move if unordered or greater
Move if less
Move if unordered or less
Move if less or greater
Move if not equal
Move if equal
Move if unordered or equal
Move if greater or equal
Move if unordered or greater or equal
Move if less or equal
Move if unordered or less or equal
Move if ordered
|
1
0
U
G
G or U
L
L or U
L or G
L or G or U
E
E or U
E or G
E or G or U
E or L
E or L or U
E or L or G
|
|
LDSW
LDSWA
|
ldsw
ldswa
|
[address], regrd
[regaddr] imm_asi, regrd
|
Load a signed word
Load signed word from alternate space
|
|
|
LDX
LDXA
LDXFSR
|
ldx
ldxa
ldxa
ldx
|
[address], regrd
[regaddr] imm_asi, regrd
[reg_plus_imm] %asi, regrd
[address], %fsr
|
Load extended word
Load extended word from alternate space
Load floating-point state register
|
|
|
MEMBAR
|
membar
|
membar_mask
|
Memory barrier
|
|
|
MOVA
MOVN
MOVNE
MOVE
MOVG
MOVLE
MOVGE
MOVL
MOVGU
MOVLEU
MOVCC
MOVCS
MOVPOS
MOVNEG
MOVVC
MOVVS
|
mova
movn
movne
move
movg
movle
movge
movl
movgu
movleu
movcc
movcs
movpos
movneg
movvc
movvs
|
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
%icc or %xcc, reg_or_imm11, regrd
|
(Move integer register on cc)
Move always
Move never
Move if not equal
Move if equal
Move if greater
Move if less or equal
Move if greater or equal
Move if less
Move if greater unsigned
Move if less or equal unsigned
Move if carry clear (greater or equal, unsigned)
Move if carry set (less than, unsigned)
Move if positive
Move if negative
Move if overflow clear
Move if overflow set
|
1
0
not Z
Z
not (Z or (N xor V))
Z or (N xor V)
not (N xor V)
N xor V
not (C or Z)
C or Z
not C
C
not N
N
not V
V
|
|
MOVFA
MOVFN
MOVFU
MOVFG
MOVFUG
MOVFL
MOVFUL
MOVFLG
MOVFNE
MOVFE
MOVFUE
MOVFGE
MOVFUGE
MOVFLE
MOVFULE
MOVFO
|
mova
movn
movu
movg
movug
movl
movul
movlg
movne
move
movue
movge
movuge
movle
movule
movo
|
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
%fccn,reg_or_imm11,regrd
|
(Move on floating-point cc)
Move always
Move never
Move if unordered
Move if greater
Move if unordered or greater
Move if less
Move if unordered or less
Move if less or greater
Move if not equal
Move if equal
Move if unordered or equal
Move if greater or equal
Move if unordered or greater or equal
Move if less or equal
Move if unordered or less or equal
Move if ordered
|
1
0
U
G
G or U
L
L or U
L or G
L or G or U
E
E or U
E or G
E or G or U
E or L
E or L or U
E or L or G
|
|
MOVRZ
MOVRLEZ
MOVRLZ
MOVRNZ
MOVRGZ
MOVRGEZ
|
movre
movrlez
movrlz
movrnz
movrgz
movrgez
|
regrs1, reg_or_imm10,regrd
regrs1, reg_or_imm10,regrd
regrs1, reg_or_imm10,regrd
regrs1, reg_or_imm10,regrd
regrs1, reg_or_imm10,regrd
regrs1, reg_or_imm10,regrd
|
(Move register on register cc)
Move if register zero
Move if register less than or equal to zero
Move if register less than zero
Move if register not zero
Move if register greater than zero
Move if register greater than or equal to zero
|
Z
N or Z
N
not Z
not (N or Z)
not N
|
|
MULX
|
mulx
|
regrs1, reg_or_imm,regrd
|
(Generic 64-bit Multiply) Multiply
(signed or unsigned)
|
See SDIVX and UDIVX
|
|
POPC
|
popc
|
reg_or_imm, regrd
|
Population
count
|
|
|
PREFETCH
PREFETCHA
|
prefetch
prefetcha
prefetcha
|
[address], prefetch_fcn
[regaddr] imm_asi, prefetch_fcn
[reg_plus_imm] %asi, prefetch_fcn
|
Prefetch data
Prefetch data from alternate space
|
See The SPARC architecture
manual, version 9
|
|
SDIVX
|
sdivx
|
regrs1, reg_or_imm,regrd
|
(64-bit signed divide) Signed Divide
|
See MULX and UDIVX
|
|
STX
STXA
STXFSR
|
stx
stxa
stxa
stx
|
regrd, [address]
regrd, [regaddr] imm_asi
regrd, [reg_plus_imm] %asi
%fsr, [address]
|
Store extended word
Store extended word into alternate space
Store floating-point state register (all 64 bits)
|
|
|
UDIVX
|
udivx
|
regrs1, reg_or_imm, regrd
|
(64-bit unsigned divide) Unsigned
divide
|
See MULX and
SDIVX
|
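The condition expressions in the Comments column of Table E–10 can be evaluated mechanically from the N, Z, V, and C bits of the selected condition-code set (%icc or %xcc). The Python sketch below transcribes the integer conditions from the table; the dictionary and function names are hypothetical, introduced only for illustration.

```python
# Keys are the condition suffixes used by bcc, movcc, and fmovcc in Table E-10.
INTEGER_CONDITIONS = {
    "a":   lambda N, Z, V, C: True,
    "n":   lambda N, Z, V, C: False,
    "ne":  lambda N, Z, V, C: not Z,
    "e":   lambda N, Z, V, C: Z,
    "g":   lambda N, Z, V, C: not (Z or (N ^ V)),
    "le":  lambda N, Z, V, C: Z or (N ^ V),
    "ge":  lambda N, Z, V, C: not (N ^ V),
    "l":   lambda N, Z, V, C: N ^ V,
    "gu":  lambda N, Z, V, C: not (C or Z),
    "leu": lambda N, Z, V, C: C or Z,
    "cc":  lambda N, Z, V, C: not C,
    "cs":  lambda N, Z, V, C: C,
    "pos": lambda N, Z, V, C: not N,
    "neg": lambda N, Z, V, C: N,
    "vc":  lambda N, Z, V, C: not V,
    "vs":  lambda N, Z, V, C: V,
}


def condition_holds(cond: str, N: bool, Z: bool, V: bool, C: bool) -> bool:
    """Return True if the named integer condition is satisfied."""
    return bool(INTEGER_CONDITIONS[cond](N, Z, V, C))


if __name__ == "__main__":
    # Flags as they would be after comparing 3 with 5 (3 - 5): negative, borrow.
    flags = dict(N=True, Z=False, V=False, C=True)
    for cond in ("l", "ge", "cs", "gu"):
        print(cond, condition_holds(cond, **flags))
```

The same table serves the conditional branches, MOVcc, and FMOVcc, since all three test the same condition expressions.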
E.4 SPARC-V9 Floating-Point Instruction Set Mapping
SPARC-V9 floating-point instructions
are shown in the following table.
Table E–11
|
SPARC
|
Mnemonic [Operand types are denoted by the following lowercase letters: i = 32-bit integer, x = 64-bit integer, s = single, d = double, q = quad]
|
Argument
List
|
Description
|
|
F[sdq]TOx
|
fstox
fdtox
fqtox
|
fregrs2, fregrd
fregrs2, fregrd
fregrs2, fregrd
|
Convert floating point to 64-bit integer
|
|
|
fstoi
fdtoi
fqtoi
|
fregrs2, fregrd
fregrs2, fregrd
fregrs2, fregrd
|
Convert floating-point to 32-bit integer
|
|
FxTO[sdq]
|
fxtos
fxtod
fxtoq
|
fregrs2, fregrd
fregrs2, fregrd
fregrs2, fregrd
|
Convert 64-bit integer to floating
point
|
|
|
fitos
fitod
fitoq
|
fregrs2, fregrd
fregrs2, fregrd
fregrs2, fregrd
|
Convert
32-bit integer to floating point
|
|
FMOV[dq]
|
fmovd
fmovq
|
fregrs2, fregrd
fregrs2, fregrd
|
Move double
Move quad
|
|
FNEG[dq]
|
fnegd
fnegq
|
fregrs2, fregrd
fregrs2, fregrd
|
Negate double
Negate quad
|
|
FABS[dq]
|
fabsd
fabsq
|
fregrs2, fregrd
fregrs2, fregrd
|
Absolute value double
Absolute value quad
|
|
LDFA
LDDFA
LDQFA
|
lda
lda
ldda
ldda
ldqa
ldqa
|
[regaddr] imm_asi, fregrd
[reg_plus_imm] %asi, fregrd
[regaddr] imm_asi, fregrd
[reg_plus_imm] %asi, fregrd
[regaddr] imm_asi, fregrd
[reg_plus_imm] %asi, fregrd
|
Load floating-point
register from alternate space
Load double floating-point register
from alternate space.
Load quad floating-point register from
alternate space
|
|
STFA
STDFA
STQFA
|
sta
sta
stda
stda
stqa
stqa
|
fregrd, [regaddr] imm_asi
fregrd, [reg_plus_imm] %asi
fregrd, [regaddr] imm_asi
fregrd, [reg_plus_imm] %asi
fregrd, [regaddr] imm_asi
fregrd, [reg_plus_imm] %asi
|
Store floating-point
register to alternate space
Store double floating-point register
to alternate space
Store quad floating-point register to alternate
space
|
E.5 SPARC-V9 Synthetic Instruction-Set Mapping
Here is a mapping of synthetic
instructions to hardware equivalent instructions.
Table E–12
|
Synthetic Instruction
|
Hardware Equivalent(s)
|
Comment
|
|
cas
casl
casx
casxl
|
[regrs1], regrs2, regrd
[regrs1], regrs2, regrd
[regrs1], regrs2, regrd
[regrs1], regrs2, regrd
|
casa
casa
casxa
casxa
|
[regrs1]ASI_P, regrs2, regrd
[regrs1]ASI_P_L, regrs2, regrd
[regrs1]ASI_P, regrs2, regrd
[regrs1]ASI_P_L, regrs2, regrd
|
Compare and swap (cas)
cas little-endian
cas extended
cas little-endian, extended
|
|
clrx
|
[address]
|
stx
|
%g0, [address]
|
Clear extended word
|
|
clruw
clruw
|
regrs1, regrd
regrd
|
srl
srl
|
regrs1, %g0, regrd
regrd, %g0, regrd
|
Copy and clear upper word
Clear upper word
|
|
iprefetch
|
label
|
bn,pt
|
%xcc, label
|
Instruction prefetch
|
|
mov
mov
mov
|
%y, regrd
%asrn, regrd
reg_or_imm, %asrn
|
rd
rd
wr
|
%y, regrd
%asrn, regrd
%g0, reg_or_imm, %asrn
|
|
|
ret
retl
|
|
jmpl
jmpl
|
%i7+8, %g0
%o7+8, %g0
|
Return from subroutine
Return from leaf subroutine
|
|
setn
|
value, r1, r2
|
for -xarch=v9 same as setx value, r1, r2
for -xarch=v8 same as set value, r2
|
|
|
setnhi
|
value, r1, r2
|
for -xarch=v9 same as setxhi value, r1, r2
for -xarch=v8 same as sethi value, r2
|
|
|
setuw
|
value,regrd
|
sethi
or
sethi
or
|
%hi(value), regrd
%g0, value, regrd
%hi(value), regrd;
regrd, %lo(value), regrd
|
(value & 0x3FF)==0
when 0 ≤ value ≤ 4095
(otherwise)
Do not use setuw in a DCTI delay slot.
|
|
setsw
|
value,regrd
|
sethi
or
sethi
sra
sethi
or
sethi
or
sra
|
%hi(value), regrd
%g0, value, regrd
%hi(value), regrd
regrd, %g0, regrd
%hi(value), regrd;
regrd, %lo(value), regrd
%hi(value), regrd;
regrd, %lo(value), regrd
regrd, %g0, regrd
|
value>=0 and (value & 0x3FF)==0
-4096 ≤ value ≤ 4095
if (value<0) and ((value & 0x3FF)==0)
(otherwise, if value>=0)
(otherwise, if value<0)
Do not use setsw in a CTI delay slot.
|
|
setx
|
value, r1, r2
|
sethi
or
sethi
or
sllx
or
|
%hh(value), r1
r1, %hm(value), r1
%lm(value), r2
r2, %lo(value), r2
r1, 32, r1
r1, r2, r2
|
|
|
setxhi
|
value, r1, r2
|
sethi
or
sethi
sllx
or
|
%hh(value), r1
r1, %hm(value), r1
%lm(value), r2
r1, 32, r1
r1, r2, r2
|
|
|
signx
signx
|
regrs1, regrd
regrd
|
sra
sra
|
regrs1, %g0, regrd
regrd, %g0, regrd
|
Sign-extend 32-bit value to 64 bits
|
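The setx expansion in Table E–12 builds a 64-bit constant from four pieces. The Python sketch below follows that sequence step by step, and also models signx and clruw. It assumes the usual meaning of the assembler operators (%hh = bits 63..42, %hm = bits 41..32, %lm = bits 31..10, %lo = bits 9..0), and that a 32-bit sra or srl with a zero shift count sign-extends or zero-extends the low word; the Python function names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def setx(value: int) -> int:
    """Follow the setx expansion and return the final contents of r2."""
    value &= MASK64
    hh = (value >> 42) & 0x3FFFFF     # %hh(value): bits 63..42
    hm = (value >> 32) & 0x3FF        # %hm(value): bits 41..32
    lm = (value >> 10) & 0x3FFFFF     # %lm(value): bits 31..10
    lo = value & 0x3FF                # %lo(value): bits 9..0

    r1 = hh << 10                     # sethi %hh(value), r1
    r1 |= hm                          # or    r1, %hm(value), r1
    r2 = lm << 10                     # sethi %lm(value), r2
    r2 |= lo                          # or    r2, %lo(value), r2
    r1 = (r1 << 32) & MASK64          # sllx  r1, 32, r1
    r2 |= r1                          # or    r1, r2, r2
    return r2


def signx(value: int) -> int:
    """sra reg, %g0, reg: sign-extend the low 32 bits to 64 bits."""
    low = value & 0xFFFFFFFF
    return (low | 0xFFFFFFFF00000000) if low & 0x80000000 else low


def clruw(value: int) -> int:
    """srl reg, %g0, reg: clear the upper 32 bits."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(setx(0x123456789ABCDEF0)))
    print(hex(signx(0x80000000)))
    print(hex(clruw(0xFFFFFFFF00000001)))
```

The scratch register r1 holds the upper 32 bits until the final or, which is why setx needs two registers.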
E.6 UltraSPARC and VIS Instruction Set Extensions
This section describes
extensions that require SPARC-V9. The extensions support enhanced graphics
functionality and improved memory access efficiency.
Note –
SPARC-V9 instruction set extensions used in executables may not
be portable to other SPARC-V9 systems.
E.6.1 Graphics Data Formats
The overhead of converting to and
from floating-point arithmetic is high, so the graphics instructions are optimized
for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate
results are 16 or 32 bits.
E.6.2 Eight-bit Format
A 32-bit word contains a pixel of four
unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band-interleaved images (which store the color components of a point together) and band-sequential images (which store all values of one color component together).
E.6.3 Fixed Data Formats
A 64-bit word contains four 16-bit
signed fixed-point values. This is the fixed 16-bit data format.
A 64-bit word contains two 32-bit signed fixed-point values. This is
the fixed 32-bit data format.
An intermediate format of fixed data values provides enough precision and
dynamic range for filtering and simple image computations on pixel values.
Pixel multiplication converts pixel data to fixed data. Pack
instructions convert fixed data to pixel data (clipping and truncating
to an 8-bit unsigned value). The FPACKFIX instruction supports conversion
from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding
bit position. Use floating-point data for complex calculations
that need more precision or dynamic range.
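The fixed data formats and the clip-and-round steps described above can be sketched in Python as follows. This is a simplified model under stated assumptions: values are listed most significant first, and the scaling that the real pack instructions apply is omitted. The function names are hypothetical.

```python
def unpack_fixed16(word: int) -> list:
    """Split a 64-bit word into four signed 16-bit values (fixed 16-bit format)."""
    values = []
    for shift in (48, 32, 16, 0):
        v = (word >> shift) & 0xFFFF
        values.append(v - 0x10000 if v & 0x8000 else v)
    return values


def unpack_fixed32(word: int) -> list:
    """Split a 64-bit word into two signed 32-bit values (fixed 32-bit format)."""
    values = []
    for shift in (32, 0):
        v = (word >> shift) & 0xFFFFFFFF
        values.append(v - 0x100000000 if v & 0x80000000 else v)
    return values


def round_fixed(value: int, fraction_bits: int) -> int:
    """Drop fraction_bits low bits, rounding by adding one at the rounding bit."""
    if fraction_bits < 1:
        raise ValueError("fraction_bits must be at least 1")
    return (value + (1 << (fraction_bits - 1))) >> fraction_bits


def clip_to_pixel(value: int) -> int:
    """Clip a fixed value to the 8-bit unsigned pixel range."""
    return max(0, min(255, value))


if __name__ == "__main__":
    print(unpack_fixed16(0x7FFF_8000_0001_FFFF))
    print([clip_to_pixel(v) for v in (-5, 128, 300)])
```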
E.6.4 SHUTDOWN Instruction
All outstanding transactions are completed before the SHUTDOWN instruction
completes.
Table E–13
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
SHUTDOWN
|
shutdown
|
|
Shut down to enter power-down mode
|
E.6.5 Graphics Status Register (GSR)
Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics
Status Register.
Table E–14
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
RDASR
WRASR
|
rdasr
wrasr
|
%gsr, regrd
regrs1, reg_or_imm, %gsr
|
read GSR
write GSR
|
E.6.6 Graphics Instructions
Unless otherwise specified, floating-point registers contain all instruction
operands. There are 32 double-precision registers. Single-precision floating-point
registers contain the pixel values, and double-precision floating-point registers
contain the fixed values.
The graphics instruction set is mapped into the opcode space reserved for
the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.
Partitioned add/subtract instructions perform two 32-bit or four 16-bit
partitioned adds or subtracts between the corresponding fixed-point
values of the source operands.
Table E–15
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
FPADD16
FPADD16S
FPADD32
FPADD32S
FPSUB16
FPSUB16S
FPSUB32
FPSUB32S
|
fpadd16
fpadd16s
fpadd32
fpadd32s
fpsub16
fpsub16s
fpsub32
fpsub32s
|
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
four 16-bit add
two 16-bit add
two 32-bit add
one 32-bit add
four 16-bit subtract
two 16-bit subtract
two 32-bit subtract
one 32-bit subtract
|
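A partitioned add or subtract treats each lane of the 64-bit operand independently, so no carry or borrow crosses a lane boundary. The Python sketch below models the 64-bit forms from Table E–15 on plain integers; it assumes each lane wraps around on overflow and does not model the single-precision "S" variants. The helper `partitioned` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1


def partitioned(a: int, b: int, lane_bits: int, subtract: bool = False) -> int:
    """Add or subtract lane by lane, discarding any carry out of each lane."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        lane = (x - y if subtract else x + y) & mask   # wrap within the lane
        result |= lane << shift
    return result


def fpadd16(a: int, b: int) -> int:
    return partitioned(a, b, 16)


def fpsub16(a: int, b: int) -> int:
    return partitioned(a, b, 16, subtract=True)


def fpadd32(a: int, b: int) -> int:
    return partitioned(a, b, 32)


def fpsub32(a: int, b: int) -> int:
    return partitioned(a, b, 32, subtract=True)


if __name__ == "__main__":
    a, b = 0x0001_FFFF_0002_8000, 0x0001_0001_0003_8000
    print(f"fpadd16:    {fpadd16(a, b):016x}")
    print(f"plain add:  {(a + b) & MASK64:016x}")
```

Comparing the two printed values shows where an ordinary 64-bit add would let carries leak into neighboring lanes.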
Pack instructions convert to a lower-precision fixed format or to a pixel format.
Table E–16
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
FPACK16
FPACK32
FPACKFIX
FEXPAND
FPMERGE
|
fpack16
fpack32
fpackfix
fexpand
fpmerge
|
fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs2, fregrd
fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
four 16-bit packs
two 32-bit packs
four 16-bit packs
four 16-bit expands
two 32-bit merges
|
Partitioned multiply instructions have the following variations.
Table E–17
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
FMUL8x16
FMUL8x16AU
FMUL8x16AL
FMUL8SUx16
FMUL8ULx16
FMULD8SUx16
FMULD8ULx16
|
fmul8x16
fmul8x16au
fmul8x16al
fmul8sux16
fmul8ulx16
fmuld8sux16
fmuld8ulx16
|
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
8x16-bit partition
8x16-bit upper partition
8x16-bit lower partition
upper 8x16-bit partition
lower unsigned 8x16-bit partition
upper 8x16-bit partition
lower unsigned 8x16-bit partition
|
Alignment instructions have the following variations.
Table E–18
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
ALIGNADDRESS
ALIGNADDRESS_LITTLE
FALIGNDATA
|
alignaddr
alignaddrl
faligndata
|
regrs1, regrs2, regrd
regrs1, regrs2, regrd
fregrs1, fregrs2, fregrd
|
find misaligned data access address
same as above, but little-endian
do misaligned data, data alignment
|
Logical operate instructions perform one of sixteen 64-bit logical operations
between rs1 and rs2 (in the standard
64-bit version).
Table E–19
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
FZERO
FZEROS
FONE
FONES
FSRC1
|
fzero
fzeros
fone
fones
fsrc1
|
fregrd
fregrd
fregrd
fregrd
fregrs1, fregrd
|
zero fill
zero fill, single precision
one fill
one fill, single precision
copy src1
|
|
FSRC1S
FSRC2
FSRC2S
FNOT1
FNOT1S
|
fsrc1s
fsrc2
fsrc2s
fnot1
fnot1s
|
fregrs1, fregrd
fregrs2, fregrd
fregrs2, fregrd
fregrs1, fregrd
fregrs1, fregrd
|
copy src1, single precision
copy src2
copy src2, single precision
negate src1, 1's complement
same as above, single precision
|
|
FNOT2
FNOT2S
FOR
FORS
FNOR
|
fnot2
fnot2s
for
fors
fnor
|
fregrs2, fregrd
fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
negate src2, 1's complement
same as above, single precision
logical OR
logical OR, single precision
logical NOR
|
|
FNORS
FAND
FANDS
FNAND
FNANDS
|
fnors
fand
fands
fnand
fnands
|
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
logical NOR, single precision
logical AND
logical AND, single precision
logical NAND
logical NAND, single precision
|
|
FXOR
FXORS
FXNOR
FXNORS
FORNOT1
|
fxor
fxors
fxnor
fxnors
fornot1
|
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
logical XOR
logical XOR, single precision
logical XNOR
logical XNOR, single precision
negated src1 OR src2
|
|
FORNOT1S
FORNOT2
FORNOT2S
FANDNOT1
|
fornot1s
fornot2
fornot2s
fandnot1
|
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
same as above, single precision
src1 OR negated src2
same as above, single precision
negated src1 AND src2
|
|
FANDNOT1S
FANDNOT2
FANDNOT2S
|
fandnot1s
fandnot2
fandnot2s
|
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
fregrs1, fregrs2, fregrd
|
same as above, single precision
src1 AND negated src2
same as above, single precision
|
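The sixteen logical operations in Table E–19 correspond to the sixteen possible Boolean functions of two inputs. The Python sketch below transcribes them from the table descriptions. The dictionary and the `logical` helper are hypothetical, and the single-precision "S" variants are modeled simply by masking to 32 bits.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

# Each entry follows the description column of Table E-19.
LOGICAL_OPS = {
    "fzero":    lambda a, b: 0,
    "fone":     lambda a, b: MASK64,
    "fsrc1":    lambda a, b: a,
    "fsrc2":    lambda a, b: b,
    "fnot1":    lambda a, b: ~a,
    "fnot2":    lambda a, b: ~b,
    "for":      lambda a, b: a | b,
    "fnor":     lambda a, b: ~(a | b),
    "fand":     lambda a, b: a & b,
    "fnand":    lambda a, b: ~(a & b),
    "fxor":     lambda a, b: a ^ b,
    "fxnor":    lambda a, b: ~(a ^ b),
    "fornot1":  lambda a, b: ~a | b,     # negated src1 OR src2
    "fornot2":  lambda a, b: a | ~b,     # src1 OR negated src2
    "fandnot1": lambda a, b: ~a & b,     # negated src1 AND src2
    "fandnot2": lambda a, b: a & ~b,     # src1 AND negated src2
}


def logical(op: str, src1: int, src2: int = 0, single: bool = False) -> int:
    """Apply one logical operation; single=True models the 32-bit variant."""
    mask = MASK32 if single else MASK64
    return LOGICAL_OPS[op](src1 & mask, src2 & mask) & mask


if __name__ == "__main__":
    for name in LOGICAL_OPS:
        print(f"{name:9s} {logical(name, 0b1100, 0b1010) & 0xF:04b}")
```

With src1 = 1100 and src2 = 1010 in the low four bits, each operation prints its own truth table, and all sixteen are distinct.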
Pixel compare instructions compare fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit values).
Table E–20
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
FCMPGT16
FCMPGT32
FCMPLE16
FCMPLE32
|
fcmpgt16
fcmpgt32
fcmple16
fcmple32
|
fregrs1, fregrs2, regrd
fregrs1, fregrs2, regrd
fregrs1, fregrs2, regrd
fregrs1, fregrs2, regrd
|
4 16-bit compare, set rd if src1>src2
2 32-bit compare, set rd if src1>src2
4 16-bit compare, set rd if src1≤src2
2 32-bit compare, set rd if src1≤src2
|
|
FCMPNE16
FCMPNE32
FCMPEQ16
FCMPEQ32
|
fcmpne16
fcmpne32
fcmpeq16
fcmpeq32
|
fregrs1, fregrs2, regrd
fregrs1, fregrs2, regrd
fregrs1, fregrs2, regrd
fregrs1, fregrs2, regrd
|
4 16-bit compare, set rd if src1≠src2
2 32-bit compare, set rd if src1≠src2
4 16-bit compare, set rd if src1=src2
2 32-bit compare, set rd if src1=src2
|
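A pixel compare produces one result bit per lane in the integer destination register. The Python sketch below models this for signed fixed-point lanes. It assumes that bit i of the result corresponds to lane i counted from the least significant end of the operands; that bit ordering is an assumption of this sketch, and the helper names are hypothetical.

```python
def _signed_lanes(word: int, lane_bits: int) -> list:
    """Return the signed lanes of a 64-bit word, least significant lane first."""
    mask = (1 << lane_bits) - 1
    sign = 1 << (lane_bits - 1)
    lanes = []
    for i in range(64 // lane_bits):
        v = (word >> (i * lane_bits)) & mask
        lanes.append(v - (1 << lane_bits) if v & sign else v)
    return lanes


def pixel_compare(src1: int, src2: int, lane_bits: int, predicate) -> int:
    """Build the result mask: one bit per lane where predicate holds."""
    result = 0
    pairs = zip(_signed_lanes(src1, lane_bits), _signed_lanes(src2, lane_bits))
    for i, (x, y) in enumerate(pairs):
        if predicate(x, y):
            result |= 1 << i
    return result


def fcmpgt16(src1: int, src2: int) -> int:
    return pixel_compare(src1, src2, 16, lambda x, y: x > y)


def fcmple16(src1: int, src2: int) -> int:
    return pixel_compare(src1, src2, 16, lambda x, y: x <= y)


def fcmpeq32(src1: int, src2: int) -> int:
    return pixel_compare(src1, src2, 32, lambda x, y: x == y)


if __name__ == "__main__":
    a, b = 0x0005_FFFF_0000_0007, 0x0003_0000_0000_0008
    print(f"fcmpgt16: {fcmpgt16(a, b):04b}")
    print(f"fcmple16: {fcmple16(a, b):04b}")
```

A mask of this kind is the sort of value that a partial store takes as its mask operand.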
Edge handling instructions handle the boundary conditions for parallel
pixel scan line loops.
Table E–21
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
EDGE8
EDGE8L
EDGE16
|
edge8
edge8l
edge16
|
regrs1, regrs2, regrd
regrs1, regrs2, regrd
regrs1, regrs2, regrd
|
8 8-bit edge boundary processing
same as above, little-endian
4 16-bit edge boundary processing
|
|
EDGE16L
EDGE32
EDGE32L
|
edge16l
edge32
edge32l
|
regrs1, regrs2, regrd
regrs1, regrs2, regrd
regrs1, regrs2, regrd
|
same as above, little-endian
2 32-bit edge boundary processing
same as above, little-endian
|
Pixel component distance instructions are used for motion estimation
in video compression algorithms.
Table E–22
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
PDIST
|
pdist
|
fregrs1, fregrs2, fregrd
|
distance between eight 8-bit components
|
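The pixel distance computation can be sketched as a sum of absolute differences over the eight unsigned 8-bit components of the two sources. In this Python model the sum is added to a running total, which reflects my understanding that the instruction accumulates into its destination; treat that, and the function signature, as assumptions of the sketch.

```python
MASK64 = (1 << 64) - 1


def pdist(src1: int, src2: int, accumulator: int = 0) -> int:
    """Add the sum of |a_i - b_i| over eight unsigned bytes to accumulator."""
    total = accumulator
    for i in range(8):
        x = (src1 >> (8 * i)) & 0xFF
        y = (src2 >> (8 * i)) & 0xFF
        total += abs(x - y)
    return total & MASK64


if __name__ == "__main__":
    block = 0x0102030405060708
    candidate = 0x0807060504030201
    print(pdist(block, candidate))
```

In motion estimation, a loop would call this once per 8-pixel row and keep the candidate block with the smallest accumulated distance.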
The three-dimensional array addressing instructions convert three-dimensional
fixed-point addresses (in rs1) to a blocked-byte address. The result is stored
in rd.
Table E–23
|
SPARC
|
Mnemonic
|
Argument List
|
Description
|
|
ARRAY8
ARRAY16
ARRAY32
|
array8
array16
array32
|
regrs1, regrs2, regrd
regrs1, regrs2, regrd
regrs1, regrs2, regrd
|
convert 8-bit 3-D address to blocked byte address
same as above, but 16-bit
same as above, but 32-bit
|
E.6.7 Memory Access Instructions
These memory access instructions are part of the SPARC-V9 instruction
set extensions.
Table E–24
|
SPARC
|
imm_asi
|
Argument List
|
Description
|
|
STDFA
STDFA
STDFA
STDFA
|
ASI_PST8_P
ASI_PST8_S
ASI_PST8_PL
ASI_PST8_SL
|
stda fregrd, [regrs1] regmask, imm_asi
|
eight 8-bit conditional stores to:
primary address space
secondary address space
primary address space, little endian
secondary address space, little endian
|
|
STDFA
STDFA
STDFA
STDFA
|
ASI_PST16_P
ASI_PST16_S
ASI_PST16_PL
ASI_PST16_SL
| |
four 16-bit conditional stores to:
primary address space
secondary address space
primary address space, little endian
secondary address space, little endian
|
|
STDFA
STDFA
STDFA
STDFA
|
ASI_PST32_P
ASI_PST32_S
ASI_PST32_PL
ASI_PST32_SL
|
|
two 32-bit conditional stores to:
primary address space
secondary address space
primary address space, little endian
secondary address space,
little endian
|
Note –
To select a partial store instruction, use one of the partial
store ASIs with the STDA instruction.
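A partial store writes only the lanes whose mask bit is set and leaves the other bytes in memory untouched. The Python sketch below models the eight 8-bit conditional stores for a big-endian ASI. It assumes that the most significant mask bit controls the most significant data byte (the lowest address); that ordering is an assumption of this sketch, and `partial_store8` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1


def partial_store8(memory: bytearray, address: int, data: int, mask: int) -> None:
    """Store each of the 8 data bytes only if its mask bit is set."""
    raw = (data & MASK64).to_bytes(8, "big")   # byte 0 is the most significant
    for i in range(8):
        if mask & (0x80 >> i):                 # mask bit 7 pairs with byte 0
            memory[address + i] = raw[i]


if __name__ == "__main__":
    memory = bytearray(8)
    partial_store8(memory, 0, 0x1122334455667788, 0b10000001)
    print(memory.hex())
```

The 16-bit and 32-bit partial stores follow the same pattern with four or two mask bits, each covering a wider lane.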
Table E–25
|
SPARC
|
imm_asi
|
Argument List
|
Description
|
|
LDDFA
STDFA
|
ASI_FL8_P
|
ldda [reg_addr] imm_asi, fregrd
stda fregrd, [reg_addr] imm_asi
|
8-bit load/store from/to:
primary address space
|
|
LDDFA
STDFA
|
ASI_FL8_S
|
ldda [reg_plus_imm] %asi, fregrd
stda fregrd, [reg_plus_imm] %asi
|
secondary address space
|
|
LDDFA
STDFA
|
ASI_FL8_PL
|
|
primary address space, little endian
|
|
LDDFA
STDFA
|
ASI_FL8_SL
|
|
secondary address space,
little endian
|
|
LDDFA
STDFA
|
ASI_FL16_P
|
|
16-bit load/store from/to:
primary address space
|
|
LDDFA
STDFA
|
ASI_FL16_S
|
|
secondary address space
|
|
LDDFA
STDFA
|
ASI_FL16_PL
| |
primary address space,
little endian
|
|
LDDFA
STDFA
|
ASI_FL16_SL
|
|
secondary address space, little
endian
|
Note –
To select a short floating-point load and store instruction, use
one of the short ASIs with the LDDA and STDA instructions.
Table E–26
|
SPARC
|
imm_asi
|
Argument List
|
Description
|
|
LDDA
LDDA
|
ASI_NUCLEUS_QUAD_LDD
ASI_NUCLEUS_QUAD_LDD_L
|
[reg_addr] imm_asi, regrd
[reg_plus_imm] %asi, regrd
|
128-bit atomic load
128-bit atomic load, little
endian
|
|
LDDFA
STDFA
|
ASI_BLK_AIUP
|
ldda [reg_addr] imm_asi, fregrd
stda fregrd, [reg_addr] imm_asi
|
64-byte block load/store
from/to:
primary address space, user privilege
|
|
LDDFA
STDFA
|
ASI_BLK_AIUS
|
ldda [reg_plus_imm] %asi, fregrd
stda fregrd, [reg_plus_imm] %asi
|
secondary address space, user privilege.
|
|
LDDFA
STDFA
|
ASI_BLK_AIUPL
|
|
primary address space,
user privilege, little endian
|
|
LDDFA
STDFA
|
ASI_BLK_AIUSL
|
|
secondary address space,
user privilege little endian
|
|
LDDFA
STDFA
|
ASI_BLK_P
|
|
primary address space
|
|
LDDFA
STDFA
|
ASI_BLK_S
|
|
secondary address space
|
|
LDDFA
STDFA
|
ASI_BLK_PL
|
|
primary address space,
little endian
|
|
LDDFA
STDFA
|
ASI_BLK_SL
|
|
secondary address space,
little endian
|
|
LDDFA
STDFA
|
ASI_BLK_COMMIT_P
| |
64-byte block commit store
to primary address space
|
|
LDDFA
STDFA
|
ASI_BLK_COMMIT_S
|
|
64-byte block commit store to secondary
address space
|
Note –
To select a block load and store instruction, use one of the block
transfer ASIs with the LDDA and STDA instructions.
|