SPARC Assembly Language Reference Manual

SPARC-V9 Instruction Set

E

This appendix describes changes made to the SPARC instruction set due to the SPARC-V9 architecture. Application software for the 32-bit SPARC-V8 (Version 8) architecture can execute, unchanged, on SPARC-V9 systems.
This appendix is organized into the following sections:
SPARC-V9 Changes
SPARC-V9 Instruction Set Changes
SPARC-V9 Instruction Set Mapping
SPARC-V9 Floating-Point Instruction Set Mapping
SPARC-V9 Synthetic Instruction-Set Mapping
SPARC-V9 Instruction Set Extensions

SPARC-V9 Changes

The SPARC-V9 architecture differs from SPARC-V8 architecture in the following areas, expanded below: registers, alternate space access, byte order, and instruction set.

Registers

These registers have been deleted:
Table E-1
PSR Processor State Register
TBR Trap Base Register
WIM Window Invalid Mask
These registers have been widened from 32 to 64 bits:
Table E-2
Integer registers
All state registers: FSR, PC, nPC, and Y

Note - FSR (Floating-Point State Register): the additional floating-point condition code fields fcc1, fcc2, and fcc3 are added, and the register is widened to 64 bits.

These SPARC-V9 registers were fields within a SPARC-V8 register:
Table E-3
CCR Condition Codes Register
CWP Current Window Pointer
PIL Processor Interrupt Level
TBA Trap Base Address
TT[MAXTL] Trap Type
VER Version
These registers have been added:
Table E-4
ASI Address Space Identifier
CANRESTORE Restorable Windows
CANSAVE Savable windows
CLEANWIN Clean Windows
FPRS Floating-point Register State
OTHERWIN Other Windows
PSTATE Processor State
TICK Hardware clock tick-counter
TL Trap Level
TNPC[MAXTL] Trap Next Program Counter
TPC[MAXTL] Trap Program Counter
TSTATE[MAXTL] Trap State
WSTATE Windows State
Also, there are sixteen additional double-precision floating-point registers, f[32] .. f[62]. These registers overlap (and are aliased with) eight additional quad-precision floating-point registers, f[32] .. f[60].
The SPARC-V9 CWP register is decremented during a RESTORE instruction and incremented during a SAVE instruction. This is the opposite of PSR.CWP's behavior in SPARC-V8. This change has no effect on nonprivileged instructions.

Alternate Space Access

Load- and store-alternate instructions to one-half of the alternate spaces can now be included in user code. In SPARC-V9, loads and stores to ASIs 0x00 .. 0x7F are privileged; those to ASIs 0x80 .. 0xFF are nonprivileged. In SPARC-V8, access to alternate address spaces is privileged.
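
The privilege split described above can be illustrated with a minimal Python sketch. The function name asi_is_privileged is hypothetical and exists only for illustration; it encodes nothing beyond the two ASI ranges stated in the text.

```python
def asi_is_privileged(asi: int) -> bool:
    """Return True if a SPARC-V9 load/store-alternate to this ASI is privileged.

    Hypothetical helper: it only models the range split stated in the text
    (0x00..0x7F privileged, 0x80..0xFF nonprivileged).
    """
    if not 0x00 <= asi <= 0xFF:
        raise ValueError("an ASI is an 8-bit value")
    return asi <= 0x7F


if __name__ == "__main__":
    for asi in (0x00, 0x7F, 0x80, 0xFF):
        kind = "privileged" if asi_is_privileged(asi) else "nonprivileged"
        print(f"ASI {asi:#04x}: {kind}")
```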

Byte Order

SPARC-V9 supports both little- and big-endian byte orders for data accesses only; instruction accesses are always performed using big-endian byte order. In SPARC-V8, all data and instruction accesses are performed in big-endian byte order.
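
As a rough illustration of the two data byte orders, the following Python sketch shows how the same 32-bit value is laid out in memory under each order. The helper names store_word and load_word are assumptions made for this example and are not part of any SPARC toolchain.

```python
def store_word(value: int, little_endian: bool = False) -> bytes:
    """Return the 4 bytes a 32-bit store would write, lowest address first."""
    return value.to_bytes(4, "little" if little_endian else "big")


def load_word(data: bytes, little_endian: bool = False) -> int:
    """Interpret 4 bytes (lowest address first) as a 32-bit unsigned word."""
    return int.from_bytes(data, "little" if little_endian else "big")


if __name__ == "__main__":
    value = 0x11223344
    print(store_word(value).hex())                      # big-endian layout
    print(store_word(value, little_endian=True).hex())  # little-endian layout
    # Reading big-endian data with a little-endian access swaps the bytes.
    print(hex(load_word(store_word(value), little_endian=True)))
```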

SPARC-V9 Instruction Set Changes

Application software written for the SPARC-V8 processor runs unchanged on a SPARC-V9 processor.

Extended Instruction Definitions to Support the 64-bit Model

Table E-5
FCMP, FCMPE      Floating-point compare; can set any of the four floating-point condition codes
LDFSR, STFSR     Load/store FSR; only affect the low-order 32 bits of FSR
LDUW, LDUWA      Same as LD, LDA in SPARC-V8
RDASR/WRASR      Read/write state registers; access additional registers
SAVE/RESTORE
SETHI
SRA, SRL, SLL    Shifts; split into 32-bit and 64-bit versions
Tcc (was Ticc)   Operates with either the 32-bit integer condition codes (icc) or the 64-bit integer condition codes (xcc)
All other arithmetic operations operate on 64-bit operands and produce 64-bit results.

Added Instructions to Support 64 bits

Table E-6
F[sdq]TOx Convert floating point to 64-bit word
FxTO[sdq] Convert 64-bit word to floating point
FMOV[dq] Floating-Point Move, double and quad
FNEG[dq] Floating-point Negate, double and quad
FABS[dq] Floating-point Absolute Value, double and quad
LDDFA, STDFA, LDFA, STFA Alternate address space forms of LDDF, STDF, LDF, and STF
LDSW Load a signed word
LDSWA Load a signed word from an alternate space
LDX Load an extended word
LDXA Load an extended word from an alternate space
LDXFSR Load all 64 bits of the FSR register
STX Store an extended word
STXA Store an extended word into an alternate space
STXFSR Store all 64 bits of the FSR register

Added Instructions to Support High-Performance System Implementation

Table E-7
BPcc Branch on integer condition code with prediction
BPr Branch on integer register contents with prediction
CASA, CASXA Compare and Swap from an alternate space
FBPfcc Branch on floating-point condition code with prediction
FLUSHW Flush windows
FMOVcc Move floating-point register if condition code is satisfied
FMOVr Move floating-point register if integer register satisfies condition
LDQF(A), STQF(A) Load/Store Quad Floating-point (in an alternate space)
MOVcc Move integer register if condition code is satisfied
MOVr Move integer register if register contents satisfy condition
MULX Generic 64-bit multiply
POPC Population count
PREFETCH, PREFETCHA Prefetch data
SDIVX, UDIVX Signed and Unsigned 64-bit divide

Deleted Instructions

Table E-8
Coprocessor loads and stores
RDTBR and WRTBR    TBR no longer exists. It is replaced by TBA, which can be read/written with RDPR/WRPR instructions
RDWIM and WRWIM    WIM no longer exists. WIM has been replaced by several register-window registers
RDPSR and WRPSR    PSR no longer exists. It has been replaced by several separate registers that are read/written with other instructions
RETT               Return from trap (replaced by DONE/RETRY)
STDFQ              Store Double from Floating-point Queue (replaced by the RDPR FQ instruction)

Miscellaneous Instruction Changes

Table E-9
IMPDEPn   (Changed) Implementation-dependent instructions (replace SPARC-V8 CPop instructions)
MEMBAR    (Added) Memory barrier (memory synchronization support)

SPARC-V9 Instruction Set Mapping

The following tables describe the SPARC-V9 instruction-set mapping.

Table E-10
Opcode  Mnemonic            Argument List        Operation                                                Comments
(Branch on cc with prediction)
BPA     ba{,a}{,pt|,pn}     %icc or %xcc, label  Branch always                                            1
BPN     bn{,a}{,pt|,pn}     %icc or %xcc, label  Branch never                                             0
BPNE    bne{,a}{,pt|,pn}    %icc or %xcc, label  Branch on not equal                                      not Z
BPE     be{,a}{,pt|,pn}     %icc or %xcc, label  Branch on equal                                          Z
BPG     bg{,a}{,pt|,pn}     %icc or %xcc, label  Branch on greater                                        not (Z or (N xor V))
BPLE    ble{,a}{,pt|,pn}    %icc or %xcc, label  Branch on less or equal                                  Z or (N xor V)
BPGE    bge{,a}{,pt|,pn}    %icc or %xcc, label  Branch on greater or equal                               not (N xor V)
BPL     bl{,a}{,pt|,pn}     %icc or %xcc, label  Branch on less                                           N xor V
BPGU    bgu{,a}{,pt|,pn}    %icc or %xcc, label  Branch on greater unsigned                               not (C or Z)
BPLEU   bleu{,a}{,pt|,pn}   %icc or %xcc, label  Branch on less or equal unsigned                         C or Z
BPCC    bcc{,a}{,pt|,pn}    %icc or %xcc, label  Branch on carry clear (greater than or equal, unsigned)  not C
BPCS    bcs{,a}{,pt|,pn}    %icc or %xcc, label  Branch on carry set (less than, unsigned)                C
BPPOS   bpos{,a}{,pt|,pn}   %icc or %xcc, label  Branch on positive                                       not N
BPNEG   bneg{,a}{,pt|,pn}   %icc or %xcc, label  Branch on negative                                       N
BPVC    bvc{,a}{,pt|,pn}    %icc or %xcc, label  Branch on overflow clear                                 not V
BPVS    bvs{,a}{,pt|,pn}    %icc or %xcc, label  Branch on overflow set                                   V

Opcode  Mnemonic             Argument List                       Operation                                            Comments
BRZ     brz{,a}{,pt|,pn}     reg_rs1, label                      Branch on register zero                              Z
BRLEZ   brlez{,a}{,pt|,pn}   reg_rs1, label                      Branch on register less than or equal to zero        N or Z
BRLZ    brlz{,a}{,pt|,pn}    reg_rs1, label                      Branch on register less than zero                    N
BRNZ    brnz{,a}{,pt|,pn}    reg_rs1, label                      Branch on register not zero                          not Z
BRGZ    brgz{,a}{,pt|,pn}    reg_rs1, label                      Branch on register greater than zero                 not (N or Z)
BRGEZ   brgez{,a}{,pt|,pn}   reg_rs1, label                      Branch on register greater than or equal to zero     not N
CASA    casa                 [reg_rs1] imm_asi, reg_rs2, reg_rd  Compare and swap word from alternate space
        casa                 [reg_rs1] %asi, reg_rs2, reg_rd
CASXA   casxa                [reg_rs1] imm_asi, reg_rs2, reg_rd  Compare and swap extended from alternate space
        casxa                [reg_rs1] %asi, reg_rs2, reg_rd

Opcode  Mnemonic             Argument List  Operation                                  Comments
(Branch on cc with prediction)
FBPA    fba{,a}{,pt|,pn}     %fccn, label   Branch always                              1
FBPN    fbn{,a}{,pt|,pn}     %fccn, label   Branch never                               0
FBPU    fbu{,a}{,pt|,pn}     %fccn, label   Branch on unordered                        U
FBPG    fbg{,a}{,pt|,pn}     %fccn, label   Branch on greater                          G
FBPUG   fbug{,a}{,pt|,pn}    %fccn, label   Branch on unordered or greater             G or U
FBPL    fbl{,a}{,pt|,pn}     %fccn, label   Branch on less                             L
FBPUL   fbul{,a}{,pt|,pn}    %fccn, label   Branch on unordered or less                L or U
FBPLG   fblg{,a}{,pt|,pn}    %fccn, label   Branch on less or greater                  L or G
FBPNE   fbne{,a}{,pt|,pn}    %fccn, label   Branch on not equal                        L or G or U
FBPE    fbe{,a}{,pt|,pn}     %fccn, label   Branch on equal                            E
FBPUE   fbue{,a}{,pt|,pn}    %fccn, label   Branch on unordered or equal               E or U
FBPGE   fbge{,a}{,pt|,pn}    %fccn, label   Branch on greater or equal                 E or G
FBPUGE  fbuge{,a}{,pt|,pn}   %fccn, label   Branch on unordered or greater or equal    E or G or U
FBPLE   fble{,a}{,pt|,pn}    %fccn, label   Branch on less or equal                    E or L
FBPULE  fbule{,a}{,pt|,pn}   %fccn, label   Branch on unordered or less or equal       E or L or U
FBPO    fbo{,a}{,pt|,pn}     %fccn, label   Branch on ordered                          E or L or G
FLUSHW  flushw                              Flush register windows

Opcode   Mnemonic         Argument List                     Operation                                              Comments
(Move on integer cc)
FMOVA    fmov{s,d,q}a     %icc or %xcc, freg_rs2, freg_rd   Move always                                            1
FMOVN    fmov{s,d,q}n     %icc or %xcc, freg_rs2, freg_rd   Move never                                             0
FMOVNE   fmov{s,d,q}ne    %icc or %xcc, freg_rs2, freg_rd   Move if not equal                                      not Z
FMOVE    fmov{s,d,q}e     %icc or %xcc, freg_rs2, freg_rd   Move if equal                                          Z
FMOVG    fmov{s,d,q}g     %icc or %xcc, freg_rs2, freg_rd   Move if greater                                        not (Z or (N xor V))
FMOVLE   fmov{s,d,q}le    %icc or %xcc, freg_rs2, freg_rd   Move if less or equal                                  Z or (N xor V)
FMOVGE   fmov{s,d,q}ge    %icc or %xcc, freg_rs2, freg_rd   Move if greater or equal                               not (N xor V)
FMOVL    fmov{s,d,q}l     %icc or %xcc, freg_rs2, freg_rd   Move if less                                           N xor V
FMOVGU   fmov{s,d,q}gu    %icc or %xcc, freg_rs2, freg_rd   Move if greater unsigned                               not (C or Z)
FMOVLEU  fmov{s,d,q}leu   %icc or %xcc, freg_rs2, freg_rd   Move if less or equal unsigned                         C or Z
FMOVCC   fmov{s,d,q}cc    %icc or %xcc, freg_rs2, freg_rd   Move if carry clear (greater or equal, unsigned)       not C
FMOVCS   fmov{s,d,q}cs    %icc or %xcc, freg_rs2, freg_rd   Move if carry set (less than, unsigned)                C
FMOVPOS  fmov{s,d,q}pos   %icc or %xcc, freg_rs2, freg_rd   Move if positive                                       not N
FMOVNEG  fmov{s,d,q}neg   %icc or %xcc, freg_rs2, freg_rd   Move if negative                                       N
FMOVVC   fmov{s,d,q}vc    %icc or %xcc, freg_rs2, freg_rd   Move if overflow clear                                 not V
FMOVVS   fmov{s,d,q}vs    %icc or %xcc, freg_rs2, freg_rd   Move if overflow set                                   V

Opcode    Mnemonic          Argument List                Operation
(Move f-p register on cc)
FMOVRZ    fmovr{s,d,q}e     reg_rs1, freg_rs2, freg_rd   Move if register zero
FMOVRLEZ  fmovr{s,d,q}lez   reg_rs1, freg_rs2, freg_rd   Move if register less than or equal to zero
FMOVRLZ   fmovr{s,d,q}lz    reg_rs1, freg_rs2, freg_rd   Move if register less than zero
FMOVRNZ   fmovr{s,d,q}ne    reg_rs1, freg_rs2, freg_rd   Move if register not zero
FMOVRGZ   fmovr{s,d,q}gz    reg_rs1, freg_rs2, freg_rd   Move if register greater than zero
FMOVRGEZ  fmovr{s,d,q}gez   reg_rs1, freg_rs2, freg_rd   Move if register greater than or equal to zero

Opcode    Mnemonic         Argument List                Operation                                Comments
(Move on floating-point cc)
FMOVFA    fmov{s,d,q}a     %fccn, freg_rs2, freg_rd     Move always                              1
FMOVFN    fmov{s,d,q}n     %fccn, freg_rs2, freg_rd     Move never                               0
FMOVFU    fmov{s,d,q}u     %fccn, freg_rs2, freg_rd     Move if unordered                        U
FMOVFG    fmov{s,d,q}g     %fccn, freg_rs2, freg_rd     Move if greater                          G
FMOVFUG   fmov{s,d,q}ug    %fccn, freg_rs2, freg_rd     Move if unordered or greater             G or U
FMOVFL    fmov{s,d,q}l     %fccn, freg_rs2, freg_rd     Move if less                             L
FMOVFUL   fmov{s,d,q}ul    %fccn, freg_rs2, freg_rd     Move if unordered or less                L or U
FMOVFLG   fmov{s,d,q}lg    %fccn, freg_rs2, freg_rd     Move if less or greater                  L or G
FMOVFNE   fmov{s,d,q}ne    %fccn, freg_rs2, freg_rd     Move if not equal                        L or G or U
FMOVFE    fmov{s,d,q}e     %fccn, freg_rs2, freg_rd     Move if equal                            E
FMOVFUE   fmov{s,d,q}ue    %fccn, freg_rs2, freg_rd     Move if unordered or equal               E or U
FMOVFGE   fmov{s,d,q}ge    %fccn, freg_rs2, freg_rd     Move if greater or equal                 E or G
FMOVFUGE  fmov{s,d,q}uge   %fccn, freg_rs2, freg_rd     Move if unordered or greater or equal    E or G or U
FMOVFLE   fmov{s,d,q}le    %fccn, freg_rs2, freg_rd     Move if less or equal                    E or L
FMOVFULE  fmov{s,d,q}ule   %fccn, freg_rs2, freg_rd     Move if unordered or less or equal       E or L or U
FMOVFO    fmov{s,d,q}o     %fccn, freg_rs2, freg_rd     Move if ordered                          E or L or G
LDSW      ldsw             [address], reg_rd            Load a signed word
LDSWA     ldswa            [regaddr] imm_asi, reg_rd    Load signed word from alternate space


Opcode   Mnemonic  Argument List                        Operation                                           Comments
LDX      ldx       [address], reg_rd                    Load extended word
LDXA     ldxa      [regaddr] imm_asi, reg_rd            Load extended word from alternate space
         ldxa      [reg_plus_imm] %asi, reg_rd
LDXFSR   ldx       [address], %fsr                      Load floating-point state register
MEMBAR   membar    membar_mask                          Memory barrier
(Move integer register on cc)
MOVA     mova      %icc or %xcc, reg_or_imm11, reg_rd   Move always                                         1
MOVN     movn      %icc or %xcc, reg_or_imm11, reg_rd   Move never                                          0
MOVNE    movne     %icc or %xcc, reg_or_imm11, reg_rd   Move if not equal                                   not Z
MOVE     move      %icc or %xcc, reg_or_imm11, reg_rd   Move if equal                                       Z
MOVG     movg      %icc or %xcc, reg_or_imm11, reg_rd   Move if greater                                     not (Z or (N xor V))
MOVLE    movle     %icc or %xcc, reg_or_imm11, reg_rd   Move if less or equal                               Z or (N xor V)
MOVGE    movge     %icc or %xcc, reg_or_imm11, reg_rd   Move if greater or equal                            not (N xor V)
MOVL     movl      %icc or %xcc, reg_or_imm11, reg_rd   Move if less                                        N xor V
MOVGU    movgu     %icc or %xcc, reg_or_imm11, reg_rd   Move if greater unsigned                            not (C or Z)
MOVLEU   movleu    %icc or %xcc, reg_or_imm11, reg_rd   Move if less or equal unsigned                      C or Z
MOVCC    movcc     %icc or %xcc, reg_or_imm11, reg_rd   Move if carry clear (greater or equal, unsigned)    not C
MOVCS    movcs     %icc or %xcc, reg_or_imm11, reg_rd   Move if carry set (less than, unsigned)             C
MOVPOS   movpos    %icc or %xcc, reg_or_imm11, reg_rd   Move if positive                                    not N
MOVNEG   movneg    %icc or %xcc, reg_or_imm11, reg_rd   Move if negative                                    N
MOVVC    movvc     %icc or %xcc, reg_or_imm11, reg_rd   Move if overflow clear                              not V
MOVVS    movvs     %icc or %xcc, reg_or_imm11, reg_rd   Move if overflow set                                V

Opcode    Mnemonic  Argument List                 Operation                                Comments
(Move on floating-point cc)
MOVFA     mova      %fccn, reg_or_imm11, reg_rd   Move always                              1
MOVFN     movn      %fccn, reg_or_imm11, reg_rd   Move never                               0
MOVFU     movu      %fccn, reg_or_imm11, reg_rd   Move if unordered                        U
MOVFG     movg      %fccn, reg_or_imm11, reg_rd   Move if greater                          G
MOVFUG    movug     %fccn, reg_or_imm11, reg_rd   Move if unordered or greater             G or U
MOVFL     movl      %fccn, reg_or_imm11, reg_rd   Move if less                             L
MOVFUL    movul     %fccn, reg_or_imm11, reg_rd   Move if unordered or less                L or U
MOVFLG    movlg     %fccn, reg_or_imm11, reg_rd   Move if less or greater                  L or G
MOVFNE    movne     %fccn, reg_or_imm11, reg_rd   Move if not equal                        L or G or U
MOVFE     move      %fccn, reg_or_imm11, reg_rd   Move if equal                            E
MOVFUE    movue     %fccn, reg_or_imm11, reg_rd   Move if unordered or equal               E or U
MOVFGE    movge     %fccn, reg_or_imm11, reg_rd   Move if greater or equal                 E or G
MOVFUGE   movuge    %fccn, reg_or_imm11, reg_rd   Move if unordered or greater or equal    E or G or U
MOVFLE    movle     %fccn, reg_or_imm11, reg_rd   Move if less or equal                    E or L
MOVFULE   movule    %fccn, reg_or_imm11, reg_rd   Move if unordered or less or equal       E or L or U
MOVFO     movo      %fccn, reg_or_imm11, reg_rd   Move if ordered                          E or L or G

Opcode     Mnemonic   Argument List                       Operation                                        Comments
(Move register on register cc)
MOVRZ      movre      reg_rs1, reg_or_imm10, reg_rd       Move if register zero                            Z
MOVRLEZ    movrlez    reg_rs1, reg_or_imm10, reg_rd       Move if register less than or equal to zero      N or Z
MOVRLZ     movrlz     reg_rs1, reg_or_imm10, reg_rd       Move if register less than zero                  N
MOVRNZ     movrnz     reg_rs1, reg_or_imm10, reg_rd       Move if register not zero                        not Z
MOVRGZ     movrgz     reg_rs1, reg_or_imm10, reg_rd       Move if register greater than zero               N nor Z
MOVRGEZ    movrgez    reg_rs1, reg_or_imm10, reg_rd       Move if register greater than or equal to zero   not N
(Generic 64-bit Multiply)
MULX       mulx       reg_rs1, reg_or_imm, reg_rd         Multiply (signed or unsigned)                    See SDIVX and UDIVX
POPC       popc       reg_or_imm, reg_rd                  Population count
PREFETCH   prefetch   [address], prefetch_fcn             Prefetch data                                    See The SPARC Architecture Manual, Version 9
PREFETCHA  prefetcha  [regaddr] imm_asi, prefetch_fcn     Prefetch data from alternate space
           prefetcha  [reg_plus_imm] %asi, prefetch_fcn

Opcode   Mnemonic  Argument List                  Operation                                                   Comments
(64-bit signed divide)
SDIVX    sdivx     reg_rs1, reg_or_imm, reg_rd    Signed divide                                               See MULX and UDIVX
STX      stx       reg_rd, [address]              Store extended word
STXA     stxa      reg_rd, [address] imm_asi      Store extended word into alternate space
         stxa      reg_rd, [reg_plus_imm] %asi
STXFSR   stx       %fsr, [address]                Store floating-point state register (all 64 bits)
(64-bit unsigned divide)
UDIVX    udivx     reg_rs1, reg_or_imm, reg_rd    Unsigned divide                                             See MULX and SDIVX
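
The integer condition tests listed in the Comments column of Table E-10 can be modeled with a short Python sketch. This is only an illustration under stated assumptions: the names ICC_TESTS, subcc_flags, and taken are hypothetical, and subcc_flags models the flags of a subtract-and-set-condition-codes operation for either %icc (32 bits) or %xcc (64 bits), with C treated as the borrow out of the subtraction.

```python
# Condition tests keyed by mnemonic suffix (ba -> "a", bne -> "ne", ...).
# n, z, v, c are the condition-code bits (0 or 1) of %icc or %xcc.
ICC_TESTS = {
    "a":   lambda n, z, v, c: True,
    "n":   lambda n, z, v, c: False,
    "ne":  lambda n, z, v, c: not z,
    "e":   lambda n, z, v, c: bool(z),
    "g":   lambda n, z, v, c: not (z or (n ^ v)),
    "le":  lambda n, z, v, c: bool(z or (n ^ v)),
    "ge":  lambda n, z, v, c: not (n ^ v),
    "l":   lambda n, z, v, c: bool(n ^ v),
    "gu":  lambda n, z, v, c: not (c or z),
    "leu": lambda n, z, v, c: bool(c or z),
    "cc":  lambda n, z, v, c: not c,
    "cs":  lambda n, z, v, c: bool(c),
    "pos": lambda n, z, v, c: not n,
    "neg": lambda n, z, v, c: bool(n),
    "vc":  lambda n, z, v, c: not v,
    "vs":  lambda n, z, v, c: bool(v),
}


def subcc_flags(a, b, bits=64):
    """Model the (N, Z, V, C) flags of a - b at the given width.

    bits=32 corresponds to %icc and bits=64 to %xcc in this sketch.
    """
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    r = (a - b) & mask
    n = int(bool(r & sign))
    z = int(r == 0)
    v = int(bool((a ^ b) & (a ^ r) & sign))  # signed overflow of a - b
    c = int(a < b)                           # unsigned borrow
    return n, z, v, c


def taken(cond, flags):
    """Return True if the branch or move with this condition would act."""
    return ICC_TESTS[cond](*flags)


if __name__ == "__main__":
    flags = subcc_flags(-1, 1)   # -1 is less than 1 signed, greater unsigned
    print(flags, taken("l", flags), taken("gu", flags))
```

Comparing -1 with 1 shows why the table has both signed and unsigned tests: the signed test "l" and the unsigned test "gu" can both hold for the same operands.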

SPARC-V9 Floating-Point Instruction Set Mapping

SPARC-V9 floating-point instructions are shown in Table E-11.
Table E-11 SPARC-V9 Floating-Point Instructions
SPARC       Mnemonic*  Argument List                   Description
F[sdq]TOx   fstox      freg_rs2, freg_rd               Convert floating point to 64-bit integer
            fdtox      freg_rs2, freg_rd
            fqtox      freg_rs2, freg_rd
            fstoi      freg_rs2, freg_rd               Convert floating point to 32-bit integer
            fdtoi      freg_rs2, freg_rd
            fqtoi      freg_rs2, freg_rd
FxTO[sdq]   fxtos      freg_rs2, freg_rd               Convert 64-bit integer to floating point
            fxtod      freg_rs2, freg_rd
            fxtoq      freg_rs2, freg_rd
            fitos      freg_rs2, freg_rd               Convert 32-bit integer to floating point
            fitod      freg_rs2, freg_rd
            fitoq      freg_rs2, freg_rd
FMOV[dq]    fmovd      freg_rs2, freg_rd               Move double
            fmovq      freg_rs2, freg_rd               Move quad
FNEG[dq]    fnegd      freg_rs2, freg_rd               Negate double
            fnegq      freg_rs2, freg_rd               Negate quad
FABS[dq]    fabsd      freg_rs2, freg_rd               Absolute value double
            fabsq      freg_rs2, freg_rd               Absolute value quad
LDFA        lda        [regaddr] imm_asi, freg_rd      Load floating-point register from alternate space
            lda        [reg_plus_imm] %asi, freg_rd
LDDFA       ldda       [regaddr] imm_asi, freg_rd      Load double floating-point register from alternate space
            ldda       [reg_plus_imm] %asi, freg_rd
LDQFA       ldqa       [regaddr] imm_asi, freg_rd      Load quad floating-point register from alternate space
            ldqa       [reg_plus_imm] %asi, freg_rd
STFA        sta        freg_rd, [regaddr] imm_asi      Store floating-point register to alternate space
            sta        freg_rd, [reg_plus_imm] %asi
STDFA       stda       freg_rd, [regaddr] imm_asi      Store double floating-point register to alternate space
            stda       freg_rd, [reg_plus_imm] %asi
STQFA       stqa       freg_rd, [regaddr] imm_asi      Store quad floating-point register to alternate space
            stqa       freg_rd, [reg_plus_imm] %asi
* Types of operands are denoted by the following lower-case letters: i 32-bit integer, x 64-bit integer, s single, d double, q quad.

SPARC-V9 Synthetic Instruction-Set Mapping

Here is a mapping of synthetic instructions to hardware equivalent instructions.

Table E-12
Synthetic Instruction              Hardware Equivalent(s)                          Comment
cas    [reg_rs1], reg_rs2, reg_rd  casa   [reg_rs1] ASI_P, reg_rs2, reg_rd         Compare & swap (cas)
casl   [reg_rs1], reg_rs2, reg_rd  casa   [reg_rs1] ASI_P_L, reg_rs2, reg_rd       cas little-endian
casx   [reg_rs1], reg_rs2, reg_rd  casxa  [reg_rs1] ASI_P, reg_rs2, reg_rd         cas extended
casxl  [reg_rs1], reg_rs2, reg_rd  casxa  [reg_rs1] ASI_P_L, reg_rs2, reg_rd       cas little-endian, extended
clrx   [address]                   stx    %g0, [address]                           Clear extended word
clruw  reg_rs1, reg_rd             srl    reg_rs1, %g0, reg_rd                     Copy and clear upper word
clruw  reg_rd                      srl    reg_rd, %g0, reg_rd                      Clear upper word
iprefetch  label                   bn,pt  %xcc, label                              Instruction prefetch
mov    %y, reg_rd                  rd     %y, reg_rd
mov    %asrn, reg_rd               rd     %asrn, reg_rd
mov    reg_or_imm, %asrn           wr     %g0, reg_or_imm, %asrn
ret                                jmpl   %i7+8, %g0                               Return from subroutine
retl                               jmpl   %o7+8, %g0                               Return from leaf subroutine
setuw  value, reg_rd               sethi  %hi(value), reg_rd                       (value & 0x3FF)==0
                                   or     %g0, value, reg_rd                       when 0 ≤ value ≤ 4095
                                   sethi  %hi(value), reg_rd;                      (otherwise)
                                   or     reg_rd, %lo(value), reg_rd
                                                                                   Do not use setuw in a DCTI delay slot.
setsw  value, reg_rd               sethi  %hi(value), reg_rd                       value>=0 and (value & 0x3FF)==0
                                   or     %g0, value, reg_rd                       -4096 ≤ value ≤ 4095
                                   sethi  %hi(value), reg_rd;                      if (value<0) and ((value & 0x3FF)==0)
                                   sra    reg_rd, %g0, reg_rd
                                   sethi  %hi(value), reg_rd;                      (otherwise, if value>=0)
                                   or     reg_rd, %lo(value), reg_rd
                                   sethi  %hi(value), reg_rd;                      (otherwise, if value<0)
                                   or     reg_rd, %lo(value), reg_rd;
                                   sra    reg_rd, %g0, reg_rd
                                                                                   Do not use setsw in a CTI delay slot.
signx  reg_rs1, reg_rd             sra    reg_rs1, %g0, reg_rd                     Sign-extend 32-bit value to 64 bits
signx  reg_rd                      sra    reg_rd, %g0, reg_rd
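
The case analysis for setuw in Table E-12 can be sketched in Python as follows. This is an illustrative model only, not the assembler's implementation: the function name setuw_expansion and the default destination register %o0 are assumptions made for the example, and the cases are tested in the order in which the table lists them.

```python
def setuw_expansion(value, rd="%o0"):
    """Return the hardware instructions a setuw of `value` into `rd` maps to.

    Hypothetical helper following the three setuw cases of Table E-12.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("setuw takes an unsigned 32-bit value")
    if (value & 0x3FF) == 0:
        # Low 10 bits are zero, so sethi alone builds the whole value.
        return [f"sethi %hi({value:#x}), {rd}"]
    if value <= 4095:
        # Small values are formed with a single or against %g0.
        return [f"or %g0, {value:#x}, {rd}"]
    # Otherwise: upper 22 bits with sethi, low 10 bits with or.
    return [
        f"sethi %hi({value:#x}), {rd}",
        f"or {rd}, %lo({value:#x}), {rd}",
    ]


if __name__ == "__main__":
    for v in (0x12345400, 100, 0x12345678):
        print(f"setuw {v:#x}, %o0 ->", "; ".join(setuw_expansion(v)))
```

Because the expansion can be two instructions long, the sketch also makes the delay-slot restriction in the table easy to see: a delay slot holds only one instruction.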

SPARC-V9 Instruction Set Extensions

This section describes extensions to the SPARC-V9 instruction set. The extensions support enhanced graphics functionality and improved memory access efficiency.

Note - SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.

Graphics Data Formats

The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.

Eight-bit Format

A 32-bit word contains pixels of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band interleaved images (store color components of a point), and band sequential images (store all values of one color component).

Fixed Data Formats

A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.
A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.
Enough precision and dynamic range (for filtering and simple image computations on pixel values) can be provided by an intermediate format of fixed data values. Pixel multiplication is used to convert from pixel data to fixed data. Pack instructions are used to convert from fixed data to pixel data (clip and truncate to an 8-bit unsigned value). The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding bit position. You should use floating-point data to perform complex calculations needing more precision or dynamic range.
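
A minimal Python sketch of the fixed 16-bit data format and of the clipping step that a pack performs is given below. The names pack_fixed16, unpack_fixed16, and clip_to_pixel are hypothetical. Placing the first value in the most significant 16 bits is an assumption made for the example, and the scaling that a real pack instruction applies before clipping is not modeled.

```python
def pack_fixed16(values):
    """Pack four signed 16-bit values into one 64-bit word.

    Assumption for this sketch: values[0] lands in the most significant field.
    """
    if len(values) != 4:
        raise ValueError("the fixed 16-bit format holds exactly four values")
    word = 0
    for v in values:
        if not -0x8000 <= v <= 0x7FFF:
            raise ValueError("value does not fit in 16 signed bits")
        word = (word << 16) | (v & 0xFFFF)
    return word


def unpack_fixed16(word):
    """Split a 64-bit word into four signed 16-bit values (inverse of pack)."""
    out = []
    for shift in (48, 32, 16, 0):
        field = (word >> shift) & 0xFFFF
        out.append(field - 0x10000 if field & 0x8000 else field)
    return out


def clip_to_pixel(value):
    """Clip an integer to the unsigned 8-bit pixel range 0..255."""
    return max(0, min(255, value))


if __name__ == "__main__":
    word = pack_fixed16([1, -1, 32767, -32768])
    print(f"{word:#018x}", unpack_fixed16(word))
    print([clip_to_pixel(v) for v in (-5, 128, 300)])
```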

SHUTDOWN Instruction

All outstanding transactions are completed before the SHUTDOWN instruction completes.
Table E-13
SPARC     Mnemonic  Argument List  Description
SHUTDOWN  shutdown                 shutdown to enter power down mode

Graphics Status Register (GSR)

Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics Status Register.
Table E-14
SPARC  Mnemonic  Argument List              Description
RDASR  rdasr     %gsr, reg_rd               read GSR
WRASR  wrasr     reg_rs1, reg_or_imm, %gsr  write GSR

Graphics Instructions

Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.
The graphics instruction set is mapped into the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.
Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values of the source operands.
Table E-15
SPARC     Mnemonic  Argument List                 Description
FPADD16   fpadd16   freg_rs1, freg_rs2, freg_rd   four 16-bit add
FPADD16S  fpadd16s  freg_rs1, freg_rs2, freg_rd   two 16-bit add
FPADD32   fpadd32   freg_rs1, freg_rs2, freg_rd   two 32-bit add
FPADD32S  fpadd32s  freg_rs1, freg_rs2, freg_rd   one 32-bit add
FPSUB16   fpsub16   freg_rs1, freg_rs2, freg_rd   four 16-bit subtract
FPSUB16S  fpsub16s  freg_rs1, freg_rs2, freg_rd   two 16-bit subtract
FPSUB32   fpsub32   freg_rs1, freg_rs2, freg_rd   two 32-bit subtract
FPSUB32S  fpsub32s  freg_rs1, freg_rs2, freg_rd   one 32-bit subtract
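
The idea of a partitioned operation can be sketched in Python: each field is added or subtracted independently, so no carry or borrow crosses a field boundary. The function names below are hypothetical, and wrapping each field on overflow (rather than saturating) is an assumption of this sketch.

```python
def partitioned_add(src1, src2, lane_bits=16, total_bits=64):
    """Add corresponding lane_bits-wide fields of two total_bits-wide values.

    Assumption for this sketch: each field wraps around on overflow.
    """
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane = ((src1 >> shift) + (src2 >> shift)) & lane_mask
        result |= lane << shift
    return result


def partitioned_sub(src1, src2, lane_bits=16, total_bits=64):
    """Subtract corresponding fields; each field wraps around independently."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane = ((src1 >> shift) - (src2 >> shift)) & lane_mask
        result |= lane << shift
    return result


if __name__ == "__main__":
    a = 0x0001FFFF00030004
    b = 0x0001000100010001
    print(f"four 16-bit adds: {partitioned_add(a, b):#018x}")
    print(f"two 32-bit adds:  {partitioned_add(a, b, lane_bits=32):#018x}")
```

In the 16-bit case the field holding 0xFFFF wraps to 0 without disturbing its neighbor, whereas in the 32-bit case the same bits carry into the upper half of their field.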
Pack instructions convert to a lower-precision fixed or pixel format.
Table E-16
SPARC     Mnemonic  Argument List                 Description
FPACK16   fpack16   freg_rs2, freg_rd             four 16-bit packs
FPACK32   fpack32   freg_rs1, freg_rs2, freg_rd   two 32-bit packs
FPACKFIX  fpackfix  freg_rs2, freg_rd             four 16-bit packs
FEXPAND   fexpand   freg_rs2, freg_rd             four 16-bit expands
FPMERGE   fpmerge   freg_rs1, freg_rs2, freg_rd   two 32-bit merges
Partitioned multiply instructions have the following variations.
Table E-17
SPARC        Mnemonic     Argument List                 Description
FMUL8x16     fmul8x16     freg_rs1, freg_rs2, freg_rd   8x16-bit partition
FMUL8x16AU   fmul8x16au   freg_rs1, freg_rs2, freg_rd   8x16-bit upper α partition
FMUL8x16AL   fmul8x16al   freg_rs1, freg_rs2, freg_rd   8x16-bit lower α partition
FMUL8SUx16   fmul8sux16   freg_rs1, freg_rs2, freg_rd   upper 8x16-bit partition
FMUL8ULx16   fmul8ulx16   freg_rs1, freg_rs2, freg_rd   lower unsigned 8x16-bit partition
FMULD8SUx16  fmuld8sux16  freg_rs1, freg_rs2, freg_rd   upper 8x16-bit partition
FMULD8ULx16  fmuld8ulx16  freg_rs1, freg_rs2, freg_rd   lower unsigned 8x16-bit partition
Alignment instructions have the following variations.
Table E-18
SPARC                Mnemonic    Argument List                 Description
ALIGNADDRESS         alignaddr   reg_rs1, reg_rs2, reg_rd      find misaligned data access address
ALIGNADDRESS_LITTLE  alignaddrl  reg_rs1, reg_rs2, reg_rd      same as above, but little-endian
FALIGNDATA           faligndata  freg_rs1, freg_rs2, freg_rd   do misaligned data, data alignment
Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).
Table E-19
SPARC      Mnemonic   Argument List                 Description
FZERO      fzero      freg_rd                       zero fill
FZEROS     fzeros     freg_rd                       zero fill, single precision
FONE       fone       freg_rd                       one fill
FONES      fones      freg_rd                       one fill, single precision
FSRC1      fsrc1      freg_rs1, freg_rd             copy src1
FSRC1S     fsrc1s     freg_rs1, freg_rd             copy src1, single precision
FSRC2      fsrc2      freg_rs2, freg_rd             copy src2
FSRC2S     fsrc2s     freg_rs2, freg_rd             copy src2, single precision
FNOT1      fnot1      freg_rs1, freg_rd             negate src1, 1's complement
FNOT1S     fnot1s     freg_rs1, freg_rd             same as above, single precision
FNOT2      fnot2      freg_rs2, freg_rd             negate src2, 1's complement
FNOT2S     fnot2s     freg_rs2, freg_rd             same as above, single precision
FOR        for        freg_rs1, freg_rs2, freg_rd   logical OR
FORS       fors       freg_rs1, freg_rs2, freg_rd   logical OR, single precision
FNOR       fnor       freg_rs1, freg_rs2, freg_rd   logical NOR
FNORS      fnors      freg_rs1, freg_rs2, freg_rd   logical NOR, single precision
FAND       fand       freg_rs1, freg_rs2, freg_rd   logical AND
FANDS      fands      freg_rs1, freg_rs2, freg_rd   logical AND, single precision
FNAND      fnand      freg_rs1, freg_rs2, freg_rd   logical NAND
FNANDS     fnands     freg_rs1, freg_rs2, freg_rd   logical NAND, single precision
FXOR       fxor       freg_rs1, freg_rs2, freg_rd   logical XOR
FXORS      fxors      freg_rs1, freg_rs2, freg_rd   logical XOR, single precision
FXNOR      fxnor      freg_rs1, freg_rs2, freg_rd   logical XNOR
FXNORS     fxnors     freg_rs1, freg_rs2, freg_rd   logical XNOR, single precision
FORNOT1    fornot1    freg_rs1, freg_rs2, freg_rd   negated src1 OR src2
FORNOT1S   fornot1s   freg_rs1, freg_rs2, freg_rd   same as above, single precision
FORNOT2    fornot2    freg_rs1, freg_rs2, freg_rd   src1 OR negated src2
FORNOT2S   fornot2s   freg_rs1, freg_rs2, freg_rd   same as above, single precision
FANDNOT1   fandnot1   freg_rs1, freg_rs2, freg_rd   negated src1 AND src2
FANDNOT1S  fandnot1s  freg_rs1, freg_rs2, freg_rd   same as above, single precision
FANDNOT2   fandnot2   freg_rs1, freg_rs2, freg_rd   src1 AND negated src2
FANDNOT2S  fandnot2s  freg_rs1, freg_rs2, freg_rd   same as above, single precision
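
The sixteen 64-bit operations in Table E-19 can be written out as a small Python table, following the Description column row by row. The dictionary name LOGICAL_OPS and the helper logical_op are hypothetical; only the standard 64-bit forms are modeled, and the single-precision variants would use a 32-bit mask instead.

```python
MASK64 = (1 << 64) - 1

# Each entry takes (src1, src2) and returns the 64-bit result.
LOGICAL_OPS = {
    "fzero":    lambda a, b: 0,
    "fone":     lambda a, b: MASK64,
    "fsrc1":    lambda a, b: a & MASK64,
    "fsrc2":    lambda a, b: b & MASK64,
    "fnot1":    lambda a, b: ~a & MASK64,
    "fnot2":    lambda a, b: ~b & MASK64,
    "for":      lambda a, b: (a | b) & MASK64,
    "fnor":     lambda a, b: ~(a | b) & MASK64,
    "fand":     lambda a, b: (a & b) & MASK64,
    "fnand":    lambda a, b: ~(a & b) & MASK64,
    "fxor":     lambda a, b: (a ^ b) & MASK64,
    "fxnor":    lambda a, b: ~(a ^ b) & MASK64,
    "fornot1":  lambda a, b: (~a | b) & MASK64,   # negated src1 OR src2
    "fornot2":  lambda a, b: (a | ~b) & MASK64,   # src1 OR negated src2
    "fandnot1": lambda a, b: (~a & b) & MASK64,   # negated src1 AND src2
    "fandnot2": lambda a, b: (a & ~b) & MASK64,   # src1 AND negated src2
}


def logical_op(name, src1, src2=0):
    """Apply one of the sixteen operations by mnemonic."""
    return LOGICAL_OPS[name](src1, src2)


if __name__ == "__main__":
    print(hex(logical_op("fandnot1", 0xF0, 0xFF)))
    print(hex(logical_op("fxnor", 0x0F, 0x0F)))
```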
Pixel compare instructions compare fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit).
Table E-20
SPARC     Mnemonic  Argument List                Description
FCMPGT16  fcmpgt16  freg_rs1, freg_rs2, reg_rd   4 16-bit compare, set rd if src1>src2
FCMPGT32  fcmpgt32  freg_rs1, freg_rs2, reg_rd   2 32-bit compare, set rd if src1>src2
FCMPLE16  fcmple16  freg_rs1, freg_rs2, reg_rd   4 16-bit compare, set rd if src1≤src2
FCMPLE32  fcmple32  freg_rs1, freg_rs2, reg_rd   2 32-bit compare, set rd if src1≤src2
FCMPNE16  fcmpne16  freg_rs1, freg_rs2, reg_rd   4 16-bit compare, set rd if src1≠src2
FCMPNE32  fcmpne32  freg_rs1, freg_rs2, reg_rd   2 32-bit compare, set rd if src1≠src2
FCMPEQ16  fcmpeq16  freg_rs1, freg_rs2, reg_rd   4 16-bit compare, set rd if src1=src2
FCMPEQ32  fcmpeq32  freg_rs1, freg_rs2, reg_rd   2 32-bit compare, set rd if src1=src2
Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.
Table E-21
SPARC    Mnemonic  Argument List              Description
EDGE8    edge8     reg_rs1, reg_rs2, reg_rd   8 8-bit edge boundary processing
EDGE8L   edge8l    reg_rs1, reg_rs2, reg_rd   same as above, little-endian
EDGE16   edge16    reg_rs1, reg_rs2, reg_rd   4 16-bit edge boundary processing
EDGE16L  edge16l   reg_rs1, reg_rs2, reg_rd   same as above, little-endian
EDGE32   edge32    reg_rs1, reg_rs2, reg_rd   2 32-bit edge boundary processing
EDGE32L  edge32l   reg_rs1, reg_rs2, reg_rd   same as above, little-endian
Pixel component distance instructions are used for motion estimation in video compression algorithms.
SPARC-V9 Pixel Component Distance Instruction
SPARC  Mnemonic  Argument List                 Description
PDIST  pdist     freg_rs1, freg_rs2, freg_rd   8 8-bit components, distance between

The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.

Table E-22
SPARC    Mnemonic  Argument List              Description
ARRAY8   array8    reg_rs1, reg_rs2, reg_rd   convert 8-bit 3-D address to blocked byte address
ARRAY16  array16   reg_rs1, reg_rs2, reg_rd   same as above, but 16-bit
ARRAY32  array32   reg_rs1, reg_rs2, reg_rd   same as above, but 32-bit

Memory Access Instructions

These memory access instructions are part of the SPARC-V9 instruction set extensions.
Table E-23
SPARC  imm_asi       Description
Argument List:
  stda freg_rd, [reg_rs1] reg_mask, imm_asi
eight 8-bit conditional stores to:
STDFA  ASI_PST8_P    primary address space
STDFA  ASI_PST8_S    secondary address space
STDFA  ASI_PST8_PL   primary address space, little endian
STDFA  ASI_PST8_SL   secondary address space, little endian
four 16-bit conditional stores to:
STDFA  ASI_PST16_P   primary address space
STDFA  ASI_PST16_S   secondary address space
STDFA  ASI_PST16_PL  primary address space, little endian
STDFA  ASI_PST16_SL  secondary address space, little endian
two 32-bit conditional stores to:
STDFA  ASI_PST32_P   primary address space
STDFA  ASI_PST32_S   secondary address space
STDFA  ASI_PST32_PL  primary address space, little endian
STDFA  ASI_PST32_SL  secondary address space, little endian

Note - To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.

Table E-24
SPARC         imm_asi      Description
Argument List:
  ldda [reg_addr] imm_asi, freg_rd
  stda freg_rd, [reg_addr] imm_asi
  ldda [reg_plus_imm] %asi, freg_rd
  stda freg_rd, [reg_plus_imm] %asi
8-bit load/store from/to:
LDDFA, STDFA  ASI_FL8_P    primary address space
LDDFA, STDFA  ASI_FL8_S    secondary address space
LDDFA, STDFA  ASI_FL8_PL   primary address space, little endian
LDDFA, STDFA  ASI_FL8_SL   secondary address space, little endian
16-bit load/store from/to:
LDDFA, STDFA  ASI_FL16_P   primary address space
LDDFA, STDFA  ASI_FL16_S   secondary address space
LDDFA, STDFA  ASI_FL16_PL  primary address space, little endian
LDDFA, STDFA  ASI_FL16_SL  secondary address space, little endian

Note - To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.

Table E-25
SPARC  imm_asi                  Description
Argument List:
  [reg_addr] imm_asi, reg_rd
  [reg_plus_imm] %asi, reg_rd
LDDA   ASI_NUCLEUS_QUAD_LDD     128-bit atomic load
LDDA   ASI_NUCLEUS_QUAD_LDD_L   128-bit atomic load, little endian
Table E-26
SPARC         imm_asi           Description
Argument List:
  ldda [reg_addr] imm_asi, freg_rd
  stda freg_rd, [reg_addr] imm_asi
  ldda [reg_plus_imm] %asi, freg_rd
  stda freg_rd, [reg_plus_imm] %asi
64-byte block load/store from/to:
LDDFA, STDFA  ASI_BLK_AIUP      primary address space, user privilege
LDDFA, STDFA  ASI_BLK_AIUS      secondary address space, user privilege
LDDFA, STDFA  ASI_BLK_AIUPL     primary address space, user privilege, little endian
LDDFA, STDFA  ASI_BLK_AIUSL     secondary address space, user privilege, little endian
LDDFA, STDFA  ASI_BLK_P         primary address space
LDDFA, STDFA  ASI_BLK_S         secondary address space
LDDFA, STDFA  ASI_BLK_PL        primary address space, little endian
LDDFA, STDFA  ASI_BLK_SL        secondary address space, little endian
LDDFA, STDFA  ASI_BLK_COMMIT_P  64-byte block commit store to primary address space
LDDFA, STDFA  ASI_BLK_COMMIT_S  64-byte block commit store to secondary address space

Note - To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.