SPARC Assembly Language Reference Manual

Appendix E SPARC-V9 Instruction Set

This appendix describes changes made to the SPARC instruction set due to the SPARC-V9 architecture. Application software for the 32-bit SPARC-V8 (Version 8) architecture can execute, unchanged, on SPARC-V9 systems.

This appendix is organized into the following sections:

E.1 SPARC-V9 Changes

The SPARC-V9 architecture differs from SPARC-V8 architecture in the following areas, expanded below: registers, alternate space access, byte order, and instruction set.

E.1.1 Registers

These registers have been deleted:

Table E–1

| Register | Description |
|---|---|
| PSR | Processor State Register |
| TBR | Trap Base Register |
| WIM | Window Invalid Mask |

These registers have been widened from 32 to 64 bits:

Table E–2

Integer registers

All state registers

FSR, PC, nPC, and Y


Note –

In the Floating-Point State Register (FSR), the additional floating-point condition code fields fcc1, fcc2, and fcc3 are added, and the register is widened to 64 bits.


These SPARC-V9 registers were fields within SPARC-V8 registers:

Table E–3

| Register | Description |
|---|---|
| CCR | Condition Codes Register |
| CWP | Current Window Pointer |
| PIL | Processor Interrupt Level |
| TBA | Trap Base Address |
| TT[MAXTL] | Trap Type |
| VER | Version |

These registers have been added:

Table E–4

| Register | Description |
|---|---|
| ASI | Address Space Identifier |
| CANRESTORE | Restorable Windows |
| CANSAVE | Savable Windows |
| CLEANWIN | Clean Windows |
| FPRS | Floating-point Register State |
| OTHERWIN | Other Windows |
| PSTATE | Processor State |
| TICK | Hardware clock tick-counter |
| TL | Trap Level |
| TNPC[MAXTL] | Trap Next Program Counter |
| TPC[MAXTL] | Trap Program Counter |
| TSTATE[MAXTL] | Trap State |
| WSTATE | Windows State |

Also, there are sixteen additional double-precision floating-point registers, f[32] .. f[62]. These registers overlap (and are aliased with) eight additional quad-precision floating-point registers, f[32] .. f[60].
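The register counts above follow from the even-numbered naming of the upper registers. The following Python sketch is only an illustration of that numbering; the helper name `doubles_in_quad` is hypothetical and not part of any SPARC tool.

```python
# Upper floating-point registers are named by even index.
double_regs = list(range(32, 64, 2))  # f[32], f[34], ..., f[62]
quad_regs = list(range(32, 64, 4))    # f[32], f[36], ..., f[60]

def doubles_in_quad(q):
    """Return the two double-precision registers aliased by quad register f[q]."""
    if q not in quad_regs:
        raise ValueError("not an upper quad-precision register index")
    return [q, q + 2]

print(len(double_regs), "double registers:", double_regs)
print(len(quad_regs), "quad registers:", quad_regs)
print("f[60] (quad) overlaps doubles", doubles_in_quad(60))
```

Every double register in the list belongs to exactly one quad register, which is why sixteen doubles alias eight quads.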

The SPARC-V9 CWP register is decremented during a RESTORE instruction and incremented during a SAVE instruction. This is the opposite of PSR.CWP's behavior in SPARC-V8. This change has no effect on nonprivileged instructions.

E.1.2 Alternate Space Access

Load- and store-alternate instructions to one-half of the alternate spaces can now be included in user code. In SPARC-V9, loads and stores to ASIs 0x00 .. 0x7F are privileged; those to ASIs 0x80 .. 0xFF are nonprivileged. In SPARC-V8, all access to alternate address spaces is privileged.
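The privilege split can be expressed as a single range test. This is a minimal sketch: the function name `asi_is_privileged` is hypothetical, and the two constants are the ASI_PRIMARY and ASI_PRIMARY_LITTLE values as defined in the SPARC-V9 specification.

```python
def asi_is_privileged(asi):
    """Return True if a SPARC-V9 load/store alternate to this ASI is privileged."""
    if not 0x00 <= asi <= 0xFF:
        raise ValueError("ASI must be an 8-bit value")
    return asi <= 0x7F

# Values taken from the SPARC-V9 specification (not listed in this appendix).
ASI_PRIMARY = 0x80
ASI_PRIMARY_LITTLE = 0x88

print(asi_is_privileged(0x04))
print(asi_is_privileged(ASI_PRIMARY))
```

Under SPARC-V8 rules the same check would return True for every ASI.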

E.1.3 Byte Order

SPARC-V9 supports both little- and big-endian byte orders for data accesses only; instruction accesses are always performed using big-endian byte order. In SPARC-V8, all data and instruction accesses are performed in big-endian byte order.
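As an illustration of the two data byte orders, the following Python sketch lays out one 32-bit word in memory both ways. It only models byte placement; it does not show how a SPARC-V9 program selects an order.

```python
import struct

word = 0x11223344

# Big-endian: most significant byte at the lowest address.
big = struct.pack(">I", word)
# Little-endian: least significant byte at the lowest address.
little = struct.pack("<I", word)

print(big.hex())
print(little.hex())

# Reading big-endian bytes with the wrong byte order swaps the bytes.
misread = struct.unpack("<I", big)[0]
print(hex(misread))
```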

E.2 SPARC-V9 Instruction Set Changes

Application software written for the SPARC-V8 processor runs unchanged on a SPARC-V9 processor.

E.2.1 Extended Instruction Definitions to Support the 64-bit Model

Table E–5

| Instruction | Change |
|---|---|
| FCMP, FCMPE | Floating-point compare: can set any of the four floating-point condition codes |
| LDFSR, STFSR | Load/store FSR: only affect the low-order 32 bits of FSR |
| LDUW, LDUWA | Same as LD, LDA in SPARC-V8 |
| RDASR/WRASR | Read/write state registers: access additional registers |
| SAVE/RESTORE | |
| SETHI | |
| SRA, SRL, SLL (shifts) | Split into 32-bit and 64-bit versions |
| Tcc | (was Ticc) Operates with either the 32-bit integer condition codes (icc) or the 64-bit integer condition codes (xcc) |

All other arithmetic operations operate on 64-bit operands and produce 64-bit results.
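The split of the shift instructions into 32-bit and 64-bit versions can be modeled as follows. This is a sketch of the semantics as described in the SPARC-V9 architecture manual (a 5-bit shift count for the 32-bit forms, a 6-bit count for the extended forms); the Python function names simply mirror the mnemonics, and `_signed` is a hypothetical helper.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def _signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def sll(x, n):   # shift left, 5-bit count, all 64 bits shifted
    return ((x & MASK64) << (n & 31)) & MASK64

def sllx(x, n):  # shift left, 6-bit count
    return ((x & MASK64) << (n & 63)) & MASK64

def srl(x, n):   # logical right shift of the low 32 bits, upper 32 bits zeroed
    return (x & MASK32) >> (n & 31)

def srlx(x, n):  # logical right shift of all 64 bits
    return (x & MASK64) >> (n & 63)

def sra(x, n):   # arithmetic right shift of the low 32 bits, sign bit is bit 31
    return (_signed(x, 32) >> (n & 31)) & MASK64

def srax(x, n):  # arithmetic right shift of all 64 bits, sign bit is bit 63
    return (_signed(x, 64) >> (n & 63)) & MASK64

print(hex(srl(0xFFFFFFFF80000000, 0)))
print(hex(sra(0x80000000, 0)))
```

A shift count of zero is still useful with the 32-bit forms: `srl` clears the upper word and `sra` sign-extends the lower word.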

E.2.2 Added Instructions to Support 64 bits

Table E–6

| Instruction | Description |
|---|---|
| F[sdq]TOx | Convert floating point to 64-bit word |
| FxTO[sdq] | Convert 64-bit word to floating point |
| FMOV[dq] | Floating-point move, double and quad |
| FNEG[dq] | Floating-point negate, double and quad |
| FABS[dq] | Floating-point absolute value, double and quad |
| LDDFA, STDFA, LDFA, STFA | Alternate address space forms of LDDF, STDF, LDF, and STF |
| LDSW | Load a signed word |
| LDSWA | Load a signed word from an alternate space |
| LDX | Load an extended word |
| LDXA | Load an extended word from an alternate space |
| LDXFSR | Load all 64 bits of the FSR register |
| STX | Store an extended word |
| STXA | Store an extended word into an alternate space |
| STXFSR | Store all 64 bits of the FSR register |
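The difference between the word loads and the extended load can be sketched in Python. The functions below model only the value that reaches a 64-bit register from big-endian memory; the names mirror the mnemonics and `mem` is a hypothetical byte buffer, not a real API.

```python
MASK64 = (1 << 64) - 1

def lduw(mem, addr):
    """Load an unsigned word: 32 bits, zero-extended to 64 bits."""
    return int.from_bytes(mem[addr:addr + 4], "big", signed=False)

def ldsw(mem, addr):
    """Load a signed word: 32 bits, sign-extended to 64 bits."""
    return int.from_bytes(mem[addr:addr + 4], "big", signed=True) & MASK64

def ldx(mem, addr):
    """Load an extended word: all 64 bits."""
    return int.from_bytes(mem[addr:addr + 8], "big", signed=False)

mem = bytes.fromhex("fffffffe00000001")
print(hex(lduw(mem, 0)))
print(hex(ldsw(mem, 0)))
print(hex(ldx(mem, 0)))
```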

E.2.3 Added Instructions to Support High-Performance System Implementation

Table E–7

| Instruction | Description |
|---|---|
| BPcc | Branch on integer condition code with prediction |
| BPr | Branch on integer register contents with prediction |
| CASA, CASXA | Compare and Swap from an alternate space |
| FBPfcc | Branch on floating-point condition code with prediction |
| FLUSHW | Flush windows |
| FMOVcc | Move floating-point register if condition code is satisfied |
| FMOVr | Move floating-point register if integer register satisfies condition |
| LDQF(A), STQF(A) | Load/Store Quad Floating-point (in an alternate space) |
| MOVcc | Move integer register if condition code is satisfied |
| MOVr | Move integer register if register contents satisfy condition |
| MULX | Generic 64-bit multiply |
| POPC | Population count |
| PREFETCH, PREFETCHA | Prefetch Data |
| SDIVX, UDIVX | Signed and Unsigned 64-bit divide |
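BPcc, MOVcc, and FMOVcc test either the 32-bit (icc) or the 64-bit (xcc) integer condition codes. The following Python sketch models how the two sets of flags can differ after the same subtraction and evaluates the predicates listed in the Comments column of Table E–10. The helper names `flags_after_sub` and `taken` are hypothetical, and the flag computation assumes the usual two's-complement definitions of N, Z, V, and C for a subtract.

```python
def flags_after_sub(a, b, bits):
    """Condition-code flags for the subtraction a - b at the given width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    r = (a - b) & mask
    return {
        "N": bool(r & sign),                   # result is negative
        "Z": r == 0,                           # result is zero
        "V": bool((a ^ b) & (a ^ r) & sign),   # signed overflow
        "C": a < b,                            # borrow out
    }

# Predicates from the Comments column of Table E-10 (xor written as !=).
CONDITIONS = {
    "a": lambda f: True,
    "n": lambda f: False,
    "ne": lambda f: not f["Z"],
    "e": lambda f: f["Z"],
    "g": lambda f: not (f["Z"] or (f["N"] != f["V"])),
    "le": lambda f: f["Z"] or (f["N"] != f["V"]),
    "ge": lambda f: not (f["N"] != f["V"]),
    "l": lambda f: f["N"] != f["V"],
    "gu": lambda f: not (f["C"] or f["Z"]),
    "leu": lambda f: f["C"] or f["Z"],
    "cc": lambda f: not f["C"],
    "cs": lambda f: f["C"],
    "pos": lambda f: not f["N"],
    "neg": lambda f: f["N"],
    "vc": lambda f: not f["V"],
    "vs": lambda f: f["V"],
}

def taken(cond, a, b, cc="xcc"):
    """Would a branch on `cond` be taken after comparing a with b?"""
    bits = 32 if cc == "icc" else 64
    return CONDITIONS[cond](flags_after_sub(a, b, bits))

print(taken("e", 1 << 32, 0, "icc"), taken("e", 1 << 32, 0, "xcc"))
```

The value 2^32 compares equal to zero under icc because only the low 32 bits are considered, but not under xcc.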

E.2.4 Deleted Instructions

Table E–8

| Instruction | Comment |
|---|---|
| Coprocessor loads and stores | |
| RDTBR and WRTBR | TBR no longer exists. It is replaced by TBA, which can be read/written with RDPR/WRPR instructions |
| RDWIM and WRWIM | WIM no longer exists. WIM has been replaced by several register-window registers |
| RDPSR and WRPSR | PSR no longer exists. It has been replaced by several separate registers that are read/written with other instructions |
| RETT | Return from trap (replaced by DONE/RETRY) |
| STDFQ | Store Double from Floating-point Queue (replaced by the RDPR FQ instruction) |

E.2.5 Miscellaneous Instruction Changes

Table E–9

| Instruction | Change |
|---|---|
| IMPDEPn | (Changed) Implementation-dependent instructions (replace SPARC-V8 CPop instructions) |
| MEMBAR | (Added) Memory barrier (memory synchronization support) |

E.3 SPARC-V9 Instruction Set Mapping

Table E–10

(Branch on cc with prediction)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| BPA | ba{,a}{,pt\|,pn} | %icc or %xcc, label | Branch always | 1 |
| BPN | bn{,a}{,pt\|,pn} | %icc or %xcc, label | Branch never | 0 |
| BPNE | bne{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on not equal | not Z |
| BPE | be{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on equal | Z |
| BPG | bg{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on greater | not (Z or (N xor V)) |
| BPLE | ble{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on less or equal | Z or (N xor V) |
| BPGE | bge{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on greater or equal | not (N xor V) |
| BPL | bl{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on less | N xor V |
| BPGU | bgu{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on greater unsigned | not (C or Z) |
| BPLEU | bleu{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on less or equal unsigned | C or Z |
| BPCC | bcc{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on carry clear (greater than or equal, unsigned) | not C |
| BPCS | bcs{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on carry set (less than, unsigned) | C |
| BPPOS | bpos{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on positive | not N |
| BPNEG | bneg{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on negative | N |
| BPVC | bvc{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on overflow clear | not V |
| BPVS | bvs{,a}{,pt\|,pn} | %icc or %xcc, label | Branch on overflow set | V |

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| BRZ | brz{,a}{,pt\|,pn} | regrs1, label | Branch on register zero | Z |
| BRLEZ | brlez{,a}{,pt\|,pn} | regrs1, label | Branch on register less than or equal to zero | N or Z |
| BRLZ | brlz{,a}{,pt\|,pn} | regrs1, label | Branch on register less than zero | N |
| BRNZ | brnz{,a}{,pt\|,pn} | regrs1, label | Branch on register not zero | not Z |
| BRGZ | brgz{,a}{,pt\|,pn} | regrs1, label | Branch on register greater than zero | not (N or Z) |
| BRGEZ | brgez{,a}{,pt\|,pn} | regrs1, label | Branch on register greater than or equal to zero | not N |

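| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| CASA | casa | [regrs1]imm_asi, regrs2, regrd | Compare and swap word from alternate space | |
| | casa | [regrs1]%asi, regrs2, regrd | | |
| CASXA | casxa | [regrs1]imm_asi, regrs2, regrd | Compare and swap extended from alternate space | |
| | casxa | [regrs1]%asi, regrs2, regrd | | |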

(Branch on cc with prediction)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| FBPA | fba{,a}{,pt\|,pn} | %fccn, label | Branch always | 1 |
| FBPN | fbn{,a}{,pt\|,pn} | %fccn, label | Branch never | 0 |
| FBPU | fbu{,a}{,pt\|,pn} | %fccn, label | Branch on unordered | U |
| FBPG | fbg{,a}{,pt\|,pn} | %fccn, label | Branch on greater | G |
| FBPUG | fbug{,a}{,pt\|,pn} | %fccn, label | Branch on unordered or greater | G or U |
| FBPL | fbl{,a}{,pt\|,pn} | %fccn, label | Branch on less | L |
| FBPUL | fbul{,a}{,pt\|,pn} | %fccn, label | Branch on unordered or less | L or U |
| FBPLG | fblg{,a}{,pt\|,pn} | %fccn, label | Branch on less or greater | L or G |
| FBPNE | fbne{,a}{,pt\|,pn} | %fccn, label | Branch on not equal | L or G or U |
| FBPE | fbe{,a}{,pt\|,pn} | %fccn, label | Branch on equal | E |
| FBPUE | fbue{,a}{,pt\|,pn} | %fccn, label | Branch on unordered or equal | E or U |
| FBPGE | fbge{,a}{,pt\|,pn} | %fccn, label | Branch on greater or equal | E or G |
| FBPUGE | fbuge{,a}{,pt\|,pn} | %fccn, label | Branch on unordered or greater or equal | E or G or U |
| FBPLE | fble{,a}{,pt\|,pn} | %fccn, label | Branch on less or equal | E or L |
| FBPULE | fbule{,a}{,pt\|,pn} | %fccn, label | Branch on unordered or less or equal | E or L or U |
| FBPO | fbo{,a}{,pt\|,pn} | %fccn, label | Branch on ordered | E or L or G |

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| FLUSHW | flushw | | Flush register windows | |

(Move on integer cc)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| FMOVA | fmov{s,d,q}a | %icc or %xcc, fregrs2, fregrd | Move always | 1 |
| FMOVN | fmov{s,d,q}n | %icc or %xcc, fregrs2, fregrd | Move never | 0 |
| FMOVNE | fmov{s,d,q}ne | %icc or %xcc, fregrs2, fregrd | Move if not equal | not Z |
| FMOVE | fmov{s,d,q}e | %icc or %xcc, fregrs2, fregrd | Move if equal | Z |
| FMOVG | fmov{s,d,q}g | %icc or %xcc, fregrs2, fregrd | Move if greater | not (Z or (N xor V)) |
| FMOVLE | fmov{s,d,q}le | %icc or %xcc, fregrs2, fregrd | Move if less or equal | Z or (N xor V) |
| FMOVGE | fmov{s,d,q}ge | %icc or %xcc, fregrs2, fregrd | Move if greater or equal | not (N xor V) |
| FMOVL | fmov{s,d,q}l | %icc or %xcc, fregrs2, fregrd | Move if less | N xor V |
| FMOVGU | fmov{s,d,q}gu | %icc or %xcc, fregrs2, fregrd | Move if greater unsigned | not (C or Z) |
| FMOVLEU | fmov{s,d,q}leu | %icc or %xcc, fregrs2, fregrd | Move if less or equal unsigned | C or Z |
| FMOVCC | fmov{s,d,q}cc | %icc or %xcc, fregrs2, fregrd | Move if carry clear (greater or equal, unsigned) | not C |
| FMOVCS | fmov{s,d,q}cs | %icc or %xcc, fregrs2, fregrd | Move if carry set (less than, unsigned) | C |
| FMOVPOS | fmov{s,d,q}pos | %icc or %xcc, fregrs2, fregrd | Move if positive | not N |
| FMOVNEG | fmov{s,d,q}neg | %icc or %xcc, fregrs2, fregrd | Move if negative | N |
| FMOVVC | fmov{s,d,q}vc | %icc or %xcc, fregrs2, fregrd | Move if overflow clear | not V |
| FMOVVS | fmov{s,d,q}vs | %icc or %xcc, fregrs2, fregrd | Move if overflow set | V |

(Move f-p register on cc)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| FMOVRZ | fmovr{s,d,q}e | regrs1, fregrs2, fregrd | Move if register zero | |
| FMOVRLEZ | fmovr{s,d,q}lez | regrs1, fregrs2, fregrd | Move if register less than or equal to zero | |
| FMOVRLZ | fmovr{s,d,q}lz | regrs1, fregrs2, fregrd | Move if register less than zero | |
| FMOVRNZ | fmovr{s,d,q}ne | regrs1, fregrs2, fregrd | Move if register not zero | |
| FMOVRGZ | fmovr{s,d,q}gz | regrs1, fregrs2, fregrd | Move if register greater than zero | |
| FMOVRGEZ | fmovr{s,d,q}gez | regrs1, fregrs2, fregrd | Move if register greater than or equal to zero | |

(Move on floating-point cc)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| FMOVFA | fmov{s,d,q}a | %fccn, fregrs2, fregrd | Move always | 1 |
| FMOVFN | fmov{s,d,q}n | %fccn, fregrs2, fregrd | Move never | 0 |
| FMOVFU | fmov{s,d,q}u | %fccn, fregrs2, fregrd | Move if unordered | U |
| FMOVFG | fmov{s,d,q}g | %fccn, fregrs2, fregrd | Move if greater | G |
| FMOVFUG | fmov{s,d,q}ug | %fccn, fregrs2, fregrd | Move if unordered or greater | G or U |
| FMOVFL | fmov{s,d,q}l | %fccn, fregrs2, fregrd | Move if less | L |
| FMOVFUL | fmov{s,d,q}ul | %fccn, fregrs2, fregrd | Move if unordered or less | L or U |
| FMOVFLG | fmov{s,d,q}lg | %fccn, fregrs2, fregrd | Move if less or greater | L or G |
| FMOVFNE | fmov{s,d,q}ne | %fccn, fregrs2, fregrd | Move if not equal | L or G or U |
| FMOVFE | fmov{s,d,q}e | %fccn, fregrs2, fregrd | Move if equal | E |
| FMOVFUE | fmov{s,d,q}ue | %fccn, fregrs2, fregrd | Move if unordered or equal | E or U |
| FMOVFGE | fmov{s,d,q}ge | %fccn, fregrs2, fregrd | Move if greater or equal | E or G |
| FMOVFUGE | fmov{s,d,q}uge | %fccn, fregrs2, fregrd | Move if unordered or greater or equal | E or G or U |
| FMOVFLE | fmov{s,d,q}le | %fccn, fregrs2, fregrd | Move if less or equal | E or L |
| FMOVFULE | fmov{s,d,q}ule | %fccn, fregrs2, fregrd | Move if unordered or less or equal | E or L or U |
| FMOVFO | fmov{s,d,q}o | %fccn, fregrs2, fregrd | Move if ordered | E or L or G |

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| LDSW | ldsw | [address], regrd | Load a signed word | |
| LDSWA | ldswa | [regaddr] imm_asi, regrd | Load signed word from alternate space | |
| LDX | ldx | [address], regrd | Load extended word | |
| LDXA | ldxa | [regaddr] imm_asi, regrd | Load extended word from alternate space | |
| | ldxa | [reg_plus_imm] %asi, regrd | | |
| LDXFSR | ldx | [address], %fsr | Load floating-point state register | |
| MEMBAR | membar | membar_mask | Memory barrier | |

(Move integer register on cc)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| MOVA | mova | %icc or %xcc, reg_or_imm11, regrd | Move always | 1 |
| MOVN | movn | %icc or %xcc, reg_or_imm11, regrd | Move never | 0 |
| MOVNE | movne | %icc or %xcc, reg_or_imm11, regrd | Move if not equal | not Z |
| MOVE | move | %icc or %xcc, reg_or_imm11, regrd | Move if equal | Z |
| MOVG | movg | %icc or %xcc, reg_or_imm11, regrd | Move if greater | not (Z or (N xor V)) |
| MOVLE | movle | %icc or %xcc, reg_or_imm11, regrd | Move if less or equal | Z or (N xor V) |
| MOVGE | movge | %icc or %xcc, reg_or_imm11, regrd | Move if greater or equal | not (N xor V) |
| MOVL | movl | %icc or %xcc, reg_or_imm11, regrd | Move if less | N xor V |
| MOVGU | movgu | %icc or %xcc, reg_or_imm11, regrd | Move if greater unsigned | not (C or Z) |
| MOVLEU | movleu | %icc or %xcc, reg_or_imm11, regrd | Move if less or equal unsigned | C or Z |
| MOVCC | movcc | %icc or %xcc, reg_or_imm11, regrd | Move if carry clear (greater or equal, unsigned) | not C |
| MOVCS | movcs | %icc or %xcc, reg_or_imm11, regrd | Move if carry set (less than, unsigned) | C |
| MOVPOS | movpos | %icc or %xcc, reg_or_imm11, regrd | Move if positive | not N |
| MOVNEG | movneg | %icc or %xcc, reg_or_imm11, regrd | Move if negative | N |
| MOVVC | movvc | %icc or %xcc, reg_or_imm11, regrd | Move if overflow clear | not V |
| MOVVS | movvs | %icc or %xcc, reg_or_imm11, regrd | Move if overflow set | V |

(Move on floating-point cc)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| MOVFA | mova | %fccn, reg_or_imm11, regrd | Move always | 1 |
| MOVFN | movn | %fccn, reg_or_imm11, regrd | Move never | 0 |
| MOVFU | movu | %fccn, reg_or_imm11, regrd | Move if unordered | U |
| MOVFG | movg | %fccn, reg_or_imm11, regrd | Move if greater | G |
| MOVFUG | movug | %fccn, reg_or_imm11, regrd | Move if unordered or greater | G or U |
| MOVFL | movl | %fccn, reg_or_imm11, regrd | Move if less | L |
| MOVFUL | movul | %fccn, reg_or_imm11, regrd | Move if unordered or less | L or U |
| MOVFLG | movlg | %fccn, reg_or_imm11, regrd | Move if less or greater | L or G |
| MOVFNE | movne | %fccn, reg_or_imm11, regrd | Move if not equal | L or G or U |
| MOVFE | move | %fccn, reg_or_imm11, regrd | Move if equal | E |
| MOVFUE | movue | %fccn, reg_or_imm11, regrd | Move if unordered or equal | E or U |
| MOVFGE | movge | %fccn, reg_or_imm11, regrd | Move if greater or equal | E or G |
| MOVFUGE | movuge | %fccn, reg_or_imm11, regrd | Move if unordered or greater or equal | E or G or U |
| MOVFLE | movle | %fccn, reg_or_imm11, regrd | Move if less or equal | E or L |
| MOVFULE | movule | %fccn, reg_or_imm11, regrd | Move if unordered or less or equal | E or L or U |
| MOVFO | movo | %fccn, reg_or_imm11, regrd | Move if ordered | E or L or G |

(Move register on register cc)

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| MOVRZ | movre | regrs1, reg_or_imm10, regrd | Move if register zero | Z |
| MOVRLEZ | movrlez | regrs1, reg_or_imm10, regrd | Move if register less than or equal to zero | N or Z |
| MOVRLZ | movrlz | regrs1, reg_or_imm10, regrd | Move if register less than zero | N |
| MOVRNZ | movrnz | regrs1, reg_or_imm10, regrd | Move if register not zero | not Z |
| MOVRGZ | movrgz | regrs1, reg_or_imm10, regrd | Move if register greater than zero | not (N or Z) |
| MOVRGEZ | movrgez | regrs1, reg_or_imm10, regrd | Move if register greater than or equal to zero | not N |

| Opcode | Mnemonic | Argument List | Operation | Comments |
|---|---|---|---|---|
| MULX | mulx | regrs1, reg_or_imm, regrd | (Generic 64-bit multiply) Multiply (signed or unsigned) | See SDIVX and UDIVX |
| POPC | popc | reg_or_imm, regrd | Population count | |
| PREFETCH | prefetch | [address], prefetch_fcn | Prefetch data | See The SPARC Architecture Manual, Version 9 |
| PREFETCHA | prefetcha | [regaddr] imm_asi, prefetch_fcn | Prefetch data from alternate space | |
| | prefetcha | [reg_plus_imm] %asi, prefetch_fcn | | |
| SDIVX | sdivx | regrs1, reg_or_imm, regrd | (64-bit signed divide) Signed divide | See MULX and UDIVX |
| STX | stx | regrd, [address] | Store extended word | |
| STXA | stxa | regrd, [address] imm_asi | Store extended word into alternate space | |
| | stxa | regrd, [reg_plus_imm] %asi | | |
| STXFSR | stx | %fsr, [address] | Store floating-point state register (all 64 bits) | |
| UDIVX | udivx | regrs1, reg_or_imm, regrd | (64-bit unsigned divide) Unsigned divide | See MULX and SDIVX |
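MULX is listed as a single multiply for both signed and unsigned operands. A small Python model suggests why one instruction is enough: the low 64 bits of a product are the same under either interpretation. The sketch also models POPC as a count of set bits. The function names mirror the mnemonics, and `to_signed64` is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret a 64-bit pattern as a two's-complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def mulx(a, b):
    """Low 64 bits of the product of two 64-bit register values."""
    return ((a & MASK64) * (b & MASK64)) & MASK64

def popc(x):
    """Number of one bits in a 64-bit register value."""
    return bin(x & MASK64).count("1")

minus_three = -3 & MASK64
unsigned_view = mulx(minus_three, 5)
signed_view = (to_signed64(minus_three) * 5) & MASK64
print(unsigned_view == signed_view)
print(popc(0xFF00FF00))
```

The divides do need separate signed and unsigned forms (SDIVX, UDIVX), because a quotient depends on how the operands are interpreted.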

E.4 SPARC-V9 Floating-Point Instruction Set Mapping

SPARC-V9 floating-point instructions are shown in the following table.

Table E–11

Types of operands are denoted by the following lowercase letters in the mnemonics: i = 32-bit integer, x = 64-bit integer, s = single, d = double, q = quad.

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| F[sdq]TOx | fstox | fregrs2, fregrd | Convert floating point to 64-bit integer |
| | fdtox | fregrs2, fregrd | |
| | fqtox | fregrs2, fregrd | |
| | fstoi | fregrs2, fregrd | Convert floating point to 32-bit integer |
| | fdtoi | fregrs2, fregrd | |
| | fqtoi | fregrs2, fregrd | |
| FxTO[sdq] | fxtos | fregrs2, fregrd | Convert 64-bit integer to floating point |
| | fxtod | fregrs2, fregrd | |
| | fxtoq | fregrs2, fregrd | |
| | fitos | fregrs2, fregrd | Convert 32-bit integer to floating point |
| | fitod | fregrs2, fregrd | |
| | fitoq | fregrs2, fregrd | |
| FMOV[dq] | fmovd | fregrs2, fregrd | Move double |
| | fmovq | fregrs2, fregrd | Move quad |
| FNEG[dq] | fnegd | fregrs2, fregrd | Negate double |
| | fnegq | fregrs2, fregrd | Negate quad |
| FABS[dq] | fabsd | fregrs2, fregrd | Absolute value double |
| | fabsq | fregrs2, fregrd | Absolute value quad |
| LDFA | lda | [regaddr] imm_asi, fregrd | Load floating-point register from alternate space |
| | lda | [reg_plus_imm] %asi, fregrd | |
| LDDFA | ldda | [regaddr] imm_asi, fregrd | Load double floating-point register from alternate space |
| | ldda | [reg_plus_imm] %asi, fregrd | |
| LDQFA | ldqa | [regaddr] imm_asi, fregrd | Load quad floating-point register from alternate space |
| | ldqa | [reg_plus_imm] %asi, fregrd | |
| STFA | sta | fregrd, [regaddr] imm_asi | Store floating-point register to alternate space |
| | sta | fregrd, [reg_plus_imm] %asi | |
| STDFA | stda | fregrd, [regaddr] imm_asi | Store double floating-point register to alternate space |
| | stda | fregrd, [reg_plus_imm] %asi | |
| STQFA | stqa | fregrd, [regaddr] imm_asi | Store quad floating-point register to alternate space |
| | stqa | fregrd, [reg_plus_imm] %asi | |

E.5 SPARC-V9 Synthetic Instruction-Set Mapping

Here is a mapping of synthetic instructions to hardware equivalent instructions.

Table E–12

| Synthetic Instruction | Hardware Equivalent(s) | Comment |
|---|---|---|
| cas [regrs1], regrs2, regrd | casa [regrs1]ASI_P, regrs2, regrd | Compare & swap (cas) |
| casl [regrs1], regrs2, regrd | casa [regrs1]ASI_P_L, regrs2, regrd | cas little-endian |
| casx [regrs1], regrs2, regrd | casxa [regrs1]ASI_P, regrs2, regrd | cas extended |
| casxl [regrs1], regrs2, regrd | casxa [regrs1]ASI_P_L, regrs2, regrd | cas little-endian, extended |
| clrx [address] | stx %g0, [address] | Clear extended word |
| clruw regrs1, regrd | srl regrs1, %g0, regrd | Copy and clear upper word |
| clruw regrd | srl regrd, %g0, regrd | Clear upper word |
| iprefetch label | bn,pt %xcc, label | Instruction prefetch |
| mov %y, regrd | rd %y, regrd | |
| mov %asrn, regrd | rd %asrn, regrd | |
| mov reg_or_imm, %asrn | wr %g0, reg_or_imm, %asrn | |
| ret | jmpl %i7+8, %g0 | Return from subroutine |
| retl | jmpl %o7+8, %g0 | Return from leaf subroutine |
| setn value, r1, r2 | | For -xarch=v9, same as setx value, r1, r2. For -xarch=v8, same as set value, r2 |
| setnhi value, r1, r2 | | For -xarch=v9, same as setxhi value, r1, r2. For -xarch=v8, same as sethi value, r2 |
| setuw value, regrd | sethi %hi(value), regrd | When (value & 0x3FF) == 0 |
| | or %g0, value, regrd | When 0 ≤ value ≤ 4095 |
| | sethi %hi(value), regrd; or regrd, %lo(value), regrd | (otherwise) Do not use setuw in a DCTI delay slot. |
| setsw value, regrd | sethi %hi(value), regrd | When value >= 0 and (value & 0x3FF) == 0 |
| | or %g0, value, regrd | When -4096 ≤ value ≤ 4095 |
| | sethi %hi(value), regrd; sra regrd, %g0, regrd | When value < 0 and (value & 0x3FF) == 0 |
| | sethi %hi(value), regrd; or regrd, %lo(value), regrd | (otherwise, if value >= 0) |
| | sethi %hi(value), regrd; or regrd, %lo(value), regrd; sra regrd, %g0, regrd | (otherwise, if value < 0) Do not use setsw in a CTI delay slot. |
| setx value, r1, r2 | sethi %hh(value), r1 | |
| | or r1, %hm(value), r1 | |
| | sethi %lm(value), r2 | |
| | or r2, %lo(value), r2 | |
| | sllx r1, 32, r1 | |
| | or r1, r2, r2 | |
| setxhi value, r1, r2 | sethi %hh(value), r1 | |
| | or r1, %hm(value), r1 | |
| | sethi %lm(value), r2 | |
| | sllx r1, 32, r1 | |
| | or r1, r2, r2 | |
| signx regrs1, regrd | sra regrs1, %g0, regrd | Sign-extend 32-bit value to 64 bits |
| signx regrd | sra regrd, %g0, regrd | |
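The six-instruction expansion of setx can be checked with a small model. The sketch below assumes the usual meaning of the assembler operators: %hh selects bits 63..42, %hm bits 41..32, %lm bits 31..10, and %lo bits 9..0, and sethi places its 22-bit immediate in bits 31..10 of the destination. The Python names are illustrative only.

```python
MASK64 = (1 << 64) - 1

def hh(v): return (v >> 42) & 0x3FFFFF   # bits 63..42
def hm(v): return (v >> 32) & 0x3FF      # bits 41..32
def lm(v): return (v >> 10) & 0x3FFFFF   # bits 31..10
def lo(v): return v & 0x3FF              # bits 9..0

def sethi(imm22):
    """Place a 22-bit immediate in bits 31..10; the low 10 bits are zero."""
    return (imm22 & 0x3FFFFF) << 10

def setx(value):
    """Follow the setx expansion step by step; return the final r2."""
    value &= MASK64
    r1 = sethi(hh(value))          # sethi %hh(value), r1
    r1 = r1 | hm(value)            # or    r1, %hm(value), r1
    r2 = sethi(lm(value))          # sethi %lm(value), r2
    r2 = r2 | lo(value)            # or    r2, %lo(value), r2
    r1 = (r1 << 32) & MASK64       # sllx  r1, 32, r1
    r2 = r1 | r2                   # or    r1, r2, r2
    return r2

print(hex(setx(0x123456789ABCDEF0)))
```

The expansion needs the scratch register r1 because the upper and lower halves are built independently before being combined.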

E.6 UltraSPARC and VIS Instruction Set Extensions

This section describes extensions that require SPARC-V9. The extensions support enhanced graphics functionality and improved memory access efficiency.


Note –

SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.


E.6.1 Graphics Data Formats

The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.

E.6.2 Eight-bit Format

A 32-bit word contains a pixel made up of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band-interleaved images (which store the color components of a point together) and band-sequential images (which store all values of one color component together).

E.6.3 Fixed Data Formats

A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.

A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.

Enough precision and dynamic range (for filtering and simple image computations on pixel values) can be provided by an intermediate format of fixed data values. Pixel multiplication is used to convert from pixel data to fixed data. Pack instructions are used to convert from fixed data to pixel data (clip and truncate to an 8-bit unsigned value). The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding bit position. You should use floating-point data to perform complex calculations needing more precision or dynamic range.
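To make the two fixed data formats concrete, the following Python sketch splits a 64-bit word into its signed partitions, listed from the most significant partition to the least significant. It shows only the partitioning; the position of the binary point within each value is not modeled, and the function names are hypothetical.

```python
import struct

def fixed16_lanes(word):
    """Split a 64-bit word into four signed 16-bit values."""
    return list(struct.unpack(">4h", word.to_bytes(8, "big")))

def fixed32_lanes(word):
    """Split a 64-bit word into two signed 32-bit values."""
    return list(struct.unpack(">2i", word.to_bytes(8, "big")))

print(fixed16_lanes(0xFFFF000180007FFF))
print(fixed32_lanes(0xFFFFFFFF00000002))
```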

E.6.4 SHUTDOWN Instruction

All outstanding transactions are completed before the SHUTDOWN instruction completes.

Table E–13

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| SHUTDOWN | shutdown | | shutdown to enter power down mode |

E.6.5 Graphics Status Register (GSR)

Use the RDASR and WRASR instructions on ASR 0x13 to access the Graphics Status Register (GSR).

Table E–14

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| RDASR | rdasr | %gsr, regrd | read GSR |
| WRASR | wrasr | regrs1, reg_or_imm, %gsr | write GSR |

E.6.6 Graphics Instructions

Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.

The graphics instruction set is mapped into the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.

Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values of the source operands.

Table E–15

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FPADD16 | fpadd16 | fregrs1, fregrs2, fregrd | four 16-bit add |
| FPADD16S | fpadd16s | fregrs1, fregrs2, fregrd | two 16-bit add |
| FPADD32 | fpadd32 | fregrs1, fregrs2, fregrd | two 32-bit add |
| FPADD32S | fpadd32s | fregrs1, fregrs2, fregrd | one 32-bit add |
| FPSUB16 | fpsub16 | fregrs1, fregrs2, fregrd | four 16-bit subtract |
| FPSUB16S | fpsub16s | fregrs1, fregrs2, fregrd | two 16-bit subtract |
| FPSUB32 | fpsub32 | fregrs1, fregrs2, fregrd | two 32-bit subtract |
| FPSUB32S | fpsub32s | fregrs1, fregrs2, fregrd | one 32-bit subtract |
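A partitioned add differs from an ordinary 64-bit add in that no carry crosses a partition boundary. The following Python model of the four 16-bit add is a sketch that assumes each partition wraps on overflow without saturation; the function name mirrors the mnemonic.

```python
def fpadd16(a, b):
    """Add four independent 16-bit partitions of two 64-bit values."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = ((a >> shift) + (b >> shift)) & 0xFFFF  # wrap within the lane
        result |= lane_sum << shift
    return result

a = 0x0001FFFF00007FFF
b = 0x0001000100000001
print(hex(fpadd16(a, b)))
print(hex((a + b) & 0xFFFFFFFFFFFFFFFF))
```

In the second printed value the carry out of the third partition leaks into the fourth, which is what the partitioned form avoids.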

Pack instructions convert to a lower-precision fixed format or to a pixel format.

Table E–16

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FPACK16 | fpack16 | fregrs2, fregrd | four 16-bit packs |
| FPACK32 | fpack32 | fregrs1, fregrs2, fregrd | two 32-bit packs |
| FPACKFIX | fpackfix | fregrs2, fregrd | four 16-bit packs |
| FEXPAND | fexpand | fregrs2, fregrd | four 16-bit expands |
| FPMERGE | fpmerge | fregrs1, fregrs2, fregrd | two 32-bit merges |

Partitioned multiply instructions have the following variations.

Table E–17

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FMUL8x16 | fmul8x16 | fregrs1, fregrs2, fregrd | 8x16-bit partition |
| FMUL8x16AU | fmul8x16au | fregrs1, fregrs2, fregrd | 8x16-bit upper α partition |
| FMUL8x16AL | fmul8x16al | fregrs1, fregrs2, fregrd | 8x16-bit lower α partition |
| FMUL8SUx16 | fmul8sux16 | fregrs1, fregrs2, fregrd | upper 8x16-bit partition |
| FMUL8ULx16 | fmul8ulx16 | fregrs1, fregrs2, fregrd | lower unsigned 8x16-bit partition |
| FMULD8SUx16 | fmuld8sux16 | fregrs1, fregrs2, fregrd | upper 8x16-bit partition |
| FMULD8ULx16 | fmuld8ulx16 | fregrs1, fregrs2, fregrd | lower unsigned 8x16-bit partition |

Alignment instructions have the following variations.

Table E–18

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| ALIGNADDRESS | alignaddr | regrs1, regrs2, regrd | find misaligned data access address |
| ALIGNADDRESS_LITTLE | alignaddrl | regrs1, regrs2, regrd | same as above, but little-endian |
| FALIGNDATA | faligndata | fregrs1, fregrs2, fregrd | do misaligned data, data alignment |

Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).

Table E–19

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FZERO | fzero | fregrd | zero fill |
| FZEROS | fzeros | fregrd | zero fill, single precision |
| FONE | fone | fregrd | one fill |
| FONES | fones | fregrd | one fill, single precision |
| FSRC1 | fsrc1 | fregrs1, fregrd | copy src1 |
| FSRC1S | fsrc1s | fregrs1, fregrd | copy src1, single precision |
| FSRC2 | fsrc2 | fregrs2, fregrd | copy src2 |
| FSRC2S | fsrc2s | fregrs2, fregrd | copy src2, single precision |
| FNOT1 | fnot1 | fregrs1, fregrd | negate src1, 1's complement |
| FNOT1S | fnot1s | fregrs1, fregrd | same as above, single precision |
| FNOT2 | fnot2 | fregrs2, fregrd | negate src2, 1's complement |
| FNOT2S | fnot2s | fregrs2, fregrd | same as above, single precision |
| FOR | for | fregrs1, fregrs2, fregrd | logical OR |
| FORS | fors | fregrs1, fregrs2, fregrd | logical OR, single precision |
| FNOR | fnor | fregrs1, fregrs2, fregrd | logical NOR |
| FNORS | fnors | fregrs1, fregrs2, fregrd | logical NOR, single precision |
| FAND | fand | fregrs1, fregrs2, fregrd | logical AND |
| FANDS | fands | fregrs1, fregrs2, fregrd | logical AND, single precision |
| FNAND | fnand | fregrs1, fregrs2, fregrd | logical NAND |
| FNANDS | fnands | fregrs1, fregrs2, fregrd | logical NAND, single precision |
| FXOR | fxor | fregrs1, fregrs2, fregrd | logical XOR |
| FXORS | fxors | fregrs1, fregrs2, fregrd | logical XOR, single precision |
| FXNOR | fxnor | fregrs1, fregrs2, fregrd | logical XNOR |
| FXNORS | fxnors | fregrs1, fregrs2, fregrd | logical XNOR, single precision |
| FORNOT1 | fornot1 | fregrs1, fregrs2, fregrd | negated src1 OR src2 |
| FORNOT1S | fornot1s | fregrs1, fregrs2, fregrd | same as above, single precision |
| FORNOT2 | fornot2 | fregrs1, fregrs2, fregrd | src1 OR negated src2 |
| FORNOT2S | fornot2s | fregrs1, fregrs2, fregrd | same as above, single precision |
| FANDNOT1 | fandnot1 | fregrs1, fregrs2, fregrd | negated src1 AND src2 |
| FANDNOT1S | fandnot1s | fregrs1, fregrs2, fregrd | same as above, single precision |
| FANDNOT2 | fandnot2 | fregrs1, fregrs2, fregrd | src1 AND negated src2 |
| FANDNOT2S | fandnot2s | fregrs1, fregrs2, fregrd | same as above, single precision |

Pixel compare instructions compare fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit values).

Table E–20

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FCMPGT16 | fcmpgt16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1>src2 |
| FCMPGT32 | fcmpgt32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1>src2 |
| FCMPLE16 | fcmple16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1≤src2 |
| FCMPLE32 | fcmple32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1≤src2 |
| FCMPNE16 | fcmpne16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1≠src2 |
| FCMPNE32 | fcmpne32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1≠src2 |
| FCMPEQ16 | fcmpeq16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1=src2 |
| FCMPEQ32 | fcmpeq32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1=src2 |

Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.

Table E–21

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| EDGE8 | edge8 | regrs1, regrs2, regrd | 8 8-bit edge boundary processing |
| EDGE8L | edge8l | regrs1, regrs2, regrd | same as above, little-endian |
| EDGE16 | edge16 | regrs1, regrs2, regrd | 4 16-bit edge boundary processing |
| EDGE16L | edge16l | regrs1, regrs2, regrd | same as above, little-endian |
| EDGE32 | edge32 | regrs1, regrs2, regrd | 2 32-bit edge boundary processing |
| EDGE32L | edge32l | regrs1, regrs2, regrd | same as above, little-endian |

Pixel component distance instructions are used for motion estimation in video compression algorithms.

Table E–22

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| PDIST | pdist | fregrs1, fregrs2, fregrd | distance between 8 8-bit components |
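As a rough model of the pixel distance operation, the sketch below sums the absolute differences between the eight unsigned 8-bit components of two 64-bit operands. It assumes, as the VIS documentation describes for PDIST, that the sum is accumulated into the previous contents of the destination register; the Python signature is illustrative only.

```python
MASK64 = (1 << 64) - 1

def pdist(rs1, rs2, rd=0):
    """Accumulate into rd the sum of absolute differences of eight 8-bit components."""
    total = rd
    for lane in range(8):
        shift = 8 * lane
        a = (rs1 >> shift) & 0xFF
        b = (rs2 >> shift) & 0xFF
        total += abs(a - b)
    return total & MASK64

print(pdist(0x0102030405060708, 0x0807060504030201))
```

A motion-estimation loop would call this once per 8-pixel group, passing the running total back in as rd.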

The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.

Table E–23

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| ARRAY8 | array8 | regrs1, regrs2, regrd | convert 8-bit 3-D address to blocked byte address |
| ARRAY16 | array16 | regrs1, regrs2, regrd | same as above, but 16-bit |
| ARRAY32 | array32 | regrs1, regrs2, regrd | same as above, but 32-bit |

E.6.7 Memory Access Instructions

These memory access instructions are part of the SPARC-V9 instruction set extensions.

Table E–24

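Argument list for all entries: stda fregrd, [fregrs1] regmask, imm_asi

| SPARC | imm_asi | Description |
|---|---|---|
| STDFA | ASI_PST8_P | eight 8-bit conditional stores to primary address space |
| STDFA | ASI_PST8_S | eight 8-bit conditional stores to secondary address space |
| STDFA | ASI_PST8_PL | eight 8-bit conditional stores to primary address space, little endian |
| STDFA | ASI_PST8_SL | eight 8-bit conditional stores to secondary address space, little endian |
| STDFA | ASI_PST16_P | four 16-bit conditional stores to primary address space |
| STDFA | ASI_PST16_S | four 16-bit conditional stores to secondary address space |
| STDFA | ASI_PST16_PL | four 16-bit conditional stores to primary address space, little endian |
| STDFA | ASI_PST16_SL | four 16-bit conditional stores to secondary address space, little endian |
| STDFA | ASI_PST32_P | two 32-bit conditional stores to primary address space |
| STDFA | ASI_PST32_S | two 32-bit conditional stores to secondary address space |
| STDFA | ASI_PST32_PL | two 32-bit conditional stores to primary address space, little endian |
| STDFA | ASI_PST32_SL | two 32-bit conditional stores to secondary address space, little endian |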


Note –

To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.


Table E–25

Argument list forms for all entries:

ldda [reg_addr] imm_asi, fregrd
ldda [reg_plus_imm] %asi, fregrd
stda fregrd, [reg_addr] imm_asi
stda fregrd, [reg_plus_imm] %asi

| SPARC | imm_asi | Description |
|---|---|---|
| LDDFA, STDFA | ASI_FL8_P | 8-bit load/store from/to primary address space |
| LDDFA, STDFA | ASI_FL8_S | 8-bit load/store from/to secondary address space |
| LDDFA, STDFA | ASI_FL8_PL | 8-bit load/store from/to primary address space, little endian |
| LDDFA, STDFA | ASI_FL8_SL | 8-bit load/store from/to secondary address space, little endian |
| LDDFA, STDFA | ASI_FL16_P | 16-bit load/store from/to primary address space |
| LDDFA, STDFA | ASI_FL16_S | 16-bit load/store from/to secondary address space |
| LDDFA, STDFA | ASI_FL16_PL | 16-bit load/store from/to primary address space, little endian |
| LDDFA, STDFA | ASI_FL16_SL | 16-bit load/store from/to secondary address space, little endian |


Note –

To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.


Table E–26

Argument list forms for the LDDA entries:

ldda [reg_addr] imm_asi, regrd
ldda [reg_plus_imm] %asi, regrd

Argument list forms for the LDDFA and STDFA entries:

ldda [reg_addr] imm_asi, fregrd
ldda [reg_plus_imm] %asi, fregrd
stda fregrd, [reg_addr] imm_asi
stda fregrd, [reg_plus_imm] %asi

| SPARC | imm_asi | Description |
|---|---|---|
| LDDA | ASI_NUCLEUS_QUAD_LDD | 128-bit atomic load |
| LDDA | ASI_NUCLEUS_QUAD_LDD_L | 128-bit atomic load, little endian |
| LDDFA, STDFA | ASI_BLK_AIUP | 64-byte block load/store from/to primary address space, user privilege |
| LDDFA, STDFA | ASI_BLK_AIUS | 64-byte block load/store from/to secondary address space, user privilege |
| LDDFA, STDFA | ASI_BLK_AIUPL | 64-byte block load/store from/to primary address space, user privilege, little endian |
| LDDFA, STDFA | ASI_BLK_AIUSL | 64-byte block load/store from/to secondary address space, user privilege, little endian |
| LDDFA, STDFA | ASI_BLK_P | 64-byte block load/store from/to primary address space |
| LDDFA, STDFA | ASI_BLK_S | 64-byte block load/store from/to secondary address space |
| LDDFA, STDFA | ASI_BLK_PL | 64-byte block load/store from/to primary address space, little endian |
| LDDFA, STDFA | ASI_BLK_SL | 64-byte block load/store from/to secondary address space, little endian |
| LDDFA, STDFA | ASI_BLK_COMMIT_P | 64-byte block commit store to primary address space |
| LDDFA, STDFA | ASI_BLK_COMMIT_S | 64-byte block commit store to secondary address space |


Note –

To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.