Mike Schmit's Top Ten Rules for Pairing Pentium Instructions

1. Both instructions must be simple.

2. Shifts or rotates can only pair in the U pipe.
   (SHL, SHR, SAL, SAR, ROL, ROR, RCL or RCR)

3. ADC and SBB can only pair in the U pipe.

4. JMP, CALL and Jcc can only pair in the V pipe. (Jcc = jump on condition code).

5. Neither instruction can contain BOTH a displacement and an immediate operand. For example:

mov     [bx+2], 3  ; 2 is a displacement, 3 is immediate

mov     mem1, 4    ; mem1 is a displacement, 4 is immediate

6. Prefixed instructions can only pair in the U pipe. This includes extended instructions whose opcode starts with 0Fh, except for the special case of the near conditional jumps (the 16-bit Jcc forms, opcodes 0Fh 80h-8Fh) of the 386 and above. Examples of prefixed instructions:

mov     ES:[bx], 

mov     eax, [si]  ; 32-bit operand in 16-bit code segment

mov     ax, [esi]  ; 16-bit operand in 32-bit code segment

7. The U pipe instruction must be only 1 byte in length or it will not pair until the second time it executes from the cache.

8. There can be no read-after-write or write-after-write register dependencies between the instructions except for special cases for the flags register and the stack pointer (rules 9 and 10).

mov     ebx, 2   ; writes to EBX

add     ecx, ebx ; reads EBX and ECX, writes to ECX

                 ; EBX is read after being written, no pairing

mov     ebx, 1   ; writes to EBX

mov     ebx, 2   ; writes to EBX

                 ; write after write, no pairing

9. The flags register exception allows an ALU instruction to be paired with a Jcc even though the ALU instruction writes the flags and Jcc reads the flags. For example:

cmp     al, 0    ; CMP modifies the flags

je      addr     ; JE reads the flags, but pairs

dec     cx       ; DEC modifies the flags

jnz     loop1    ; JNZ reads the flags, but pairs

10. The stack pointer exception allows two PUSHes or two POPs to be paired even though they both read and write to the SP (or ESP) register.

push    eax      ; ESP is read and modified

push    ebx      ; ESP is read and modified, but still pairs 
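
As an illustration of how rule 8 affects instruction order, here is a minimal sketch in 32-bit code. It is not part of the original rules: the register choices are arbitrary, and ESI and EDI are assumed to point to valid dwords. It looks only at the ten pairing rules above and ignores cache misses and any other stalls. The two orders are alternatives that compute the same result, not one sequence to run back to back.

```asm
; Order A: each ADD directly follows the MOV that loads its destination
mov     eax, [esi]   ; writes EAX
add     eax, ebx     ; reads EAX right after it is written, no pairing with the MOV above
mov     ecx, [edi]   ; writes ECX, independent of the ADD above, so these two can pair
add     ecx, edx     ; reads ECX written by the MOV above, so it starts a new pair

; Order B: the two loads are moved next to each other
mov     eax, [esi]   ; writes EAX
mov     ecx, [edi]   ; writes ECX, no dependency on the MOV above, can pair
add     eax, ebx     ; reads and writes EAX
add     ecx, edx     ; reads and writes ECX, no dependency on the ADD above, can pair
```

Under these assumptions, order A issues as three groups (MOV alone, ADD with MOV, ADD alone), while order B issues as two pairs.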

Simple Instructions (for Pentium pairing)

The following is a list of simple instructions, as required by rule #1 above.

Instruction format 16-bit example     32-bit example

------------------------------------------------------------

MOV reg, reg       mov ax, bx         mov eax, edx

MOV reg, mem       mov ax, [bx]       mov eax, [edx]

MOV reg, imm       mov ax, 1          mov eax, 1

MOV mem, reg       mov [bx], ax       mov [edx], eax

MOV mem, imm       mov [bx], 1        mov [edx], 1

alu reg, reg       add ax, bx         cmp eax, edx

alu reg, mem       add ax, [bx]       cmp eax, [edx]

alu reg, imm       add ax, 1          cmp eax, 1

alu mem, reg       add [bx], ax       cmp [edx], eax

alu mem, imm       add [bx], 1        cmp [edx], 1



where alu = add, adc, and, or, xor, sub, sbb, cmp, test



INC  reg           inc  ax            inc  eax

INC  mem           inc  var1          inc  [eax]

DEC  reg           dec  bx            dec  ebx

DEC  mem           dec  [bx]          dec  var2

PUSH reg           push ax            push eax

POP  reg           pop  ax            pop  eax

LEA  reg, mem      lea  ax, [si+2]    lea  eax, [eax+4*esi+8]

JMP  near          jmp  label         jmp  label2

CALL near          call proc          call proc2

Jcc  near          jz   lbl           jnz  lbl2



where Jcc = ja, jae, jb, jbe, jg, jge, jl, jle, je, jne, jc, js,

            jnp, jo, jp, jnbe, jnb, jnae, jna, jnle, jnl, jnge,

            jng, jz, jnz, jnc, jns, jpo, jno, jpe



NOP                nop                nop

shift reg, 1       shl  ax, 1         rcl  eax, 1

shift mem, 1       shr  [bx], 1       rcr  [ebx], 1

shift reg, imm     sal  ax, 2         rol  esi, 2

shift mem, imm     sar  [bx], 15      ror  [esi], 31



where shift = shl, shr, sal, sar, rcl, rcr, rol, ror
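
Rule 1 also suggests a common rewrite: an instruction that is not in the list can sometimes be replaced by simple instructions that are. The sketch below is an illustration, not part of the original list. It assumes 32-bit code, ECX holding a nonzero count, and ESI pointing to an array of that many dwords; the labels sum1 and sum2 are made up for the example. LOOP does not appear in the list above, so by rule 1 it cannot pair.

```asm
; Version A: LOOP is not a simple instruction, so it does not pair
sum1:   add     eax, [esi]   ; add the next dword to EAX
        add     esi, 4       ; advance the pointer (ESI is written after being read, which rule 8 allows)
        loop    sum1         ; decrement ECX and branch if it is not zero

; Version B: DEC and Jcc are both simple instructions
sum2:   add     eax, [esi]   ; add the next dword to EAX
        add     esi, 4       ; advance the pointer, can pair with the ADD above
        dec     ecx          ; DEC modifies the flags
        jnz     sum2         ; JNZ reads the flags, but pairs (rule 9)
```

One difference to keep in mind: LOOP leaves the flags untouched, while DEC changes all the arithmetic flags except carry, so version B is a drop-in replacement only when the code after the loop does not depend on the flags from the last ADD.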

