Multiplier Example

This program multiplies two numbers by repeated addition:
<0>     in1                     // Input X
<1>     in2                     // Input Y
<2>     1                       // Def. const: 1
        {                       // store
<3>        (-3)                 // X:=in1
<4>        0                    // Z:=0
<5>        SUB [-2],(-1)        // X:=X-1
<6>        ADD [-2],(-2)        // Z:=Z+Y
           JNF [-2],-2          // Jump to <5> if <5> gave no carry
        }                       // restore
<7>     OUT [-3]                // output previous Z
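
The same computation can be sketched in Python. This is only an illustration under stated assumptions: the name `multiply` is hypothetical, X is taken to be a non-negative integer, and the carry of SUB is assumed to signal a borrow when X drops below zero. Under those assumptions the loop body runs once more than X times, which would explain why the program outputs the previous Z and not the last one.

```python
def multiply(x, y):
    """Multiply by repeated addition, mirroring the loop at <5> and <6>."""
    z = 0                 # <4>  Z := 0
    while True:
        previous_z = z    # the value that OUT reads after the loop
        x = x - 1         # <5>  X := X-1
        carry = x < 0     # assumption: carry means the subtraction borrowed
        z = z + y         # <6>  Z := Z+Y
        if carry:         # JNF falls through once the carry is set
            return previous_z


print(multiply(3, 4))  # 12
```

With X = 0 the first subtraction already borrows, so the sketch returns the initial Z, which is 0.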


Syntax



Pipelining

Pipelining is done by the compiler, based on information about the hardware. Example of a program with pipelining:
<0>     in1                     // Input X
<1>     in2                     // Input Y
<2>     1                       // Def. const: 1
        {                       // store
<3>        (-3)                 // X:=in1, no carry
<4>        0                    // Z:=0
<5>        SUB [-2],(-1)        // X:=X-1
<6>        ADD [-2],(-2)        // Z:=Z+Y
           JNF [-4],-2          // Jump to <5> if <5> gave no carry
        }                       // restore
<7>     OUT [-5]                // output previous Z


Multiprocessing

Multiple instruction issue is achieved by placing {, }, labels, jumps and calls at addresses divisible by the number of processors. Example:
<0>     in1                     // Input X
<1>     in2                     // Input Y
<2>     1                       // Def. const: 1
<3>     NOP
        {                       // store
<4>        (-4)                 // X:=in1, no carry
<5>        0                    // Z:=0
<6>        NOP
<7>        NOP

<8>        SUB [-4],(-2)        // X:=X-1
<9>        ADD [-4],(-3)        // Z:=Z+Y
<10>       NOP                  // No dependencies
<11>       NOP                  // No dependencies
           JNF [-4],-4          // Jump to <8> if <8> gave no carry
        }                       // restore
<12>    OUT [-7]                // output previous Z
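
The placement rule can be sketched in Python as a padding calculation. This is a minimal illustration, not the compiler itself: the function name `nops_before_aligned` is hypothetical, and the choice of four processors is an assumption suggested by the addresses 4 and 8 in the example above.

```python
def nops_before_aligned(address, processors):
    """Return how many NOPs are needed so that the next item lands on an
    address divisible by the number of processors."""
    return (-address) % processors


# Addresses taken from the example above, assuming four processors.
print(nops_before_aligned(3, 4))  # 1: the NOP at <3> before the { at <4>
print(nops_before_aligned(6, 4))  # 2: the NOPs at <6> and <7> before the label at <8>
print(nops_before_aligned(8, 4))  # 0: already aligned
```

The sketch only computes the padding. Which instructions may fill the padded slots in place of NOPs depends on their dependencies, which it does not model.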


Folding Loops out

Example:
<0>     in1                     // Input X
<1>     in2                     // Input Y
<2>     1                       // Def. const: 1
<3>     NOP
        {                       // store
<4>        NOP
<5>        NOP
<6>        (-4)                 // X:=in1, no carry
<7>        0                    // Z:=0

<8>        SUB [-2],(-2)        // X:=X-1
<9>        ADD [-2],(-3)        // Z:=Z+Y
<10>       SUB [-2],(-2)        // X:=X-1
<11>       ADD [-2],(-3)        // Z:=Z+Y
           JNF [-4]or[-2],-4    // Jump to <8> if neither <8>* nor <10> carried
        }                       // restore
<12>    [-5]                    // Result previous Z
<13>    NOP
<14>    NOP
<15>    NOP                     
        JNF [-8],+4             // JMP if not * condition in <11>
<16>    [-11]                   // Result previous Z
<17>    NOP
<18>    NOP
<19>    NOP                     

<20>    OUT [-4]                // output
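
The idea behind the folded loop can be sketched in Python. This shows only the intent, not the exact relative offsets of the listing above: the name `multiply_folded` is hypothetical, X is assumed to be a non-negative integer, and the carry of SUB is assumed to signal a borrow. Two SUB/ADD pairs run per pass, the loop ends when either subtraction borrows, and the result is then picked according to which one borrowed.

```python
def multiply_folded(x, y):
    """Repeated-addition multiply with the loop body folded out twice."""
    z = 0
    while True:
        z_before_first = z        # Z as it was before the first pair
        x = x - 1                 # first SUB
        first_carry = x < 0       # assumption: carry means borrow
        z = z + y                 # first ADD
        z_before_second = z       # Z as it was before the second pair
        x = x - 1                 # second SUB
        second_carry = x < 0
        z = z + y                 # second ADD
        if first_carry or second_carry:
            # Pick the Z that preceded the pair whose SUB borrowed first.
            return z_before_first if first_carry else z_before_second


print(multiply_folded(3, 4))  # 12
```

The sketch tests one combined exit condition per pass, so the jump is taken half as often as in the simple loop, at the cost of the extra result selection after the loop.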

This can be improved by not using the most recent flags. Example:
<0>     in1                     // Input X
<1>     in2                     // Input Y
<2>     1                       // Def. const: 1
<3>     NOP
        {                       // store
<4>        NOP                  // no carry
<5>        NOP
<6>        (-4)                 // X:=in1, no carry
<7>        0                    // Z:=0

<8>        SUB [-2],(-2)        // X:=X-1
<9>        ADD [-2],(-3)        // Z:=Z+Y
<10>       SUB [-2],(-2)        // X:=X-1
<11>       ADD [-2],(-3)        // Z:=Z+Y
           JNF [-8]or[-6],-4    // Jump to <8> if neither <8>* nor <10> carried
        }                       // restore
<12>    [-9]                    // Result previous Z
<13>    NOP
<14>    NOP
<15>    NOP
        JNF [-12],+4            // JMP if not * condition in <11>
<16>    [-15]                   // Result previous Z
<17>    NOP
<18>    NOP
<19>    NOP

<20>    OUT [-4]                // output

Issuing more instructions simultaneously is expected to give significantly higher performance, because most instructions can be issued without dependencies and because the typical delay is limited only by the forward delay, not by the cycle time.


© 1996-1997 and 1998, Jens Dyekjær Madsen.