How to make PICs with a high clock speed.


This page shows how to separate the even and odd instruction memory and thereby get higher throughput from the program memory of PICs. It also shows how to use several address decoders to make the decoding as fast as possible, and to choose between the possible addresses at the last moment.

It is possible to divide the code memory into an even and an odd part. If each part has its own address decoder, or several address decoders to give faster access to the next address, it is possible to obtain higher speed.

cycle   EVEN ADR        ODD ADR
1       00 norm.        01 norm.        ; Issue 00 and 01 simultaneously.
2       02 norm.        03 norm.        ; Issue 02 and 03
3       04 jpev 10      05 ign.         ; Issue 04 (ign. 05)
4       10 norm.        11 norm.        ; Issue 10 and 11
5       12 jpod 15      13 ign.         ; Issue 12 (ign. 13)
6       16 norm.        15 norm.        ; Issue 15 and 16
7       18 jpev 22      17 norm.        ; Issue 17 and 18
8       22 norm.        xx ign.         ; Issue 22 (ign. xx,odd)
9       24 norm.        23 norm.        ; Issue 23 and 24
10      26 jpod 29      25 norm.        ; Issue 25 and 26
11      xx ign.         29 norm.        ; Issue 29 (ign. xx,even)
12      30 norm.        31 norm.        ; Issue 30 and 31
13      32 norm.        33 norm.        ; Issue 32 and 33
14      34 norm.        35 jpod 41      ; Issue 34 and 35
15      xx ign.         41 norm.        ; Issue 41 (ign. xx,even)
16      42 norm.        43 norm.        ; Issue 42 and 43
17      44 norm.        45 jpev 50      ; Issue 44 and 45
18      50 norm.        xx ign.         ; Issue 50 (ign. xx,odd)
19      52 norm.        51 norm.        ; Issue 51 and 52
20      54 norm.        53 norm.        ; Issue 53 and 54
21      56 ign.         55 jpod 63      ; Issue 55 (ign. 56)
22      64 norm.        63 norm.        ; Issue 63 and 64
23      66 norm.        65 norm.        ; Issue 65 and 66
24      68 ign.         67 jpev 80      ; Issue 67 (ign. 68)
25      80 norm.        81 norm.        ; Issue 80 and 81
26      82 norm.        83 norm.        ; Issue 82 and 83

        norm: normal single cycle instruction
        ign:  ignored instruction after jmp
        jpev: GOTO, but to even address only
        jpod: GOTO, but to odd address only

The example above shows how easy it is to read two instructions at the same time, and how the instruction order (the even/odd swap) is updated depending on the jumps.

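In the table below, the first two columns are the instructions read from the even (A0=0) and the odd (A0=1) memory. With swap=0 the even instruction is issued first, with swap=1 the odd one.

even  odd    swap   operation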
norm. norm.   X     ev:=ev+2; od:=od+2;
jpev. ign.    0     ev:=adr; od:=adr+1; (swap:=0)
jpod. ign.    0     ev:=adr+1; od:=adr; swap:=1;
jpev. norm.   1     ev:=adr; od:=adr-1,ign; (swap:=1)
jpod. norm.   1     ev:=adr-1,ign; od:=adr; swap:=0;
norm. jpod.   0     ev:=adr-1,ign; od:=adr; (swap:=0)
norm. jpev.   0     ev:=adr; od:=adr-1,ign; swap:=1
ign.  jpod.   1     ev:=adr+1; od:=adr; (swap:=1)
ign.  jpev.   1     ev:=adr; od:=adr+1; swap:=0;
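
The rules in the table above can be tried out with a small model. What follows is only a sketch, not the real hardware: the names JUMPS, next_state and run are made up for illustration, and the jump targets are copied from the cycle example further up. It assumes that a jump found in the first issued slot restarts the pair at adr, while a jump found in the second issued slot restarts the pair at adr-1 with that first slot ignored, which is how the rows of the table read. The jpev/jpod distinction is represented only by the parity of the target.

```python
# Illustrative model only; names and structure are assumptions, not the PIC design.
# Jump table copied from the cycle example: address -> target address.
JUMPS = {4: 10, 12: 15, 18: 22, 26: 29, 35: 41, 45: 50, 55: 63, 67: 80}


def next_state(ev, od, swap, jump_slot, adr):
    """Return (ev, od, swap, ignore_first) for the next cycle.

    jump_slot is None (no jump issued), 0 (jump was the first issued
    instruction) or 1 (jump was the second issued instruction).
    """
    if jump_slot is None:
        return ev + 2, od + 2, swap, False
    # A jump in the second slot puts its target in the second slot of the
    # next cycle, so the pair starts one address earlier and that first
    # instruction is ignored. This keeps the jump at two cycles.
    start = adr if jump_slot == 0 else adr - 1
    ignore_first = jump_slot == 1
    if start % 2 == 0:
        return start, start + 1, 0, ignore_first   # even first
    return start + 1, start, 1, ignore_first       # odd first


def run(jumps, cycles):
    """Return the list of addresses issued in each cycle."""
    ev, od, swap, ignore_first = 0, 1, 0, False
    trace = []
    for _ in range(cycles):
        order = (ev, od) if swap == 0 else (od, ev)
        issued, jump_slot, adr = [], None, None
        for slot, pc in enumerate(order):
            if slot == 0 and ignore_first:
                continue                  # bubble left by a second-slot jump
            issued.append(pc)
            if pc in jumps:               # anything after a jump is ignored
                jump_slot, adr = slot, jumps[pc]
                break
        trace.append(issued)
        ev, od, swap, ignore_first = next_state(ev, od, swap, jump_slot, adr)
    return trace


if __name__ == "__main__":
    for cycle, issued in enumerate(run(JUMPS, 26), start=1):
        print(cycle, issued)
```

Run for 26 cycles, the model issues the same addresses per cycle as the example, including the single-issue cycles after each jump.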

If first jmp ignored:
jpev jpev     0     ev:=adr; od:=adr-1,ign; swap:=1
jpev jpod     0     ev:=adr-1,ign; od:=adr; (swap:=0)
jpod jpev     0     ev:=adr; od:=adr-1,ign; swap:=1
jpod jpod     0     ev:=adr-1,ign; od:=adr; (swap:=0)
jpev jpev     1     ev:=adr; od:=adr-1,ign; (swap:=1)
jpev jpod     1     ev:=adr; od:=adr-1,ign; (swap:=1)
jpod jpev     1     ev:=adr-1,ign; od:=adr; swap:=0;
jpod jpod     1     ev:=adr-1,ign; od:=adr; swap:=0;

If first jmp issued:
jpev jpev     0     ev:=adr; od:=adr+1; (swap:=0)
jpev jpod     0     ev:=adr; od:=adr+1; (swap:=0)
jpod jpev     0     ev:=adr+1; od:=adr; swap:=1;
jpod jpod     0     ev:=adr+1; od:=adr; swap:=1;
jpev jpev     1     ev:=adr; od:=adr+1; swap:=0;
jpev jpod     1     ev:=adr+1; od:=adr; (swap:=1)
jpod jpev     1     ev:=adr; od:=adr+1; swap:=0;
jpod jpod     1     ev:=adr+1; od:=adr; (swap:=1)

The PIC16C84 uses two cycles for a jump, as above. The preceding "test" could be a BTFSC or a BTFSS, and they are fast. Or it could be a DECFSZ or an INCFSZ that has been changed so that DECFSZ compares with 1 before the instruction is issued, instead of with zero after the instruction is issued. For INCFSZ the compare is with 255 before the instruction is issued, instead of with zero afterwards. This makes the operations faster, and it is then easy to implement the test as a part of the next address calculation without much reduction of speed. A small improvement (for PICs) can be made by using multiple address decoders and choosing one of the addresses, depending on the branch condition, at the last moment. This is a significant improvement for most chips that do not have the "simple" branch instructions. How much it helps depends on the memory speed.
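
That the early compare gives the same skip decision as the normal test can be checked for all 8-bit values. This is a minimal sketch with made-up function names; it only models the register value and the skip decision, nothing else of the instruction.

```python
# Illustrative only: 8-bit register value in, (new value, skip?) out.

def decfsz_late(f):
    """Normal DECFSZ: decrement, then test the result for zero."""
    result = (f - 1) & 0xFF
    return result, result == 0


def decfsz_early(f):
    """Changed DECFSZ: compare with 1 before the decrement is done."""
    skip = f == 1
    return (f - 1) & 0xFF, skip


def incfsz_late(f):
    """Normal INCFSZ: increment, then test the result for zero."""
    result = (f + 1) & 0xFF
    return result, result == 0


def incfsz_early(f):
    """Changed INCFSZ: compare with 255 before the increment is done."""
    skip = f == 255
    return (f + 1) & 0xFF, skip


# Both forms agree for every possible register value.
assert all(decfsz_late(f) == decfsz_early(f) for f in range(256))
assert all(incfsz_late(f) == incfsz_early(f) for f in range(256))
```

The early form does not have to wait for the ALU result, so the skip decision is available in time for the next address calculation.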

The addresses are looked up at the same time as they appear on the outputs of the instruction memory. The final decision is made later, after the address decoding of the possible next addresses.
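
The late choice between already decoded addresses could be pictured like this. It is a software sketch under simple assumptions: decode, read and fetch_late_select are hypothetical names, each decoder is modelled as a one-hot list of word lines, and the two decoders would of course work in parallel in hardware.

```python
# Illustrative only: two address decoders, one late select.

def decode(addr, size):
    """Address decoder: binary address in, one-hot word lines out."""
    return [1 if i == addr else 0 for i in range(size)]


def read(memory, word_lines):
    """Return the memory word whose word line is active."""
    return next(word for word, line in zip(memory, word_lines) if line)


def fetch_late_select(memory, fallthrough, target, condition):
    """Decode both possible next addresses, then pick one at the end."""
    lines_fall = decode(fallthrough, len(memory))    # decoder 1
    lines_taken = decode(target, len(memory))        # decoder 2
    # The branch condition is only needed here, after both decodes are done.
    chosen = lines_taken if condition else lines_fall
    return read(memory, chosen)


memory = ["i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7"]
taken = fetch_late_select(memory, fallthrough=3, target=6, condition=True)
not_taken = fetch_late_select(memory, fallthrough=3, target=6, condition=False)
```

The point of the arrangement is that the slow part (decoding) no longer waits for the branch condition; only the final select does.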

The interesting part for PICs is to divide the memory into an even and an odd part, and to make it possible to address both parts independently. This makes it possible to get double throughput from the memory while continuing to use the original PIC cycles.
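
A small sketch of the divided memory, assuming the simplest possible layout: the even part holds addresses 0, 2, 4, ... and the odd part holds 1, 3, 5, ..., so that address bit A0 selects the part and the remaining bits select the word. The names split_banks and fetch_pair are made up for illustration.

```python
# Illustrative only: program memory divided into an even and an odd part.

def split_banks(program):
    """Split a program into (even part, odd part) by address bit A0."""
    return program[0::2], program[1::2]


def fetch_pair(even_bank, odd_bank, ev, od):
    """Read one word from each part in the same cycle.

    ev must be an even address and od an odd address; each part gets
    its own address, so the two need not belong to the same pair.
    """
    assert ev % 2 == 0 and od % 2 == 1
    return even_bank[ev >> 1], odd_bank[od >> 1]


program = ["i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7"]
even_bank, odd_bank = split_banks(program)

aligned = fetch_pair(even_bank, odd_bank, ev=2, od=3)   # even issued first
swapped = fetch_pair(even_bank, odd_bank, ev=4, od=3)   # odd issued first
```

Because the two parts are addressed independently, a pair that starts on an odd address (3 and 4 here) can still be read in one access.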

The methods above are mostly for processors where the timing needs to be kept; otherwise much better methods exist. (Imagine just using 100 cycles for a jump in an unexpected direction, or 12 cycles for an add. In fact, it is possible to do instruction lookahead, and then anyone would be able to reach any speed, except for conditional jumps that go in a different direction than expected. Any other instruction works.) The design task for any compatible processor is to keep the timing and the cycles per instruction. Otherwise it is possible to reach any speed at all, for example with an asynchronous processor. Instruction lookahead works like carry lookahead; in fact it is the same. The only difference is that if you take "instruction lookahead" and then use multiple instructions to get bigger words, it actually gives the same structure as carry lookahead does (for add). However, it uses a lot of transistors, because it may require multiple ALUs.

The idea with the PIC above is to show that it is possible to keep the timing and also to read two instructions at the same time. That means full compatibility, even at any jump (or branch). Also, issuing two instructions at the same time is possible and fully compatible. The time to issue the two instructions is the same as one random access to the instruction memory, and the ALU has about the same time to do both instructions. Typically the low bits arrive before the high bits, and that makes two ALUs fast together. However, shifts and nibble swaps take up time here; swap is the worse of the two, but it can be done faster. The extra area of a PIC that issues two instructions at the same time is not much, because it concerns only a small part of the total PIC, which also has the instruction memory inside etc. And this memory does not take up much extra space, even though it is divided into two parts. (Only a small extra address decoder.)


There is a small error in the text above. This will be corrected in the future. (The error is in the jmp issued / jmp ignored tables.) Also, the tables will be updated to use a better method for the PIC (only the even address needs a +2 adder). A0 will be used for this, adding two to the even addresses.
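
One possible reading of the +2 adder note, as a sketch: for a pair of instructions starting at adr, the odd address is adr with A0 forced to 1, and the even address is adr with A0 cleared, plus 2 when A0 of adr is 1. The function name pair_addresses is made up, and this is an interpretation of the note, not the updated tables themselves.

```python
# Illustrative only: next pair of addresses from a start address,
# with an adder on the even side only.

def pair_addresses(adr):
    """Return (ev, od, swap) for the pair adr, adr+1."""
    a0 = adr & 1
    base = adr & ~1          # adr with A0 cleared
    ev = base + 2 * a0       # the only adder: +2 when adr is odd
    od = base | 1            # no adder: just force A0 to 1
    swap = a0                # odd start means the odd word is issued first
    return ev, od, swap


# For every start address the two results are exactly adr and adr+1.
for adr in range(1024):
    ev, od, swap = pair_addresses(adr)
    assert ev % 2 == 0 and od % 2 == 1
    assert sorted((ev, od)) == [adr, adr + 1]
```

For example, a start at 10 gives ev=10, od=11, swap=0, and a start at 15 gives ev=16, od=15, swap=1, the same values as in cycles 4 and 6 of the example.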