Reworking the assembler

I'm in rehabilitation after the operation and am using the spare time to rework the microcode assembler. It was a 2-pass assembler with a not-so-good approach to code placement (fitting). Now it is a single-pass assembler which is faster in most cases and much faster in all others, and which tiles the microcode 'threads' much better than before.


In Hospital

2010-06-22 – 2010-06-08

In hospital for an operation on my herniated disk. After the back pains suddenly disappeared one month ago i developed paralysis of some muscles in the left leg.


A good friend of mine died unexpectedly.
Rest in Peace, Gômmel.


New: a c-style compiler

The zoo of programs has gained an additional member: a c-style compiler. Internally it generates an 'intermediate' instruction format which is converted by an 'assembler' step into assembler opcodes which the microcode assembler understands. Later the assembler step will be replaced by an opcode assembler to create ram-loadable programs. The compiler will take a while to debug...


Short Survey on Programs

The microcode software is under development. I'm juggling with 5 or 6 types of software:
- microcode
- millicode (yes, that's new!)
- bytecode (new too)
- assembler opcodes in ram
- the microcode assembler
- the microcode emulator (new)
- vipsi script interpreter (found the second bug in it after 5 years) B-)
So, slowly, what are they all about?

Vipsi Script Interpreter

The piece of software which also delivers this web site. The microcode assembler and the microcode emulator are written in this script language too.


Microcode

That's obvious, hopefully. The code actually executed by the control unit. All life starts here. After reset it is also obliged to set up the machine and load drivers and program. There are no other roms in the CPU: the microcode also contains the BIOS.

Assembler Opcodes

That's what the microcode is going to implement: Opcodes for a program running in ram. All opcodes are actually addresses of tiny programs in the microcode rom which load the next opcode when they are done:
:ADD ( a b -- c )
    add (hp--)
    jp  (ip++)
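The idea that an opcode is simply the address of its own routine can be sketched in a few lines of Python. This is only an illustration, not K1 code: the addresses, the DUP and HALT entries and the names MICROCODE and run are invented for the example, and a dict stands in for the microcode rom.

```python
# Invented microcode addresses; on the real machine the assembler assigns them.
ADD, DUP, HALT = 0x0100, 0x0104, 0x0108

def op_add(stack):
    """( a b -- c ): replace the two top stack cells by their 16-bit sum."""
    b = stack.pop()
    a = stack.pop()
    stack.append((a + b) & 0xFFFF)

def op_dup(stack):
    """( a -- a a ): duplicate the top stack cell."""
    stack.append(stack[-1])

# Stand-in for the microcode rom: address -> routine.
MICROCODE = {ADD: op_add, DUP: op_dup}

def run(program, stack):
    """Execute a list of opcodes, each of which is a routine address."""
    ip = 0
    while True:
        opcode = program[ip]      # corresponds to 'jp (ip++)': fetch ...
        ip += 1
        if opcode == HALT:        # HALT exists only to end this sketch
            return stack
        MICROCODE[opcode](stack)  # ... and continue at the address fetched
```

With a stack holding 3 and 4, the program [DUP, ADD, HALT] leaves 3 and 8.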

Microcode Assembler

The piece of software which reads the microcode sources and creates the hex files for downloading into the microcode roms. It eases my life with flow control instructions, compound instructions, checking, postponing and fitting the code into the roms. That's all not trivial and all very error prone. See what it creates from the above code example:
:ADD ( a b -- c )
    tst_add (hp)            // 1
    add     (hp--)          // 2
    ld      cmd,(ip++)      // 3
    nop     :bit15          // 4
1: add is too slow, because the carry can ripple through 16 bits until the result has settled. Therefore 'add' and similar opcodes must add a wait cycle, where the inputs are applied but the result is not yet latched.
3: This is, basically, what 'jp' in microcode does: it loads a new address into the microcode address counter, instead of incrementing it.
4: But this is what is required too:
- The next opcode is already fetched by the control unit and will be executed before the jump takes effect
- The feedback line (aka 'flag') 'bit15' is selected to determine in which code plane the jump will actually jump to. In a simple case this would be flag '0' (unconditional plane 0) or flag '1' (unconditional plane 1). But here i use flag 'bit15', which forwards bit 15 from the data bus, so that the jump destination will be in plane 0 (1) if bit 15 of the jump address is cleared (set).
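Notes 3 and 4 can be modelled in Python as a rough sketch. It assumes exactly one delay slot after a load of the address counter and treats the flag as a plain string; the function names, the ROM layout and the address 0x40 are made up for the illustration and say nothing about the real cycle timing.

```python
def select_plane(flag, bus_value):
    """Return the code plane (0 or 1) a jump lands in for the given flag."""
    if flag == "0":
        return 0
    if flag == "1":
        return 1
    if flag == "bit15":
        return (bus_value >> 15) & 1   # bit 15 of the value on the data bus
    raise ValueError("unknown flag: " + flag)

def executed_order(rom, start, steps):
    """List mnemonics in execution order, honouring one delay slot per jump."""
    pc, pending, order = start, None, []
    for _ in range(steps):
        name, target = rom[pc]
        order.append(name)
        next_pc = pc + 1
        if pending is not None:        # this word sits in the delay slot
            next_pc, pending = pending, None
        if target is not None:         # a jump takes effect one word later
            pending = target
        pc = next_pc
    return order

# The ADD routine from above; 0x40 is an invented address of the next opcode.
ROM = {
    0: ("tst_add", None),
    1: ("add", None),
    2: ("ld cmd", 0x40),
    3: ("nop", None),                  # delay slot, still executed
    4: ("never reached", None),
    0x40: ("next opcode", None),
}
```

Here executed_order(ROM, 0, 5) runs tst_add, add, ld cmd, nop and only then the word at 0x40, and select_plane("bit15", address) picks plane 1 exactly when bit 15 of the address is set.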

Microcode Emulator

This is the newest piece of software. I'm currently no longer downloading the microcode to the real CPU but to an emulator written in vipsi. The emulator code has ~ 100 lines net. Debugging stuff and comments increase it to ~ 400 lines currently. The simulator mcsim.vs is located in Software/Microcode/ alongside the microcode source.


Millicode

One of the best things possible in the K1 CPU's microcode is jumping to microcode addresses via a table. Huh, great! Great?
At first it seems impossible to do this. You cannot read microcode cells at arbitrary locations. There is no other source for addressing than the microcode address counter. It is hard to put a string of text into microcode. But, pretty soon, i felt it would be simpler to write something like
    pushstr "abc"
instead of:
    push 'a'
    push 'b'
    push 'c'
This could be solved along these lines:
    jsr pushstr
    dw  'a','b','c',0
    jp  p3
    jsr allocstr
    pop a0              // get return address
p1: ld  cmd,a0++        // jump
    ld  d0,ival :bit15  // already loaded microcode
 // --- 
p3: tst d0
    jp  !z,p2
    jsr storechar
    jp  p1
p2: ...
I tried to keep the example as short as possible. Microcode execution starts with a subroutine call to pushstr. pushstr pops the return address from the return stack, which points to the 'a'. Now it loads the microcode address counter with this address (line p1:), but the next microcode is already loaded and is executed next: it loads register D0 with an immediate value which follows in the next microcode. This next microcode is the 'a' that A0 pointed to. Then the microcode runs through the other data bytes, which just consume time but do nothing. Finally it reaches the 'jp p3', which jumps back to the string reading loop. Next time the jump hits the next char, because a0 is incremented each time.
Though this example has lots of room for improvement it shows how 'the impossible' thing works. With a little tweaking microcode can read constant data from the microcode rom.
The next step was to realize, that this works even better if the constant data is not 'data' but code:
    ld  alu,jptab1
    add alu,d0
    add alu,d0
    jp  alu
    jp  proc1
    jp  proc2
    jp  ...
Step 3 was to realize, that not the whole 'jp' instruction must be in the table, but only the address:
    ld  alu,jptab1
    add alu,d0
    ld  cmd,alu
    ld  cmd,ival  :bit15
    dw  proc1     :bit15
    dw  proc2     :bit15
    dw  ...
The trick is the two 'ld cmd' instructions in succession. It's a combination of the ideas in 'step 1' and 'step 2'.
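The fetch order implied by step 3 can be traced with a small Python model. It is a simplification under stated assumptions, not a cycle-accurate emulation: one delay slot after each load of the address counter, 'ival' meaning the word fetched right after the instruction that names it, and the table word itself filling the delay slot of 'ld cmd,ival'. The addresses and the function name fetched_addresses are invented.

```python
def fetched_addresses(rom, d0, steps):
    """Return the microcode addresses in the order they are fetched."""
    pc = 0
    alu = 0
    pending = None        # target of a preceding 'ld cmd', applied after its delay slot
    next_is_ival = False  # the preceding word was 'ld cmd,ival'
    fetched = []
    for _ in range(steps):
        word = rom[pc]
        fetched.append(pc)
        next_pc = pc + 1
        if pending is not None:       # this word is a delay slot
            next_pc, pending = pending, None
        if next_is_ival:              # this word is the table entry:
            next_pc = word[1]         # its value becomes the next address
            next_is_ival = False
        elif word[0] == "ld alu":
            alu = word[1]
        elif word[0] == "add alu,d0":
            alu += d0
        elif word[0] == "ld cmd,alu":
            pending = alu
        elif word[0] == "ld cmd,ival":
            next_is_ival = True
        pc = next_pc
    return fetched

JPTAB1, PROC1, PROC2 = 4, 0x20, 0x30   # invented addresses
ROM = {
    0: ("ld alu", JPTAB1),
    1: ("add alu,d0",),
    2: ("ld cmd,alu",),
    3: ("ld cmd,ival",),
    JPTAB1: ("dw", PROC1),
    JPTAB1 + 1: ("dw", PROC2),
    PROC1: ("proc1",),
    PROC2: ("proc2",),
}
```

With d0 = 1 the model fetches addresses 0, 1, 2, 3, then table entry 5, then 0x30 (proc2); with d0 = 0 it fetches entry 4 and lands at 0x20 (proc1). Only one table word is touched per dispatch.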
Ok, and if, now, we use the address register 'ip' (instruction pointer, sic!) to point to the 'table' of addresses, and if the addresses only point to routines which, after having done their job, jump to the next address in this list, we have in microcode exactly what assembler opcodes in ram do, and i called this millicode. It is much easier to write programs in millicode than in microcode assembler. :-) It is a little bit slower, though, due to the linking jumps and because i use a Forth-style approach of passing arguments on a data stack, as i do for assembler opcodes in ram.
After adding some glue to the microcode assembler, millicode programs look like this:
:DictShrinkToFit ( dict -- )
    Millicoded          // switch from microcode to millicode
    mc  Dup,DictWordsGet
    mc  Addq dict_word0 
    mc  Resize,Drop
    mc  Next            // kind of return
Yes, not only do i have a Forth CPU, but its microcode is also programmed in Forth! :-)


Bytecode

The last thing to explain in this list is the 'bytecode'. This is a subset of, or at least very similar to, the assembler opcodes, only that instead of the opcodes it uses enumerated codes for instructions. Bytecode is the code loaded from driver eeproms on the K1-bus extension cards and is translated to assembler opcodes by the microcoded boot loader after reset.
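The translation step can be pictured as a plain table lookup. The Python sketch below is only that: the table contents and the names OPCODE_ADDRESS and translate are invented, and it ignores whatever operands or headers the real bytecode format may carry.

```python
# Invented table: index = enumerated bytecode, value = microcode address
# of the corresponding assembler opcode.
OPCODE_ADDRESS = [0x0100, 0x0104, 0x0108]

def translate(bytecode):
    """Map each enumerated code to the opcode (routine address) it stands for."""
    return [OPCODE_ADDRESS[code] for code in bytecode]
```

One reason such a scheme is attractive: the eeprom contents stay valid even when the opcode routines move to other microcode addresses, because only the table inside the microcode changes.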


First Time Power On

2010-04-01: The frontpanel software is making progress: Meanwhile i can upload microcode to the i2c eeproms or directly to the microcode upload header of the CPU! This also required a small modification to the microcode assembler, to generate a combined file for the three eproms. I am using CoolTerm on my Mac with a usbserial adapter for uploading the hex files. Also, implementation of the XON/XOFF protocol proved to be useful. :-)

2010-04-05: During the Easter holidays i made a great leap and put it all together!
Of course, there were problems. E.g. when my highly valued power supply unit delivered only 4.4 instead of 5.1 Volts. And became quite warm. At first i had problems measuring the current drawn by the CPU, because the PSU always switched off when i cramped the ammeter into the fuse socket. Test leads with too high an impedance... But i was lucky to revive an old pair of leads and see: 3.2 ampere. Over the limit for the PSU. Why? Hu hu hu. Some explorations later: i had connected the ALU data registers in reverse direction in the circuit drawing. Impressive what they can survive! I made a dead-cockroach-type operation to both of them. (100 points for those who know what i'm citing! :-))
The ALU Registers [on the left] after Operation

But still the power drawn is a little bit suspicious. The CPU now draws 350mA when stopped and up to 1.3A when running at full speed. But at some points the power consumption rises from 0.5A to 1.0A and above while slowly idling through some random microcode found in the eproms. No explanation yet, but i decided to finish the frontpanel's debugger and start from there. If there is some cause to draw more than 0.5A unexpectedly there will be some error. Hopefully.
So, as said, i worked on the debugger. I implemented routines to read in the sensor ring, which provides information on all three busses (data, address and control). And a preliminary single stepper. Occasionally i could already verify that the address incrementer seems to work. Yeah. Tomorrow more...

2010-04-08: It seems that i have more problems with the assembler than with the CPU now. B-) Slowly but surely i am writing test routines for registers and functions and so far they all work. Except if the assembler fails to generate correct code. Data registers, ALU functions, shift registers, address registers and address increment/decrement up and running. Ram test ok! And on we go...

2010-04-10: I am extensively testing the CPU. At one point i probably had a non-connect in a socket (the CY input to the SR register) which now seems to be fixed. And i was hunting an error in the z-flag result:
    tst $0008 : z
    jp  1,error
jumped to error sometimes, but when i uploaded hundreds of tests into the microcode rams it always failed at the same microcode address, if it failed, which was different for every special code mix. What was going wrong here?
Nothing. It was ok. I will have to put a wait cycle here (or perhaps it will work when i replace the condition selector, a 74HC151, by its AC counterpart). The 'error' was the impression that the immediate value came from the 'ival' registers on the control unit. But these are not registers, but drivers. And they put the data from the microcode rams on the bus, thus i had to add the latency of the microcode rams to the total round-trip time and – alas! – that was tight. VERY tight...

For the record: Power drawn by the CPU is ~350mA@5V when halted and ~800mA@5V when running at 16MHz.
2010-04-11 to 2010-04-15: Website down, because iMac down. PSU failure.
2010-04-16: Resuming testing.


The New Front Panel

I finished the new frontpanel board: Exposed, etched, drilled, soldered and connected. The software for the ATmega32 is under development.
Front panel showing the K1-16/16 Logo


I/O Board and Microcode Upload Header


I have ordered the I/O board from LeitOn.
The new front panel is still under development.
The microcode upload header is designed and routed.


The I/O board is finished:
K1-16/16 Expansion Bus Board
The microcode upload header is finished:
The Microcode Upload Header on top of the Control Unit