2016-04-20

Firmware Download and Access to IDE Board

Two topics in this post:

  • Firmware download
  • Access IDE board, IDE and CF devices

Firmware download

After my explorations into CRC generation, i worked on the firmware download code. This code has to run in RAM, because the eeprom can't be read while it is busy writing a block of data into it's cells.

I tried to keep things simple, and so the program flow looks like this:
write_eeprom.s

1. wait for SIO output to become empty
2. disable interrupts
3. receive magic header bytes
   if they are wrong: bail out
4. copy code from rom into ram
   jump to 6.

in ram:
5. Retry: receive bytes until magic header detected
6. in a loop:
7.    receive 64 bytes of data (last block may be shorter)
      update the crc after each byte
8.    write block of data into eeprom
9.    wait while eeprom busy
10. loop
11. receive and check crc:
    error: print a message, flush input, wait for a key and retry at 5
    ok:    print a message, flush input, wait for a key and reset

Already complicated enough.

ad 1: I wait for the SIO output to become empty, because there may be (and typically are) some bytes left in the output buffer, and as soon as i disable interrupts they will never be sent. This resulted in truncated "last messages".

ad 7: The program receives blocks of 64 bytes, which is the eeprom's block size, writes them into the eeprom and reuses the receive buffer for the next block. It does not keep the bytes around until all bytes are received, so the whole eeprom can be reprogrammed, though i have only 32k of RAM minus approx. 1k for code and buffers available.

The program actively polls the UART and retrieves the bytes as soon as they pop up in the UART queue. Then it immediately updates the CRC, so no extra loop is needed for CRC calculation.

The data is written into the eeprom despite the CRC is not yet checked, which can only be done at the end of the transmission.

ad 9: Waiting for the eeprom takes up to 10 ms (acc. to Atmel docs). During this time i do not poll the UART, for simplicity. And i do not use any flow control. Currently the UART is programmed to 9600 Baud, which means 960 characters per second or 9.6 characters per 10 ms. The UART has an input queue of 16 bytes: The UART is doing my job! :-)

ad 11: Up to now i never had a CRC error.

Overall workflow is now like this:

1. Work on the source, compile & assemble the rom.
2. Launch the new rom in the emulator
   The emulator silently creates a download file from the bare rom file.
3. Connect to the board (most times CoolTerm _is_ connected…)   
4. Select "download firmware" from the menu presented by the board
5. upload a "Textfile" in CoolTerm
6. type A letter to reboot

The upload including writing to the eeprom now takes exactly as long as uploading the rom itself: 1 second per 960 bytes. Currently roughly 20 seconds.

Kio: "See Stager, that's how it works!"

Next i made a modification to the rom, uploaded it and found, that it no longer booted.

Stager: "See Kio, you still need me…" :-(

Access the IDE board

Now that turn-around cycles are much faster and less painful, i started first tests to access the IDE board.

I talked to the i²c eeprom on the board... and it answered! :-)

Then i collected all my knowledge about IDE, which mostly centers around the DivIDE emulation in zxsp, and made up my mind on what to send to and read from the IDE device at first.

The test program looks roughly like this:

1. disable interrupts (in c: you remember the __critical bug?)
2. select the IDE board
3. read status and error register (from master, which is selected after reset)
4. print something
5. wait for ready and !busy in the status register
6. check that the device is not unexpectedly waiting for data
7. finally: issue command "IDENTIFY"
8. wait for data request in the status register, bail out on unexpected state
9. read 256 words into a sector buffer
10. read status and error register
11. print the sector data in hex
12. inspect and print various fields in the sector data

ad 1: Currently i'm back to the sdcc 3.6 nightly build, though it produced the so much slower code than version 3.4.

ad 2: As you may know, if you have followed this blog for the last years B-) boards attached to the K1 bus must be "selected" and from that on any i/o goes to that board. This really is a nice method and i'm really happy with it. I just must make sure that i'm not interrupted while i'm working with a device, as the interrupt (sic!) will probably leave some other board selected…

ad 9: I really read 256 words, not 512 bytes. (Though, in essence off course i do read 512 bytes.) The K1 bus is a 16 bit bus and the Z80 board contains two 8-bit buffers for sending and receiving the high word. Then reading 16 bit values works like this:

Implementation of a function for c:

;  uint16 in_w( uint8 addr ) __z88dk_fastcall;
;
_in_w::
    ld    a,l            ; a = register address
    or    a,k1_rd_data   ; add bits to access the bus
    ld    c,a            
    in    l,(c)       ; read the low byte
    ld    c,k1_rd_hi     ; access the high-byte register
    in    h,(c)          ; read the high byte from the high-byte register
    ret

The function is marked as __z88dk_fastcall, which is really funny. z88dk is the (only?) competitor of sdcc in the field of Z80. __z88dk_fastcall means, that the argument to the function, which must have exactly one argument, is passed in l, hl or hlde, depending on size, and not on the stack. In my opinion this should be the default.

PQI DOM, CF cards and Seagate ST1

For some reason it first didn't work, but then, out of a sudden, i got good looking data from a device. The only thing i did to make it happen was, to scrutinize the circuit diagram for errors. And as soon as i could prove there was no error, reality was modified to match my expectations. Check.

Off course the ascii texts of model name etc. were byte-swapped in the first version. Probably most people fall into this pitfall. The 4-byte values were calculated wrong in one place but correctly in another. After a few iterations the output from the "built-in" PQI DiskOnModule looked like this:

$00: 045A 02EE 0000 0008 0000 0210 0020 0002 EE00 0000 2020 2020 2020 2020 2020 2020
$10: 2020 2020 2020 2020 0002 0002 0004 6462 3031 2E32 3061 5051 4920 4944 4520 4469
$20: 736B 4F6E 4D6F 6475 6C65 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 0001
$30: 0000 0200 0000 0200 0000 0001 02EE 0008 0020 EE00 0002 0100 EE00 0002 0000 0000
$40: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
     ...
$F0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

model name = PQI IDE DiskOnModule                    
serial number =                     
firmware revision = db01.20a
fixed disk
ATA version = 0
LBA supported
Default capacity (sectors) = 192000
default C/H/S = 750/8/32
default capacity (sectors) = 192000
current C/H/S = 750/8/32
current capacity (sectors) = 192000
device supports PIO mode 3 or DMA mode 1 or above

Next reading from the slave, a Seagate ST1 hard disk in the CF card slot, did not work. The ST1 always reported !ready.

I thought there could be a severe problem with the pin assignment of the CF card slot, but after double checking, it was ok. Next i suspected a missing pull-up on the /CS line, which discriminates between master and slave, but these inputs have an internal pull-up.
Then i checked the ST1: with the help of an USB-CF-Card adapter i attached it to my Mac and it spined up. I watched some very cool short videos which i had saved on the drive a few years ago:

Do ya know any of them?

Then i rummaged around for some Compact "Flash" cards and came up with a 256 MB and a 16 MB one (the latter one i actually don't own. Hi Axel, do you miss your 16 MB card? I just found it… :-))

I tested them.

And they worked:

$00: 848A 02B7 0000 000F 0000 0200 0030 0007 A2B0 0000 5830 3130 3220 3230 3033 3130
$10: 3237 3032 3536 3137 0002 0002 0004 5265 7620 332E 3030 4869 7461 6368 6920 5858
$20: 4D32 2E33 2E30 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 0001
$30: 0000 0200 0000 0100 0000 0001 02B7 000F 0030 A2B0 0007 0100 A2B0 0007 0000 0000
     ...
$F0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

model name = Hitachi XXM2.3.0                        
serial number = X0102 20031027025617
firmware revision = Rev 3.00
removable medium
ATA version = 0
LBA supported
Default capacity (sectors) = 500400
default C/H/S = 695/15/48
default capacity (sectors) = 500400
current C/H/S = 695/15/48
current capacity (sectors) = 500400
device supports PIO mode 3 or DMA mode 1 or above

and

$00: 848A 00F4 0000 0004 4000 0200 0020 0000 7A00 0000 3932 3130 3336 3230 3130 3938
$10: 3939 3039 3132 3831 0002 0002 0004 5631 2E30 3220 2020 4C45 5841 5220 4154 4120
$20: 464C 4153 4820 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 0001
$30: 0000 0200 0000 0200 0000 0003 00F4 0004 0020 7A00 0000 0100 7A00 0000 0000 0000
     ...
$70: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
$80: 0000 75D0 0075 A800 75B8 0078 0174 55F6 08F4 B800 FA08 F4F5 F0E6 B5F0 10F4 F5F0
$90: 08B8 00F5 7801 B455 E674 0001 A700 7455 01A7 90C0 5974 01F0 E4F0 90C0 2D8D E6A6
$A0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
     ...
$F0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

model name = LEXAR ATA FLASH                         
serial number = 92103620109899091281
firmware revision = V1.02   
removable medium
ATA version = 0
LBA supported
Default capacity (sectors) = 31232
default C/H/S = 244/4/32
default capacity (sectors) = 31232
current C/H/S = 244/4/32
current capacity (sectors) = 31232
device supports PIO mode 3 or DMA mode 1 or above
device supports Ultra DMA

But the ST1 refused to become ready.

I played around with the master/slave setting and found, that the "jumper" on the PQI module probably enforced master for the device. This had to be taken into account when changing the master/slave jumper on the IDE board. (I first thought it had something to do with power supply, because this is an IDE module and the IDE bus normally provides no power, but i was wrong.)

Finally i jumpered the CF card adapter as master, pulled the master "master" jumper on the PQI module, and the CF card and the PQI module still answered, with roles swapped, and, unbelievable, the Seagate ST1 answered too! So, to make the ST1 work, i need to set the CF card adapter – the "removable" medium – to master and the "fixed" PQI module to slave? I tried with an empty CF card slot and the PQI module still answered - as slave. Is this IDE standard? (Actually i know very little about CF, IDE and so on, the official documents cost money for download or membership. So i get what can be found with the help of aunt Google.)

$00: 848A 12ED 0000 0010 7E00 0200 003F 004A 8530 0000 2020 2020 2020 2020 2020 344D
$10: 4430 3433 4B53 2020 0003 0100 0004 332E 3034 2020 2020 5354 3632 3532 3131 4346
$20: 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 8010
$30: 0000 0B00 0000 0200 0000 0007 12ED 0010 003F 8530 004A 0100 8530 004A 0000 0407
$40: 0003 0078 0078 0078 0078 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
$50: 0000 0000 7069 500C 4000 7069 100C 4000 0007 0000 0000 4040 0000 400D 8080 0000
$60: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
     ...
$90: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
$A0: 814A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
$B0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
     ...
$F0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

model name = ST625211CF                              
serial number =           4MD043KS  
firmware revision = 3.04    
removable medium
ATA version = 0
command sets supported = $7069 500C
command sets enabled   = $7069 100C
LBA supported
Default capacity (sectors) = 4883760
default C/H/S = 4845/16/63
default capacity (sectors) = 4883760
current C/H/S = 4845/16/63
current capacity (sectors) = 4883760
device supports PIO mode 3 or DMA mode 1 or above
device supports Ultra DMA
max sectors per R/W MULTIPLE = 16
CFA power mode = 33098

Data transfer to and from the IDE disks is not really fast with a Z80 processor, the best what i can get is:

ld      c,k1_wr_data
    ...
    ; send 1 word / 2 bytes in a loop 
    ; or unroll as long as you like    
    ld      e,(hl++)
    ld      a,(hl++)
    out     (k1_wr_hi),a    ; store byte in the high-byte register
    out     (c),e           ; send 1 word / 2 bytes to the device

This takes 25 cc per byte, or, if inc hl can be replaced with inc l only, 21 cc; plus loop and setup overhead. The system has a clock frequency of 6 MHz, which means i can transfer 32 kB (the whole RAM) in 32k * 25 cc = 819200 cc or, at 6 MHz, in 1/7 sec. Ok, the CPU is slow, but the RAM is small as well. :-)

A single sector can be transferred in 512 * 25 cc = 12800 cc or, at 6 MHz, in 1/450 sec. This is also very fine because this means, that i can disable interrupts for a whole sector i/o without losing timer and SIO interrupts, as the timer interrupt is at 100 Hz currently. I could also increase the timer frequency to 300 or 400 Hz to allow a SIO speed of up to 38.4 kBaud, if i desired, but CPU time consumption will go up as well and with 100 Hz it's already around 2% with an idle SIO. And with very intelligent use of INI or OUTI a few cycles for at least one IN or OUT could be saved.

It should be noted that the CF cards can be operated in a byte-wide mode. Then INIR and OTIR should be usable and one byte transferred in 16 cc.

Conclusion

All boards work and now everything left to do is software.

p.s.: Still not yet written to the IDE devices. Surprise ahead? (Shiver…)

No comments:

Post a Comment