JTAG explorer toy

Revision as of 20:41, 6 January 2009

Fig.1: batman. "Holy mackerel Batman! JTAG really is THE SHIT!!1"

Fig.2: JTAG boundary-scan chain. A host talks to the TAP (Test Access Port) controller using 4 lines, shifting data in and out.

Fig.3: 2 JTAG boundary-scan cells. One cell sits between core logic and an input pin, and the other sits between core logic and an output pin.

Fig.4: TAP-controller states. The TMS- and TCK-lines are used to navigate through states where e.g. latching, writing, reading and updating of instruction and data occurs. The '0' and '1' values in the diagram correspond to TMS values at the rising clock-edge.

Fig.5: TAP-port timing diagram. TMS- and TDI-inputs are sampled on rising edge of TCK, and TDO (output) changes on falling edge of TCK.

Fig.6: toy-board schematics. The only slightly interesting bits are the 4 TAP-lines going between the MCU's, and the fact they share mutually-exclusive use of the serial Tx-line (the Rx-line is always shared).

Fig.7: 'before'. The big MCU there (ATmega32) is horribly oversized both in size and ability for this application, but ok, was all I had.

Fig.8: 'after'. Finished toy-board. The funny thing is that when it's finished and toying has been done, it'll be totally useless.

What is this, and, why..?

I had heard a lot about JTAG, but never got to do anything with it, mainly because I CBA, and I never actually had to do anything with it.

This is of course bad, because...

JTAG is THE SHIT -- like Batman!

...or well, it's nice anyway.

Disclaimer

I interpreted the ATmega32-datasheet and (quite) some on-line docs as best as I could; however, if there's a bug/flaw somewhere, let me know please.

Note that this thing is totally useless, so questions/comments regarding this issue will be ignored. Xmas-season, too much time, so there :-)

Overview/summary

The idea for me was to make a simple toy-board with 2 MCU's on it -- one acting as JTAG host/master, and the other being JTAG-victim.

The PC talks to the master and slave through a serial protocol, to read/set pins, and so initiate JTAG-actions; master talks to slave only through its JTAG-port.

The master itself is not JTAG-enabled, but drives/reads the slave's JTAG-port I/O-pins.

What I would like to see

As I understood, JTAG offers a nice 'backdoor' into a (slave-)chip's state, and so that's what I would like to see; I'd like to...

put it in, and take it out of reset (reset/suspend/resume)
decouple core logic from I/O pins, and read/set them from boundary cells instead
and some more, basically toy with it.

How JTAG works (as I understood it)

JTAG normally uses a single master/host to read/set states of one or more chips. There may be separate (JTAG) communication-channels between the host and each chip, or chips can be daisy-chained, so that the system's interface is kept simple and small.

Of course this is not complete; for more details, browse the lovely Internet.

Basic idea

Communication comes down to selecting which data to operate on, and then reading/writing that data.

Single-chip setup

In a 1-chip JTAG-setup, the chip contains an instruction-register and a number of task-specific data-registers. The host can shift bits into each register; when a bit is shifted in, a bit falls out at the other end. Shifting always occurs in the same direction.

The host selects which data-register is active by entering an instruction-code, and then operates on the corresponding data-register. Writing/reading of both instruction- and data-registers occurs in the same way, by shifting bits in/out.
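
A minimal Python sketch of the 'a bit goes in, a bit falls out' behaviour described above. The class name, the register length and the bit order inside the list are illustrative assumptions, not something taken from the ATmega32 datasheet.

```python
class ShiftRegister:
    """Toy model of one JTAG register: bits enter at TDI, leave at TDO."""

    def __init__(self, length, initial=0):
        # bits[0] is the cell next to TDI, bits[-1] the cell next to TDO
        self.bits = [initial] * length

    def shift(self, tdi_bit):
        """Shift one bit in; return the bit that falls out at the TDO end."""
        tdo_bit = self.bits[-1]
        self.bits = [tdi_bit] + self.bits[:-1]
        return tdo_bit


reg = ShiftRegister(4)
# Shifting 4 new bits into a 4-bit register pushes out its 4 old bits.
out = [reg.shift(b) for b in (1, 0, 1, 1)]
```

Writing and reading are the same operation here: the host always gets the old contents back while it pushes the new contents in.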

Multiple-chip setup

This works like above, except the complete system (more chips together) can be viewed as a number of big registers (where different parts/offsets in a register may live on different chips).

Bus-description

A chip offers access to its JTAG-subsystem through a port (more about this in another section), consisting of 4 I/O-lines (or 5; the optional 5th is a test-reset line, TRST, not covered here):

TCK (Test ClocK) will be used to clock bits in/out of the device, and initiate mode-change.
TMS (Test Mode Select) is used to traverse through the TAP-controller's state-machine. It is sampled on rising edge of TCK. Values in the state-machine diagram indicate values for TMS.
TDI (Test Data In) contains values to be shifted into the device. It is sampled on rising edge of TCK.
TDO (Test Data Out) will contain bits shifted out of the device. It changes on falling edge of TCK.

In short,...

TCK is used for every single action,
TMS is used to explicitly navigate, or stay put, in the TAP-controller's state-machine; the state machine always applies, so TMS can never be ignored,
TDI and TDO are only relevant when actually shifting bits in/out of the chip. Bits shifted in or out of the chip are always passed via these 2 lines.

See Fig.5 for an idea about the timing of these signals.
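
The timing rules above suggest an order of operations for a host that bit-bangs the port by hand. The following is only a sketch: `write_pin` and `read_pin` are hypothetical callbacks standing in for whatever actually drives the I/O-pins.

```python
def pulse(write_pin, read_pin, tms, tdi):
    """Emit one positive TCK pulse; return TDO as seen after the falling edge.

    write_pin(name, level) and read_pin(name) are hypothetical helpers.
    """
    write_pin("TMS", tms)    # set up the inputs while TCK is still low
    write_pin("TDI", tdi)
    write_pin("TCK", 1)      # rising edge: the device samples TMS and TDI
    write_pin("TCK", 0)      # falling edge: the device changes TDO
    return read_pin("TDO")   # TDO now holds its value until the next falling edge


# Dry run against a fake port that only records what was written.
log = []
tdo = pulse(lambda name, level: log.append((name, level)),
            lambda name: 0,
            tms=1, tdi=0)
```

The point of the ordering is that TMS and TDI are stable before the rising edge, and TDO is only read after the falling edge.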

Daisy-chaining chips

The TDI- and TDO-lines may be used to daisy-chain chips together -- chip #1's TDO is connected to chip #2's TDI. The host then shifts bits into chip #1's TDI, and catches bits falling out of the TDO of the last chip in the chain.

In such a daisy-chain, all chips share the TCK- and TMS-lines. Although sharing a clock may be normal, the common TMS-line was quite surprising to yours truly. But it still makes sense!
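
To see why a daisy-chain behaves like one long register, here is a small sketch in Python. Each chip's currently selected register is modelled as a plain list of bits; the function name and the example lengths are made up for illustration.

```python
def shift_chain(chain, tdi_bit):
    """Shift one bit through a daisy-chain of registers (lists of bits).

    chain[0] belongs to the chip wired to the host's TDI-line; the return
    value is the bit falling out of the last chip's TDO.
    """
    carry = tdi_bit
    for reg in chain:
        reg.insert(0, carry)   # bit enters at this chip's TDI side
        carry = reg.pop()      # old last bit leaves at this chip's TDO side
    return carry


# A 3-bit register on chip #1 followed by a 1-bit register on chip #2:
chain = [[0, 0, 0], [0]]
# The leading '1' needs 4 clock pulses to cross the chain, so it shows up
# at the host on the 5th pulse.
out = [shift_chain(chain, b) for b in (1, 0, 0, 0, 0)]
```

Since all chips see the same TCK and TMS, they all shift on the same clock pulse, which is what the loop models.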

TAP-controller

The TAP (Test Access Port) controller is part of every JTAG-enabled chip, and basically implements a state machine. The TMS-line navigates between states on each rising edge of TCK.

Instructions and data

As mentioned before, the TAP-controller has 1 (fixed-size) instruction-register and multiple, arbitrarily-sized data-registers. Although there are many data-registers, only one can be operated on at a time.

Poetic sidenote: one of the beautiful things, IMHO, about JTAG is that the chip 'connects' one of its data-registers between TDI and TDO on request. Subsequent operations then take place on that DR. This is no different than address-/data-selection (in that order) elsewhere, but this 'connecting' actually makes the bit-path through the chip (from TDI to TDO) shorter and longer on request!

State-machine

See Fig.4: each state leads to 2 states (one of which may be itself); the selection is done by the value of TMS at the next clock pulse.
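
The diagram can also be written down as a lookup table. The sketch below uses the state names and transitions of the standard IEEE 1149.1 TAP controller as I know them; check it against Fig.4 and the datasheet before relying on it. The names `TAP_NEXT` and `walk` are just illustrative.

```python
# state -> (next state if TMS=0, next state if TMS=1) at a rising TCK edge
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Follow a sequence of TMS values, one per TCK pulse; return the end state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state
```

A handy consequence of this table: wherever the controller is, five pulses with TMS high end up in Test-Logic-Reset, so the host can always resynchronise without knowing the current state.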

For example, to write a specific register, the following must be done (in 'pseudo-actions' -- see the notes about 'last bit', below):

navigate to state 'Shift-IR' to clock in instruction-bits, by clocking in the right sequence of values from TMS,
clock in all instruction-bits,
update the instruction-register by moving to state 'Update-IR',
navigate to state 'Shift-DR' to clock in data-bits,
clock in all data-bits.
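
The steps above can be spelled out as one list of (TMS, TDI) pairs, one pair per TCK pulse. This is a sketch under assumptions: the controller is taken to start in Run-Test/Idle, the TMS values follow the standard TAP diagram, the function name is made up, and bits are given in the order they are shifted in.

```python
def write_register_sequence(ir_bits, dr_bits):
    """Return (TMS, TDI) pairs, one per TCK pulse, starting in Run-Test/Idle."""

    def shift(bits):
        # TMS stays 0 for all but the last bit; the last bit is clocked in
        # while TMS=1 already leaves the Shift state (towards Exit1).
        return [(0, b) for b in bits[:-1]] + [(1, bits[-1])]

    seq = []
    seq += [(1, 0), (1, 0), (0, 0), (0, 0)]  # Run-Test/Idle -> Shift-IR
    seq += shift(ir_bits)                    # instruction bits, ends in Exit1-IR
    seq += [(1, 0)]                          # Exit1-IR -> Update-IR
    seq += [(1, 0), (0, 0), (0, 0)]          # Update-IR -> Shift-DR
    seq += shift(dr_bits)                    # data bits, ends in Exit1-DR
    seq += [(1, 0), (0, 0)]                  # -> Update-DR -> Run-Test/Idle
    return seq


seq = write_register_sequence([1, 0, 1, 0], [1, 1])
```

TDI is set to 0 on pulses where it does not matter; any value would do there.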

Some states 'do 1 thing, once', e.g. have the chip fill a parallel shift-register to be subsequently shifted out on TDO, but other states 'keep on doing something' during multiple TCK-cycles. To keep them in that state, TMS must be set accordingly.

A state-transition takes place after the data shifted in at that clock-pulse, if relevant, is processed. In other words, bits shifted in always apply to the current state, not the possibly new one indicated by TMS at that clock-pulse. So, if N bits must be shifted in at a state, this is how it's done:

enter the state by setting TMS and raising TCK,
clock the 1st (N-1) bits in, keeping TMS so that the state is not left,
set TMS to navigate to the next state, and clock the last bit in.
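
As a sketch of the 'last bit' rule, the function below builds command lines in the syntax of the master's 'j' command (described in the Software section) for shifting N bits in a state that has already been entered. The function name is made up, and the default TMS values (0 to stay, 1 to leave) assume a real Shift-DR/Shift-IR state in the standard TAP diagram.

```python
def j_shift_commands(bits, stay_tms=0, leave_tms=1):
    """Build master 'j' commands that shift `bits` in the current state."""
    if not bits:
        raise ValueError("need at least one bit to shift")
    lines = []
    # First N-1 bits: keep TMS so that the state is not left.
    body = "".join(f"i{b}^" for b in bits[:-1])
    if body:
        lines.append(f"jm{stay_tms}{body}")
    # Last bit: set TMS to leave, and clock the bit in on the same pulse.
    lines.append(f"jm{leave_tms}i{bits[-1]}^")
    return lines


commands = j_shift_commands([0, 1, 0])
```

For the three bits '010' this gives two lines: one that clocks in '0' and '1' with TMS low, and one that raises TMS and clocks in the final '0'.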

Analogously, for a positive clock pulse, the bit falling out at TDO applies to the new state, if there was a state-change at the rising edge of TCK.

Boundary-scan chain

JTAG can be used to read/stimulate I/O at a chip's pins. The chip's core logic can be decoupled from its I/O-pins as well; see Fig.3 for an idea: 'boundary-scan cells' sit in between the chip's core logic and the actual pin.

A chain of these cells ('boundary-scan chain') may be selected to sit between TDI and TDO by issuing the proper instruction, so that current/new values to/from the outside world or the chip's core logic can be shifted in or out.
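
A rough model of a single cell, as a sketch. It follows the generic textbook cell (a capture/shift flip-flop, an update latch and an output mux) and is not a transcription of the ATmega32's actual cells; all names are illustrative.

```python
class BoundaryCell:
    """Generic boundary-scan cell sitting on one signal path."""

    def __init__(self):
        self.shift_ff = 0      # this cell's link in the scan chain
        self.update_latch = 0  # value driven onto the path in test mode

    def capture(self, signal_in):
        """Sample the live signal into the scan chain."""
        self.shift_ff = signal_in

    def shift(self, scan_in):
        """Move one position along the chain; return the bit passed on."""
        scan_out = self.shift_ff
        self.shift_ff = scan_in
        return scan_out

    def update(self):
        """Latch the shifted-in bit so it can be driven out."""
        self.update_latch = self.shift_ff

    def output(self, signal_in, test_mode):
        """Normal mode passes the signal through; test mode overrides it."""
        return self.update_latch if test_mode else signal_in


cell = BoundaryCell()
cell.capture(1)           # pin (or core) currently drives a '1'
seen = cell.shift(0)      # host reads that '1' and shifts a '0' in
cell.update()
```

With `test_mode` set, the cell now drives the shifted-in '0' regardless of what arrives at its input, which is the 'decoupling' mentioned above.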

Together with daisy-chaining multiple chips, this can give the developer a system where chips ('inside the socket') and e.g. PCB-tracks (between sockets) can be tested, with only 4 pins dedicated for this purpose!

JTAG-toy hardware

So ok, enough about that -- time to solder! As mentioned before, the proto-board to play with all this technology will have 2 MCU's: a JTAG-slave and a JTAG-master. The PC initiates commands to either MCU through a serial link, at a whopping 1200 bps: plenty of time to handle commands, since I CBA to write proper interrupt-handlers and do proper buffering ;-)

Wiring and MCU-overkill

The slave-MCU is a big ATmega32; although we will only use about 10 I/O pins, it was the only JTAG-capable Atmel MCU I had. So there. For master I used the wonderful ATtiny2313.

As can be seen in the schematics in Fig.6, for performing actual tests using boundary-scan chain, there are dedicated fixed-direction I/O-lines...

'AB', running from master ('A') to slave ('B'),
'PQ', running from slave ('P') to master ('Q'),
'XY', running from slave ('X') to itself ('Y').

These lines have LEDs so I can see what's going on. Apart from the 4 other lines to the slave's JTAG-port, the master also controls the slave's reset-line.

Shared serial line to PC

From an idea by MHL, both master and slave share the Tx- and Rx-lines to a MAX232 receiver/driver. Both always receive, but only one can/should transmit at a time.

Since communication between PC and board is done in a query/response fashion, the MCU for which a query was intended enables its transmitter from software; its Tx-line then changes from hi-Z to active, and it sends its reply. The master and slave have totally disjoint command-sets for this to work.

Software

There are 3 'softwares' here: 2 run on master- and slave-MCU, and one runs on the PC. The PC software is really not necessary for anything, except that I got a bit mad trying to use minicom sensibly.

MCU-part

The software for both master- and slave-MCU's is quite similar; they both implement a simple command-interpreter. No interrupts are used at all.

Basically, command-bytes sent from the host are read from serial port and are processed immediately; on error, chars until next newline are eaten. Upon receiving a newline after a successful command, a reply is sent back to the host. This basically means that, for a long/compound command, stuff actually happens when you type it, not when you press Enter.
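
The behaviour described above (act immediately, eat the line on error, reply at newline) can be sketched as a small character-driven state machine. This is a Python illustration, not the MCU source: the class name is made up, only the 'r', 'w' and '#' commands are handled, every pin is treated as readable and writable, and staying silent after an error is my assumption.

```python
class CommandInterpreter:
    """Byte-at-a-time interpreter for 'r', 'w' and '#' commands."""

    def __init__(self, pins, prefix="m>"):
        self.pins = pins        # pin letter -> level, e.g. {"r": 1}
        self.prefix = prefix
        self._reset()

    def _reset(self):
        self.state = "command"  # waiting for the command letter
        self.pin = None
        self.reply = None

    def feed(self, ch):
        """Process one character; return a reply at newline, else None."""
        if ch == "\n":
            reply = self.reply if self.state == "done" else None
            self._reset()
            return reply
        if self.state in ("error", "comment"):
            return None         # eat everything up to the next newline
        if self.state == "command" and ch in "rw":
            self.state = "read_pin" if ch == "r" else "write_pin"
        elif self.state == "command" and ch == "#":
            self.state = "comment"
        elif self.state == "read_pin" and ch in self.pins:
            self.reply = f"{self.prefix} {ch}:{self.pins[ch]}"
            self.state = "done"
        elif self.state == "write_pin" and ch in self.pins:
            self.pin, self.state = ch, "write_state"
        elif self.state == "write_state" and ch in "01":
            old = self.pins[self.pin]
            self.pins[self.pin] = int(ch)   # the pin changes right now
            self.reply = f"{self.prefix} {self.pin}:{old}->{ch}"
            self.state = "done"
        else:
            self.state = "error"
        return None
```

Feeding it the characters of "wr0" changes pin 'r' as soon as the '0' arrives; the reply "m> r:1->0" only comes back once the newline is fed.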

Master

See the source if interested; basically it handles the following commands:

command:	reply:	description:
`r`Pin	`m>` Pin`:`State	read I/O-pin status, where Pin is one of '`a`' (outgoing end of test-line 'AB'), '`q`' (incoming end of test-line 'PQ'), '`r`' (slave-reset pin), or one of '`c`', '`m`', '`i`' or '`o`', for JTAG-pins `TCK`, `TMS`, `TDI` and `TDO`, respectively; State is one of '`0`' (low) or '`1`' (high).
`w`PinState	`m>` Pin`:`Oldstate`->`Newstate	write I/O pin status, where Pin can be one of '`m`', '`c`', '`i`', '`r`' or '`a`' as in the previous command. Note that you can read more pins than you can write. State, Oldstate and Newstate as State in the previous command.
`j((`[`mi`]State)\|`^`)*	`m>`( state)* `ok`	Meh... '(', ')', '[', ']' and the Kleene-star ('*') are meant as regexp symbols. Use letters for pins, as described above; use '`^`' to emit a positive pulse on `TCK`. The value at `TDI` is clocked in at the rising edge; values of `TDO` at each falling edge are collected and output as State*, in order of clock pulses.
#.*	(nothing)	Comment; everything up to and including the next newline char(s) will be eaten/ignored.
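
The 'j' command's regexp from the table translates almost literally to Python's `re` module. The sketch below only checks and tokenises a command line; the names `J_COMMAND`, `TOKEN` and `parse_j` are made up, and this is not how the MCU code does it.

```python
import re

# 'j' followed by any number of pin assignments ('m' or 'i' plus '0'/'1')
# and clock pulses ('^').
J_COMMAND = re.compile(r"j(?:[mi][01]|\^)*")
TOKEN = re.compile(r"[mi][01]|\^")


def parse_j(command):
    """Split a 'j' command into its actions, e.g. 'jm1^' -> ['m1', '^']."""
    if not J_COMMAND.fullmatch(command):
        raise ValueError(f"not a valid j-command: {command!r}")
    return TOKEN.findall(command[1:])


actions = parse_j("jm0i0^")
```

Each '^' in the result stands for one TCK pulse, so the number of '^' tokens equals the number of TDO values to expect in the reply.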

Note that the master controls the slave's TAP-lines and reset-line. It's so friggin' simple I'll stop before embarrassing myself even more.

Some examples (MCU-reply not shown):

# Negative pulse on slave's (low-active) reset-line
wr0
wr1

# Read incoming end of test-line 'PQ'
rq

# Enter some imaginary state (0->1 on TMS-line), 
# clock 3 bits ('010') over TDI, and leave state (1->0 on TMS-line).
# Note that we clock the last bit *while* leaving the state.
jm1^
ji0^i1^
jm0i0^

Slave

Equally boring; see C'n'P source for details. It's basically a stripped version of the master's code. The implemented command-set is also very similar:

command:	reply:	description:
`r`Pin	`s>` Pin`:`State	See description Master command-set; Pin can be one of '`b`' (incoming end of test-line 'AB'), '`p`' (outgoing end of test-line 'PQ'), and '`x`' and '`y`' for incoming, respectively outgoing ends of test-line 'XY'.
`w`PinState	`s>` Pin`:`Oldstate`->`Newstate	See description Master command-set; Pin can only be '`p`' or '`x`'.
#.*	(nothing)	Comment; everything up to and including the next newline char(s) will be eaten/ignored.

Note that the slave cannot read its own TAP-lines. I often used 'rb' to see if the slave is alive or not. Since the slave can read/write lines like the master, and nothing else, no examples here.

Serial terminal emulator

This runs on PC, and is a sorry excuse for a terminal, really. It does the following:

read line from stdin
echo line to stdout and send over serial port
sleep 500 ms
receive all pending incoming stuff from serial port and echo to stdout
goto 1 until EOF

Here is the source; it's not really open for discussion. In fact, it's not at all open for discussion. Couldn't even be arsed to not use a hardcoded /dev/ttyS1 and 1200bps :-)
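
For readers who only want the gist of that loop, here is a rough Python equivalent. It is a sketch, not the linked source: `port` is a hypothetical object with a `write(bytes)` method and a `read_pending()` method returning whatever bytes have arrived so far, and opening /dev/ttyS1 at 1200 bps is left out entirely.

```python
import time


def run_terminal(lines, port, out, delay=0.5):
    """Send each input line over `port`, wait, and echo whatever came back.

    `port` is a hypothetical serial-port wrapper (write / read_pending);
    `lines` is any iterable of text lines, `out` any writable text stream.
    """
    for line in lines:                      # the loop ends at EOF
        out.write(line)                     # echo the line
        port.write(line.encode("ascii"))    # ... and send it to the board
        time.sleep(delay)                   # give the MCU time to answer
        pending = port.read_pending()       # all pending incoming bytes
        out.write(pending.decode("ascii", errors="replace"))
```

Called as `run_terminal(sys.stdin, port, sys.stdout)` it would behave like the steps listed above, with the 500 ms sleep as the default delay.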

An example of use:

sh$ cat write_this.txt
# On master: read slave's reset-pin status
rr

# On slave: write '0' to pin 'X' (test-line 'XY')
wx0

# On master: clock 4 '1's in current TAP-state, and catch output
j^^^^

sh$ ./wr < write_this.txt
# On master: read slave's reset-pin status
rr
m> r:1

# On slave: write '0' to pin 'X' (test-line 'XY')
wx0
s> x:1->0

# On master: clock 4 '1's in current TAP-state, and catch output
j^^^^
m> 1010 ok

sh$

work in progress, tumdedum...