This is an overview of the internals of the program tiny-morse-decoder.c. The sections in this document mirror the sections in the source file. It is suggested to read this document and the source code side by side. The comments in the source provide lower-level, more fine-grained documentation, although there is significant overlap between those comments and this document.
The microcontroller is clocked at F_CPU
= 9.6 MHz by
its internal RC oscillator.
The data path is: input → edge detector → tokenizer → decoder → UART → output.
The ATtiny13A has a single timer, which is used both for driving the
software UART and for timekeeping. It is driven by the system clock
prescaled by a factor 8. It is configured to count from 0 to
TIMER_TOP
= 124, then reset to zero and start again. The reset
frequency is then
9.6 MHz ÷ 8 ÷ 125 = 9.6 kHz,
which is also the output baud rate. The timer can generate up to two interrupts per period:

- TIM0_COMPA, which is always active, is used for counting “system tics”
- TIM0_COMPB, which is active only when transmitting, drives the software UART.
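For illustration, the timer setup could look like the minimal sketch below. The timer_init() helper is hypothetical and not necessarily how the actual source is organized:

```c
#include <avr/io.h>
#include <avr/interrupt.h>

#define TIMER_TOP 124           /* count 0..124, i.e. divide the prescaled clock by 125 */

static void timer_init(void)    /* hypothetical helper, for illustration only */
{
    TCCR0A = _BV(WGM01);        /* CTC mode: count from 0 to OCR0A, then restart */
    TCCR0B = _BV(CS01);         /* clock the timer at F_CPU / 8 */
    OCR0A  = TIMER_TOP;         /* 9.6 MHz / 8 / 125 = 9.6 kHz */
    TIMSK0 = _BV(OCIE0A);       /* COMPA interrupt always enabled: system tics */
    sei();
}

ISR(TIM0_COMPA_vect)            /* fires at 9.6 kHz */
{
    /* advance the system tic counter here */
}
```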
The available timekeeping functions are:

- tics(): returns the current time in system tics. This rolls over to zero every 6.83 seconds but, owing to the rules of modular arithmetic, the rollover is not an issue.
- expired(): returns true if a timeout has expired, false otherwise. This function performs an unsigned to signed conversion which can potentially overflow, invoking implementation-defined behavior. However, gcc does the only sensible thing, which is to not change the internal representation of the number, effectively reducing it modulo 2^(bit width).
- delay(): does what it promises. Since this is a blocking function, it should only be used during program startup.
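One plausible way to implement these functions on top of a 16-bit tic counter is sketched below; the tic_count variable and the exact signatures are assumptions, not the actual source:

```c
#include <stdbool.h>
#include <stdint.h>
#include <util/atomic.h>

static volatile uint16_t tic_count;      /* incremented by TIM0_COMPA at 9.6 kHz */

static uint16_t tics(void)
{
    uint16_t t;
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE) { t = tic_count; }  /* 16-bit read must be atomic */
    return t;                            /* rolls over every 65536 / 9600 Hz ≈ 6.83 s */
}

static bool expired(uint16_t timeout)
{
    /* unsigned difference reinterpreted as signed: valid across the rollover */
    return (int16_t)(tics() - timeout) >= 0;
}

static void delay(uint16_t duration)     /* blocking: for program startup only */
{
    uint16_t timeout = tics() + duration;
    while (!expired(timeout)) ;
}
```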
The input pins PB0
and PB1
are read as a binary number between 0 and
3, which is then used to index an array of “dot times” corresponding to
the available speeds. The “dot time”, or expected length of a dot, is
the base unit of Morse keying timing. Then, the global variables
delay_1u
, delay_2u
and delay_3u
are initialized with multiples of this base unit. These variables are expressed in system tics.
Since the default is to leave the pins floating, and a floating pin reads as binary 1, the default (slowest) speed is at index 3 (binary: 11).
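A sketch of this initialization is shown below. The speed_init() and dot_times names, the dot time values, and the 1×/2×/3× multiples (suggested by the variable names) are assumptions:

```c
#include <avr/io.h>
#include <stdint.h>

static uint16_t delay_1u, delay_2u, delay_3u;   /* timing thresholds, in system tics */

/* Dot times for the four speed settings, in system tics (values illustrative only). */
static const uint16_t dot_times[4] = { 480, 720, 960, 1200 };

static void speed_init(void)
{
    PORTB |= _BV(PB0) | _BV(PB1);       /* enable pull-ups so floating pins read as 1 */
    uint8_t speed = PINB & 0x03;        /* PB1:PB0 read as a number between 0 and 3 */
    uint16_t unit = dot_times[speed];
    delay_1u = unit;                    /* presumably 1, 2 and 3 dot times */
    delay_2u = 2 * unit;
    delay_3u = 3 * unit;
}
```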
The edge detector monitors the pin connected to the telegraph key and
delivers RISE
and FALL
events which are fed to the tokenizer. It
implements an asymmetric debouncer based on the notion that the key can
only bounce when it is in the down position:
- when the key is up, a FALL event is generated as soon as the input pin reads LOW; the key is then assumed to be in the down position
- when the key is down, a HIGH input reading is assumed to be a mechanical bounce
- a RISE event is generated only after the input has been consistently HIGH for 10 ms, at which point the key is assumed to be up.
The edge detector is a finite state machine implementing the following state diagram:
In the above diagram, states with a gray background are those where the
key is assumed to be down. Transitions are labeled with the event that
triggers them. Transitions that generate an output event are labeled
trigger / output, where trigger is the input event that
triggers the transition, and output is the generated output event. For
example, the transition from the “up” state to the “down” state is
triggered by the input pin reading LOW
and generates a FALL
event.
The source code of get_edge()
is essentially a literal translation of
the above state diagram.
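The sketch below re-creates such a state machine from the description above. It is not the actual get_edge(); the state names and the key input pin (assumed here to be PB4) are guesses, and it reuses the tics()/expired() helpers sketched earlier:

```c
#include <avr/io.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { NO_EDGE, RISE, FALL } edge_t;

#define DEBOUNCE_TIME 96                    /* 10 ms at 9.6 kHz */

static edge_t get_edge(void)
{
    static enum { KEY_UP, KEY_DOWN, KEY_SETTLING } state = KEY_UP;
    static uint16_t timeout;
    bool high = PINB & _BV(PB4);            /* telegraph key input, pin assumed */

    switch (state) {
    case KEY_UP:                            /* any LOW reading is a genuine key press */
        if (!high) { state = KEY_DOWN; return FALL; }
        break;
    case KEY_DOWN:                          /* a HIGH reading may be a mechanical bounce */
        if (high) { state = KEY_SETTLING; timeout = tics() + DEBOUNCE_TIME; }
        break;
    case KEY_SETTLING:
        if (!high) state = KEY_DOWN;        /* it was only a bounce */
        else if (expired(timeout)) { state = KEY_UP; return RISE; }
        break;
    }
    return NO_EDGE;
}
```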
The tokenizer translates the RISE
and FALL
events generated by the
edge detector into a stream of symbols drawn from DOT
, DASH
,
END_OF_CHAR
and END_OF_WORD
. Note that END_OF_WORD
is always
preceded by END_OF_CHAR
. The tokenizer works by timing the delays
between the RISE
and FALL
events, and comparing these delays to
specific thresholds.
The standard Morse timing is:
| element | duration (dot units) |
|---|---|
| dot | 1 |
| dash | 3 |
| interelement gap | 1 |
| intercharacter gap | 3 |
| interword gap | 7 |
where the duration of a dot is the base time unit. Then, the thresholds for discriminating a dot from a dash, and for classifying gaps, are set to the averages of the expected durations:
| threshold | duration (dot units) |
|---|---|
| dot / dash | 2 |
| interelement / intercharacter gap | 2 |
| intercharacter / interword gap | 5 |
The tokenizer is also a state machine:
Note that the DOT
and DASH
symbols are generated on RISE
events
(when the key goes up), whereas the END_OF_CHAR
and END_OF_WORD
symbols are generated on timeouts (while the key is up). Note also that
a long gap generates an END_OF_CHAR
symbol two time units after the
RISE
event, followed, three time units later, by an END_OF_WORD
symbol.
The function tokenize()
is essentially a literal translation of the
above state diagram.
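For concreteness, a minimal re-sketch of such a state machine is given below. It is not the actual tokenize(): the state names and structure are assumptions, and it reuses the edge_t type, the delay_*u thresholds and the tics()/expired() helpers sketched earlier:

```c
typedef enum { NO_SYMBOL, DOT, DASH, END_OF_CHAR, END_OF_WORD } symbol_t;

static symbol_t tokenize(edge_t edge)
{
    static enum { IDLE, MARK, GAP, WORD_GAP } state = IDLE;
    static uint16_t last_edge;

    switch (state) {
    case IDLE:                              /* key up, previous word already closed */
        if (edge == FALL) { last_edge = tics(); state = MARK; }
        break;
    case MARK:                              /* key down */
        if (edge == RISE) {
            uint16_t now = tics();
            uint16_t len = now - last_edge; /* length of the mark */
            last_edge = now;
            state = GAP;
            return (len < delay_2u) ? DOT : DASH;   /* dot/dash threshold: 2 units */
        }
        break;
    case GAP:                               /* key up, character may still continue */
        if (edge == FALL) { last_edge = tics(); state = MARK; }
        else if (expired(last_edge + delay_2u)) {
            state = WORD_GAP;               /* gap longer than 2 units: character is over */
            return END_OF_CHAR;
        }
        break;
    case WORD_GAP:                          /* 3 more units decide character gap vs word gap */
        if (edge == FALL) { last_edge = tics(); state = MARK; }
        else if (expired(last_edge + delay_2u + delay_3u)) {
            state = IDLE;                   /* 5 units after the RISE: it is a word gap */
            return END_OF_WORD;
        }
        break;
    }
    return NO_SYMBOL;
}
```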
The decoder translates the above-mentioned symbols into a string of ASCII characters. Each character is translated by following these steps:

1. the symbols are converted to a bit stream per the rules:
   - DOT → 1
   - DASH → 0, 1
2. the bit stream is read backwards (least significant bit first) as a binary number
3. this “code number” is looked up in the array morse_code
4. if found, adding 32 (ASCII space) to the array index gives the character's ASCII code, with the exception that index 0 encodes '_' rather than space.
Example: The sequence (DASH
, DOT
, DASH
) gives the bit stream
(0, 1, 1, 0, 1) which, read backwards, is the binary number 10110
= decimal 22. This number can be found in the morse_code
array at index 43. Then, 43 + 32 is 75, which is the ASCII
code for the letter K.
In practice, steps 1 and 2 are merged into a single step: the bit
stream is built as an integer to start with, using bit-level operations.
Steps 3 and 4 are performed by the function code_to_char()
upon
receiving the END_OF_CHAR
symbol. The END_OF_WORD
symbol is simply
translated to an ASCII space.
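A sketch of the corresponding decode step is given below; it is not the actual source. It reuses the symbol_t type from the tokenizer sketch and assumes the morse_code table is kept in flash, which the 64 bytes of SRAM of the ATtiny13A more or less impose:

```c
#include <avr/pgmspace.h>
#include <stdint.h>

extern const uint16_t morse_code[59] PROGMEM;  /* generated by tools/mkcode.c */

static char code_to_char(uint16_t code)        /* steps 3 and 4 */
{
    for (uint8_t i = 0; i < 59; i++)
        if (pgm_read_word(&morse_code[i]) == code)
            return i == 0 ? '_' : (char)(i + 32);   /* index 0 maps to '_', not space */
    return 0;                                  /* unknown code: nothing to print */
}

static char decode(symbol_t sym)               /* steps 1 and 2, merged */
{
    static uint16_t code;                      /* code number under construction */
    static uint8_t nbits;                      /* bits accumulated so far */
    char c = 0;

    switch (sym) {
    case DOT:  code |= (uint16_t)1 << nbits; nbits += 1; break;   /* DOT  -> 1    */
    case DASH: code |= (uint16_t)2 << nbits; nbits += 2; break;   /* DASH -> 0, 1 */
    case END_OF_CHAR:
        c = code_to_char(code);
        code = 0; nbits = 0;                   /* start over for the next character */
        break;
    case END_OF_WORD:
        c = ' ';                               /* word gaps become ASCII spaces */
        break;
    default:                                   /* NO_SYMBOL */
        break;
    }
    return c;
}
```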
This technique allows for a very compact representation of the Morse code as an array of 59 16-bit integers:
- 54 of these integers encode all the Morse codes known by the program
- 5 zeros stand for characters with no known Morse representation (#, %, *, <, and >).
The morse_code
array was obviously not written by hand: it was
generated by the program mkcode.c, available in the tools
directory.
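The encoding is easy to reproduce on a desktop machine. The short program below is only in the spirit of mkcode.c, not its actual source; it computes the code number of a dot/dash string and reproduces the “K = 22” example above:

```c
#include <stdio.h>
#include <stdint.h>

static uint16_t encode(const char *elements)   /* e.g. "-.-" for K */
{
    uint16_t code = 0;
    uint8_t nbits = 0;
    for (const char *p = elements; *p; p++) {
        if (*p == '.') { code |= (uint16_t)1 << nbits; nbits += 1; }  /* DOT  -> 1    */
        else           { code |= (uint16_t)2 << nbits; nbits += 2; }  /* DASH -> 0, 1 */
    }
    return code;
}

int main(void)
{
    printf("K = %u\n", (unsigned)encode("-.-"));   /* prints "K = 22" */
    return 0;
}
```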
This is a software implementation of an asynchronous serial transmitter. The term “UART”, which stands for “universal asynchronous receiver-transmitter”, is a misnomer, since this is neither universal (it only supports 9600/8N1) nor a receiver. The term is nevertheless used here because it is familiar.
This transmitter is built on a 16-bit shift register. The function
uart_putchar()
is used to transmit a character: it takes the 8-bit
character to be transmitted, adds the start and stop bits, and puts it
in the shift register. The binary pattern in the shift register is then
0000001XXXXXXXX0
where XXXXXXXX
are the eight bits of the character, the rightmost 0
is the start bit, and the leftmost 1 is the stop bit. Once the shift
register has been initialized, the interrupt TIM0_COMPB
is enabled,
which then fires at the serial bit rate.
The bits in the shift register are then sent, one at a time and least significant bit first, by the corresponding interrupt service routine. Once the stop bit has been shifted out, the shift register is all zeros, at which point the interrupt is disabled.
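A plausible sketch of this mechanism is given below; the uart_shift_reg name and the TX pin (assumed here to be PB3, initialized high and as an output elsewhere) are assumptions:

```c
#include <avr/io.h>
#include <avr/interrupt.h>
#include <stdint.h>

/* OCR0B is assumed to have been set (<= TIMER_TOP) so COMPB matches once per period. */
static volatile uint16_t uart_shift_reg;      /* 0 means the transmitter is idle */

static void uart_putchar(char c)
{
    /* Morse keying is far slower than 9600 baud, so the previous character has
     * always been fully shifted out by the time a new one arrives. */
    uart_shift_reg = ((uint16_t)(uint8_t)c << 1) | 0x200;  /* 0000001XXXXXXXX0 */
    TIFR0  = _BV(OCF0B);                      /* discard any stale compare match */
    TIMSK0 |= _BV(OCIE0B);                    /* let TIM0_COMPB clock the bits out */
}

ISR(TIM0_COMPB_vect)                          /* runs once per bit, at 9600 Hz */
{
    if (uart_shift_reg & 1) PORTB |=  _BV(PB3);   /* output the least significant bit */
    else                    PORTB &= ~_BV(PB3);
    uart_shift_reg >>= 1;                     /* then shift it out */
    if (uart_shift_reg == 0)
        TIMSK0 &= ~_BV(OCIE0B);               /* all zeros: transmission complete */
}
```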
The main program does the required initializations, then sends the “invitation to transmit” code to the LED, then goes into an infinite data-processing loop. This loop is a straightforward implementation of the data pipeline: edge detector → tokenizer → decoder → UART:
```c
for (;;) {
    edge_t edge = get_edge();
    symbol_t sym = tokenize(edge);
    char c = decode(sym);
    if (c)
        uart_putchar(c);
}
```
Everything in this loop is non-blocking. Most of the time:

- get_edge() returns NO_EDGE
- tokenize() returns NO_SYMBOL
- decode() returns 0
- nothing is sent to the UART.
It is important that the loop is non-blocking because the state machines
get_edge()
and tokenize()
have to run often enough in order to
properly handle their timeout-triggered transitions.