diff --git a/intro_to_micros_notes.pdf b/intro_to_micros_notes.pdf index 2e54579..423a7fb 100644 Binary files a/intro_to_micros_notes.pdf and b/intro_to_micros_notes.pdf differ diff --git a/tex/adc.tex b/tex/adc.tex index d62c838..4377145 100644 --- a/tex/adc.tex +++ b/tex/adc.tex @@ -1,7 +1,7 @@ \chapter{Analogue to Digital Converter} Before discussing an Analogue to Digital Converter (ADC), first consider the simple GPIO pin configured as an input. -The pin `reads' the voltage applied to it and produces a binary number (a 1 or a 0) which indicates whether the applied voltage is a high or low. Technically this GPIO pin is an analogue to digital converter: it takes an analogue voltage and produces a binary number representing that voltage. +The pin "reads" the voltage applied to it and produces a binary number (a 1 or a 0) which indicates whether the applied voltage is a high or low. Technically this GPIO pin is an analogue to digital converter: it takes an analogue voltage and produces a binary number representing that voltage. However, having only 1 bit to represent the voltage applied to the pin means that we get a very poor approximation of the voltage. For example, we cannot tell the different between 2 V and 3 V being applied to the pin: both of those voltages are considered a logic 1. For this reason we do not typically refer to a GPIO pin as an ADC, rather we refer to it as a digital input. @@ -13,11 +13,11 @@ \section{Transfer Function} A transfer function is the mathematical relationship between the input voltage and the output value of the ADC. Different ADCs with different architectures have different transfer functions. What is discussed here is the transfer function for the ADC used inside our STM32F051 microcontroller. -First note that because we have a finite number of possible numberical outputs of the ADC which must map to the full voltage range which the ADC operates over, each numerical output of the ADC corresponds to a \emph{range} of input voltage. +First note that because we have a finite number of possible numerical outputs of the ADC which must map to the full voltage range which the ADC operates over, each numerical output of the ADC corresponds to a \emph{range} of input voltages. The number of bits determines the number of possible numerical outputs or \emph{quantization intervals} which the system has. A quantisation interval is an input voltage range which produces a certain digital output. The supply rail is divided up into \(N = 2^M\) quantisation intervals where \(M\) is the number of bits of the ADC. A change of one \emph{least significant bit} of the value of the digital output corresponds to going to the next quantisation interval. -As stated, there are \(N\) quantisation intervals. An ADC will typically digitise voltages between \SI{0}{\volt} (ground) and some upper limit, \(V_{ref}\) which is often set to the supply voltage of the microcontroller. If we have a reference voltage \(V_{ref}\) volts, then each quantisation interval is \(Q = \frac{V_{ref}}{N}\) volts wide. +As stated, there are \(N\) quantisation intervals. An ADC will typically digitise voltages between \SI{0}{\volt} (ground), and some upper limit \(V_{ref}\), which is often set to the supply voltage of the microcontroller. If we have a reference voltage \(V_{ref}\) volts, then each quantisation interval is \(Q = \frac{V_{ref}}{N}\) volts wide. We realise that each digital output corresponds to a \emph{range} of input voltages. How big is the range?
Seeing as the ADC can only work in multiple of a lsb, the range must be one lsb wide. The ADC in the STM32 has been structured such that the midpoint of the range which produced a digital output of \(k\) is equal to \(V_{k} = k \times \frac{V_{ref}}{N}\). Hence, the range of input voltage corresponding to a digital output of \(k\) is: \(kQ - 0.5Q\) to \(kQ + 0.5Q\). Intuitively, this is \(k\) lsbs with half a lsb uncertainty each way. @@ -28,14 +28,14 @@ \section{Transfer Function} \begin{figure} \centering \includegraphics[width=0.5\textwidth]{adc_transfer.png} - \caption{Example graph of ADC transfer function for 3-bit ADC} + \caption{Example graph of ADC transfer function for 3-bit ADC.} \label{fig:adc-transfer-graph} \end{figure} \section{Example Calculation} -For an example of this, let's consider the case of a 3 bit ADC running off of a Vref of \SI{4.0}{\volt}. One lsb has a value of \(V_{lsb} = \frac{\SI{4}{\volt}}{2^3} = \SI{0.5}{\volt}\). -Half a lsb, or the uncertainy around each value is hence \(\frac{\SI{0.5}{\volt}}{2} = \SI{0.25}{\volt}\) +For an example of this, let's consider the case of a 3 bit ADC running off of a \(V_{ref}\) of \SI{4.0}{\volt}. One lsb has a value of \(V_{lsb} = \frac{\SI{4}{\volt}}{2^3} = \SI{0.5}{\volt}\). +Half a lsb, or the uncertainty around each value is hence \(\frac{\SI{0.5}{\volt}}{2} = \SI{0.25}{\volt}\) Hence, the input voltage range for a digital output of \(k\) is equal to \((k \times \SI{0.5}{\volt}) \pm \SI{0.25}{\volt}\). All input/output values for this example ADC are shown in \autoref{tab:3-bit-adc}. \\ @@ -53,20 +53,20 @@ \section{Example Calculation} 110 & 3.0 & 2.75 to 3.25 \\ 111 & 3.5 & 3.25 and above\\ \end{tabu} -\caption{Numerical output vs applied voltage band for a 3 bit ADC running off of \SI{4}{\volt}} +\caption{Numerical output vs applied voltage band for a 3 bit ADC running off of \SI{4}{\volt}.} \label{tab:3-bit-adc} \end{table} We see, therefore, that in general to calculate the corresponding input voltage range for a certain digital output: -\begin{itemize} +\begin{enumerate} \item Calculate the size of each quantisation interval (aka: value of one lsb): \(Q = \frac{V_{ref}}{2^M}\) volts. \item The output \(k\) means we go up \(k\) quantisation intervals to the midpoint: \(kQ\) \item Add the uncertainty each way, half a lsb: \(\pm 0.5Q\) -\end{itemize} +\end{enumerate} \section{ADC errors and calibration} Inside the STM32F051 is a Switched Capacitor Successive Approximation Register Analogue to Digital Converter (SC SAR ADC). -This ADC architecture consists of an array of capacitors, which can be selectively switched to GND or Vref. +This ADC architecture consists of an array of capacitors, which can be selectively switched to GND or \(V_{ref}\). Each capacitor has a binary relationship to the next one, meaning that it's half the size. In other words, the first cap has a value of \(C\), the second has a value of \(\frac{C}{2}\), the third has a value of \(\frac{C}{4}\), then \(\frac{C}{8}\) etc. The total capacitance of the array is typically a few picofarads. @@ -76,7 +76,7 @@ \section{ADC errors and calibration} \centering \includegraphics[page=5, clip=true, trim=75 360 75 280, width=\textwidth]{CD00211314.pdf} % left, bottom, right, top -\caption{Internal workings of a SC SAR ADC. Note the analogue components: capacitors and comparator} +\caption{Internal workings of a SC SAR ADC. 
Note the analogue components: capacitors and comparator.} \label{fig:adc-component-level} \end{figure} @@ -116,14 +116,14 @@ \subsection{Enabling} \subsection{Channel} The ADC is not limited to just reading from one fixed pin; it can select which pin to use from a number of possible sources, or \emph{channels}. -The ADC\_CHSELR register controls which channel is selected. On our device there are 10 different pins which can be selected for use as an ADC channel. In order to see which ADC channel a pin corresponds to, consult Table 13 of the Datasheet. This table shows which ADC channel a pin is connected to in the `Additional functions' column. For example, PB1 is connected to ADC channel 9. +The ADC\_CHSELR register controls which channel is selected. On our device there are 10 different pins which can be selected for use as an ADC channel. In order to see which ADC channel a pin corresponds to, consult Table 13 of the Datasheet. This table shows which ADC channel a pin is connected to in the "Additional functions" column. For example, PB1 is connected to ADC channel 9. -Be careful not to set multiple channels in the ADC\_CHSELR simultaneously. If you do this, the ADC will scan through each of the channels. Unless you know what you're doing, this is probably not what you want and it will confuse you. +\emph{Be careful not to set multiple channels in the ADC\_CHSELR simultaneously. If you do this, the ADC will scan through each of the channels. Unless you know what you're doing, this is probably not what you want and it will confuse you.} \subsection{Pin Mode} We know that by default a pin will operate in \emph{Input} mode. That is, it will digitise the applied voltage to a 1 bit number which will set/clear a bit in the GPIOx\_IDR. The component which does the digitising is the Schmitt trigger. -This is now how we want the pin to function when using an ADC. When we want to use a pin as an ADC channel, we want the raw analogue voltage to be passed on to the ADC for digitising. In order to achieve this, the pin should be put into \emph{Analogue} mode. Here, the Schmitt trigger is disabled and the pin is made accessible to analogue peripherals (such as the ADC). The structure of a pin in analogue mode can be seen in \autoref{fig:pin_analogue_mode}. Note the top of the diagram where it can clearly be seen that the raw analogue voltage is sent off to the ADC peripheral. +This is not how we want the pin to function when using an ADC. When we want to use a pin as an ADC channel, we want the raw analogue voltage to be passed on to the ADC for digitising. In order to achieve this, the pin should be put into \emph{Analogue} mode. Here, the Schmitt trigger is disabled and the pin is made accessible to analogue peripherals (such as the ADC). The structure of a pin in analogue mode can be seen in \autoref{fig:pin_analogue_mode}. Note the top of the diagram where it can clearly be seen that the raw analogue voltage is sent off to the ADC peripheral. \begin{figure} \centering @@ -141,7 +141,7 @@ \subsection{Pin Mode} \subsection{Resolution and Alignment} Our ADC can operate in one of 4 different resolutions: 6-bit, 8-bit, 10-bit or 12-bit. The resolutions which it will use is set by the RES bits in the ADC\_CFGR1. A higher resolution allows a better approximation of the real applied voltage, while a lower resolution will allow the ADC to perform the conversions faster as it has less work to do. -The numerical output of the ADC is made available in the ADC\_DR (data register). 
This register is 16 bits wide. So, how will the result of the ADC conversion (which is less than 16 bits) be presented in the ADC\_DR? This is a question of data alignment and is controlled by the ALIGN bit in the ADC\_CFGR1. The structure of the ADC\_DR for all combinations of resolution and alignment is shown in \autoref{fig:adc_res_align}. Which one should you use? Depends entirely on your application. +The numerical output of the ADC is made available in the ADC\_DR (data register). This register is 16 bits wide. So, how will the result of the ADC conversion (which is less than 16 bits) be presented in the ADC\_DR? This is a question of data alignment and is controlled by the ALIGN bit in the ADC\_CFGR1. The structure of the ADC\_DR for all combinations of resolution and alignment is shown in \autoref{fig:adc_res_align}. Which one should you use? This depends entirely on your application. \begin{figure} \centering @@ -161,7 +161,7 @@ \subsection{Performing Conversions} \item Wait for the EOC bit in the ADC\_ISR to go high \item Read the result from the ADC\_DR. The result is in a format as defined by the resolution and alignment. \end{enumerate} -Note that by reading from the ADC\_DR the EOC flag is automatically cleared. If you do not read the contents of the ADC\_DR the EOC flag will not be cleared which may cause issues the next time to start a conversion. +Note that by reading from the ADC\_DR the EOC flag is automatically cleared. If you do not read the contents of the ADC\_DR the EOC flag will not be cleared which may cause issues the next time you try to start a conversion. diff --git a/tex/branching.tex b/tex/branching.tex index 56d7b93..dd6fefd 100644 --- a/tex/branching.tex +++ b/tex/branching.tex @@ -2,7 +2,7 @@ \chapter{Branching} Branching refers to the ability to alter the order of execution of code. Ordinarily the instructions which are coded and then placed into flash are executed sequentially: one after the other in the order which they appear in flash. However, this is highly limiting. Branching allows us to execute instructions which can cause the CPU to jump to executing any instruction in the program (sort of). \section{Implementation of a Branch} -Seeing as the program counter (PC) entirely specifies which instruction is going to be executed next (by holding the address of the instruction) it is relatively simple in concept to get the CPU to execute a specific instruction: write the address of that instruction to the PC. Unfortunately there is a complication. +Seeing as the program counter entirely specifies which instruction is going to be executed next (by holding the address of the instruction), it is relatively simple in concept to get the CPU to execute a specific instruction: write the address of that instruction to the PC. Unfortunately there is a complication. Due to our instructions being 16 bits wide, it is not possible to hold the address of an instruction to branch to as immediate data seeing due to addresses being 32 bits (you can't fit 32 bits of operand into a 16 bit instruction!). To overcome this, a technique called relative branching is employed. This means that the address of the instruction which the CPU branches to is equal to the contents of a certain register plus or minus a certain amount. Seeing as the PC is already pointing to the general area in memory where instructions live, the PC is most often use as the base address register. 
This means that the branch instruction causes the PC to take on a value equal to the current value of the PC plus/minus some amount. diff --git a/tex/c_intro.tex b/tex/c_intro.tex index d804639..95b799f 100644 --- a/tex/c_intro.tex +++ b/tex/c_intro.tex @@ -1,12 +1,12 @@ \chapter{Introduction to C} \section{Advantages} -C is a programming language which was conceived in the early 1970's. +C is a programming language which was conceived in the early 1970s. C is a more abstract language than assembly. -That is to say, it hides a lot of the complexity associated with assembly, and allows the programmer two write code which is more focused on the task which must be carried out rather than the details of how it is carried out. +That is to say, it hides a lot of the complexity associated with assembly, and allows the programmer to write code which is more focused on the task which must be carried out rather than the details of how it is carried out. This abstracted nature of C provides us with the main advantage of C over assembly: portability. C is abstracted away from machine code, so we don't have to worry about instruction sets when writing our code. -The C compiler figures out the correct assembly instructions to carry out the operation you want to do and generating the assembly code to do it. +The C compiler figures out the correct assembly instructions to carry out the operation you want to do and generates the assembly code to do it. There are so many different instruction sets out there. Every manufacturer of microprocessors has their own instruction set. @@ -33,10 +33,10 @@ \section{Advantages} The balance between portability and performance is one of the main reasons why C is still one of most popular language today\footnote{As per \url{http://spectrum.ieee.org/computing/software/the-2015-top-ten-programming-languages}}, over 40 years after it was thought up. \section{Safety} -C and assembly are both unsafe programming languages. That is, they access to arbitrary memory addresses. This can cause catastrophic system failures if you (for example) accidentally overwrite some critical data such as the return address of a stack frame. +C and assembly are both unsafe programming languages. That is, they allow access to arbitrary memory addresses. This can cause catastrophic system failures if you (for example) accidentally overwrite some critical data such as the return address of a stack frame. The C compiler does attempt to provide warnings in the event that it detect that you are doing strange things, such dereferencing an uninitialised pointer or passing variables of the incorrect type to a function. However, the compiler is not able to warn for a subset of potentially dangerous actions and these warnings are often just ignored by sloppy programmers. -You may argue what it would be better to work with a safe programming language which does not allow us to create pointers to arbitrary memory addresses. However, for working with microcontroller it is essential that we do have the ability to modify specific memory addresses as that is how we interface with the peripherals. +You may argue that it would be better to work with a safe programming language which does not allow us to create pointers to arbitrary memory addresses. However, for working with microcontrollers it is essential that we do have the ability to modify specific memory addresses as that is how we interface with the peripherals.
In essence, we require the language to allow us full control over our system but this means we must program carefully or we risk creating faults which could be very difficult to track down. diff --git a/tex/c_operators.tex b/tex/c_operators.tex index 68e5d09..f2f1b8a 100644 --- a/tex/c_operators.tex +++ b/tex/c_operators.tex @@ -41,6 +41,7 @@ \subsection{Bitwise Logic} \textasciicircum & bitwise XOR. Binary\\ \textasciitilde & bitwise inverse (NOT). Unary\\ \end{tabu} +\caption{Commonly used bitwise operators.} \end{table} \subsection{Boolean Logic} @@ -55,6 +56,7 @@ \subsection{Boolean Logic} || & logical OR. Binary. \\ ! & logical inverse (NOT). Unary\\ \end{tabu} +\caption{Commonly used Boolean operators.} \end{table} \section{Bit shift Operations} diff --git a/tex/c_vars.tex b/tex/c_vars.tex index 29d50e1..fba28fb 100644 --- a/tex/c_vars.tex +++ b/tex/c_vars.tex @@ -45,7 +45,7 @@ \subsection{Basic Types} \subsection{Pointer Types} We often want to deal with the memory addresses of variables or access specific addresses (such as in the case of interfacing with peripherals). For this, we can use pointer types. A pointer holds the address of some data. In other words, it point to some data. The amount of data which is pointed to by the pointer is specified by the type. This means we need to specify two things when defining pointer types: we need to specify that the variable is a pointer type and we need to specify how much data is being pointed to. -We specify that it is a pointer using the asterisk. It is good practice to put the asterisk next to the variable name rather than next to the type. +We specify that it is a pointer by using the asterisk symbol. It is good practice to put the asterisk next to the variable name rather than next to the type. For example, a pointer to 1 byte of data would be defined as follows\footnote{We specify the type again in brackets to do an explicit type cast. This is just to prevent a compiler warning by telling the compiler that we are intentionally (rather than accidentally) assigning a number to a pointer. Without the typecast, we would be assigning a basic type to a pointer type which would generate a warning.}. \begin{lstlisting}[language=C] @@ -53,7 +53,7 @@ \subsection{Pointer Types} \end{lstlisting} Here, we are saying that the memory allocated to \texttt{foo} should point to the byte of data located at address. -Our address space runs from 0x0000 0000 to 0xFFFF FFFF which means that we need 4 bytes to store an address. Hence, no matter what the size of the data being pointed to is, a pointer type variable will always occupy 4 bytes in memory\footnote{This is architecture dependant though. A system which had a smaller or larger address space would require a different amount of data to be allocated for pointers.}. +Our address space runs from 0x0000 0000 to 0xFFFF FFFF which means that we need 4 bytes to store an address. Hence, no matter what the size of the data being pointed to is, a pointer type variable will always occupy 4 bytes in memory\footnote{This is architecture dependent though. A system which has a smaller or larger address space would require a different amount of data to be allocated for pointers.}. Once we have a variable of type pointer, we are able to access the data being pointed to by the pointer with the dereference operator which is also an asterisk! This means that the asterisk has two possible meanings depending on context. If it's used during a variable declaration, it means the variable is of pointer type.
If it is used on an existing variable of pointer type is means that we are accessing the data pointed to by the pointer, rather than accessing the pointer itself. @@ -112,7 +112,7 @@ \section{Statically Allocated Variables} A statically allocated is one which has a fixed spot in memory which is decided on at build time. The variable exists for the entire duration of the program and the location in memory which is allocated to that variable at compile time never changes or gets deallocated. No other variable will be allocated to that space. Static variables are either allocated outside of functions (ie: global variables) or are allocated inside a function with the \texttt{static} qualifier keyword. -At compile lime (well, technically link time), statically allocated variables get allocated memory at the \emph{start} of RAM. Statically allocated variables can be initialised to an explicit value or can have no initialisation value provided. If no initialisation values is provided, they will be default initialised to 0. Note that this memory initialisation must be performed before the code which access the variables gets run. This will be discussed more later. +At compile time (well, technically link time), statically allocated variables get allocated memory at the \emph{start} of RAM. Statically allocated variables can be initialised to an explicit value or can have no initialisation value provided. If no initialisation value is provided, they will be default initialised to 0. Note that this memory initialisation must be performed before the code which accesses the variables gets run. This will be discussed more later. \subsection{.data and .bss} Statically allocated variables are split up into two memory sections: the .data section for variables which are initialised to non-zero values and the .bss section for variables which are initialised to zero. @@ -164,9 +164,9 @@ \section{Code Implementation} \section{Arrays} An array is a group of 1\footnote{As per C99 you can have zero-length array or flexible length arrays.} or more elements, each being of the same data type. -The array occupies an amount of memory equal to the sum of the memory required for each element. An array should be define in one of two different ways: +The array occupies an amount of memory equal to the sum of the memory required for each element. An array should be defined in one of two different ways: \begin{enumerate} - \item Specifying the value for each element as an initialiser. In this case you do not need to specify the size of the array as it will be calculated by the compiler by counting the number if element values supplied. + \item Specifying the value for each element as an initialiser. In this case you do not need to specify the size of the array as it will be calculated by the compiler by counting the number of element values supplied. \item By specifying the size of the array and no initial values. Here the memory is allocated but no values is set for each element. If the array is a static variable, each element will be initialised to 0 by the startup code. If the array is an automatic variable each element will just be garbage by default.
\end{enumerate} @@ -174,7 +174,7 @@ \section{Arrays} \begin{lstlisting}[language=C] -// The compiler allocates 5 bytes sets each to the specified value +// The compiler allocates 5 bytes, each set to the specified value int8_t foo[] = {0xAA, 0x42, 0x69, 0x55, 0xF0}; // The compiler allocates 15 words and leaves values as default @@ -186,7 +186,7 @@ \section{Arrays} This means that you cannot write to the variable name, as it makes no sense to modify the address of element 0 after it has already been defined. Should you wish to access the element, you should first \emph{dereference} the variable which will give you access to the data. Should you wish to access element 1, you should add 1 to the address of element 0 and then deference that. -This will always work independent of the size of the elements of the array due to arithmetic operations applied to pointer types working in multiplies of the size of the data being pointed to by the pointer, as discussed earlier. +This will always work independently of the size of the elements of the array due to arithmetic operations applied to pointer types working in multiples of the size of the data being pointed to by the pointer, as discussed earlier. There is a nice shorthand for accessing elements of arrays; the square bracket operator. This operator just dereferences the pointer plus some number. The following two are equivalent. \begin{lstlisting}[language=C] @@ -197,7 +197,7 @@ \section{Arrays} Note that although the square bracket operator is most useful on array types, it is valid on any pointer types. \section{Structs} -An array is limited in that each element of the array must be of the same type. A struct seeks to overcome this by allowing you to create a custom data structure which has elements (now called `members') of different data types. Additionally, rather than accessing the members via an index as you would do for an array, you access the members via their names. +An array is limited in that each element of the array must be of the same type. A struct seeks to overcome this by allowing you to create a custom data structure which has elements (now called "members") of different data types. Additionally, rather than accessing the members via an index as you would do for an array, you access the members via their names. Our main use for structs will be in interfacing with peripherals. A peripheral can be modelled as a struct where each register has a name and is of a potentially different size to the other registers in the peripheral. @@ -220,7 +220,7 @@ \section{Structs} struct myStruct foo; \end{lstlisting} -The variable \texttt{foo} is now of a variable of type \texttt{struct myStruct}. What can you do with this variable? Well, just like how you can access the elements of an array, you can access the members of a struct. This is done via the member access operator, the full stop / dot. +The variable \texttt{foo} is now a variable of type \texttt{struct myStruct}. What can you do with this variable? Well, just like how you can access the elements of an array, you can access the members of a struct. This is done via the member access operator, the full stop/dot. \begin{lstlisting}[language=C] foo.member1 = 0xAABBCCDD; diff --git a/tex/coding.tex b/tex/coding.tex index 61c9604..f797b6f 100644 --- a/tex/coding.tex +++ b/tex/coding.tex @@ -2,7 +2,7 @@ \chapter{Coding} \section{Assembly} -In order to get the CPU to do some of what we've discussed above, it needs to have code loaded onto it to run.
We write code in a language called assembly. Assembly is a human-readable language. A program is made up of a sequence of instruction; each instruction gets executed by the CPU. It's quite easy to see what each instruction does by reading the program. The complete instruction set is located in the Programming Manual. You must be familiar with this document! Examples of instruction which carry out the tasks listed above are: +In order to get the CPU to do some of what we've discussed above, it needs to have code loaded onto it to run. We write code in a language called assembly. Assembly is a human-readable language. A program is made up of a sequence of instructions; each instruction gets executed by the CPU. It's quite easy to see what each instruction does by reading the program. The complete instruction set is located in the Programming Manual. You must be familiar with this document! Examples of instructions which carry out the tasks listed in \autoref{sec:programmer's_model_of_the_CPU} are: \begin{enumerate} \item \texttt{ADDS R6, R0, R1} \item \texttt{MOV R0, R3} @@ -10,15 +10,15 @@ \section{Assembly} \item \texttt{MOVS R5, \#42} \end{enumerate} -Our CPU has an instruction set which is around 55 instructions big. An expanded discussion of instruction sets can be found in \autoref{sec:instruction_sets} +Our CPU has an instruction set which is around 55 instructions big. An expanded discussion of instruction sets can be found in \autoref{sec:instruction_sets}. \section{Compiling} The CPU does not have the ability to understand our nice English words like \textit{ADD} or \textit{MOV}. The CPU only has the ability to understand binary data. Assembly code must be compiled to machine code. A machine code instruction is a binary string, 16 bits long consisting of the operation code (opcode) and the data which it must operate on (operand). -For example, assume that we wanted to ascertain the machine code representation of the instruction \texttt{ADDS R6, R0, R1}. An extract from the ARMv6-M Architecture Reference Manual is shown in \autoref{fig:adds_encoding} where Rd is the destination register and Rm and Rn are the source registers of the add. It can easily be seen that the instruction would compile to\texttt{ 0001100 001 000 110 = 0x1846}. The fixed bits at the start of the instruction are the opcode. This tells the CPU it's an ADD instruction it must do. The other three sets of three bits are the operands which specify the registers which the CPU must use in the ADD instruction. +For example, assume that we wanted to ascertain the machine code representation of the instruction \texttt{ADDS R6, R0, R1}. An extract from the ARMv6-M Architecture Reference Manual is shown in \autoref{fig:adds_encoding} where \texttt{Rd} is the destination register and \texttt{Rm} and \texttt{Rn} are the source registers of the \texttt{ADD}. It can easily be seen that the instruction would compile to \texttt{0001100 001 000 110 = 0x1846}. The fixed bits at the start of the instruction are the opcode. This tells the CPU that it must perform an \texttt{ADD} instruction. The other three sets of three bits are the operands which specify the registers which the CPU must use in the \texttt{ADD} instruction. \begin{figure} \centering \includegraphics[width=0.7\textwidth]{./fig/adds_encoding.png} - \caption{An encoding of the ADDS instruction} + \caption{An encoding of the ADDS instruction.} \label{fig:adds_encoding} \end{figure} The opcodes for each instruction are detailed in the ARMv6-M Architecture Reference Manual.
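+
+As a rough illustration of how the opcode and operand fields pack together (this is only a sketch of this one encoding, not a general assembler), the following C snippet builds the 16-bit machine code for \texttt{ADDS Rd, Rn, Rm} from the opcode and the three register numbers, and prints \texttt{0x1846} for \texttt{ADDS R6, R0, R1}:
+\begin{lstlisting}[language=C]
+#include <stdint.h>
+#include <stdio.h>
+
+// ADDS (register) encoding: 0001100 | Rm (3 bits) | Rn (3 bits) | Rd (3 bits)
+static uint16_t encode_adds(uint16_t rd, uint16_t rn, uint16_t rm)
+{
+    return (uint16_t)((0x0C << 9) | (rm << 6) | (rn << 3) | rd);
+}
+
+int main(void)
+{
+    printf("0x%04X\n", (unsigned)encode_adds(6, 0, 1)); // prints 0x1846
+    return 0;
+}
+\end{lstlisting}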
@@ -36,12 +36,12 @@ \section{Executing Code} \section{Some Useful Instructions} \subsection{MOV} -MOV or the variant MOVS is useful for moving data within the CPU. The instruction can either be used to move (a better word would be 'copy') the contents of one register to another register or some immediate data encoded in the instruction into a register. There are hence two ways which the instrucion can be used. Either MOVS Rd, \#imm which will move the 8-bit number specified by \#imm into the destination register Rd. The 8-bit number will be moved into the LSB of the register and the other bits will be set to 0. Example: MOVS R0, \#0xAA. Or, the other way is between two registers. MOVS Rd, Rm will copy the contents of Rm into Rd. It will copy all 32 bits. +\texttt{MOV} or the variant \texttt{MOVS} is useful for moving data within the CPU. The instruction can either be used to move (a better word would be 'copy') the contents of one register to another register or some immediate data encoded in the instruction into a register. There are hence two ways in which the instruction can be used. Either \texttt{MOVS Rd, \#imm} which will move the 8-bit number specified by \texttt{\#imm} into the destination register \texttt{Rd}. The 8-bit number will be moved into the lsb of the register and the other bits will be set to 0. Example: \texttt{MOVS R0, \#0xAA}. Or, the other way is between two registers. \texttt{MOVS Rd, Rm} will copy the contents of \texttt{Rm} into \texttt{Rd}. It will copy all 32 bits. \subsection{LDR, STR} -LDR and STR copy data from memory into the CPU and from the CPU into memory respectively. Loading and storing are such key aspects of our CPU that they are discussed in their own chapter: \autoref{chap:load_store}. +\texttt{LDR} and \texttt{STR} copy data from memory into the CPU and from the CPU into memory respectively. Loading and storing are such key aspects of our CPU that they are discussed in their own chapter: \autoref{chap:load_store}. \subsection{ANDS, ORRS, EORS} -These are all bitwise operations which operate on the contents of registers. ANDS is a bitwise AND, ORRS is a bitwise OR, EORS and a bitwise exclusive OR. These three instructions all have the same format, for example: ANDS Rd, Rn, Rm where Rn and Rm are the two source registers which get anded together and Rd is the destination register where the result is stored. Note that Rd must be the same register as Rn. Hence, this instruction will always overwrite one of its source registers with the result. +These are all bitwise operations which operate on the contents of registers. \texttt{ANDS} is a bitwise \texttt{AND}, \texttt{ORRS} is a bitwise \texttt{OR}, \texttt{EORS} is a bitwise exclusive \texttt{OR}. These three instructions all have the same format, for example: \texttt{ANDS Rd, Rn, Rm} where \texttt{Rn} and \texttt{Rm} are the two source registers which get anded together and \texttt{Rd} is the destination register where the result is stored. Note that \texttt{Rd} must be the same register as \texttt{Rn}. Hence, this instruction will always overwrite one of its source registers with the result. TODO: expand this section or move this content into more appropriate sections. diff --git a/tex/conditional_branch.tex b/tex/conditional_branch.tex index 1da83c2..a06aa7e 100644 --- a/tex/conditional_branch.tex +++ b/tex/conditional_branch.tex @@ -4,13 +4,13 @@ \chapter{Conditional Branching} \section{Application Program Status Register} The APSR is a special CPU register.
It does not have a register number like the other registers and cannot be read or written by normal instructions. However this is a critically important register as it is the source of the conditions for the conditional branching. The APSR holds 4 flags: \begin{description} -\item[Negative (N):] Set if the result of the last operations has was negative. In other words, the msb was a 1. This flag only has a meaning when treating data as signed numbers. +\item[Negative (N):] Set if the result of the last operation was negative. In other words, the most significant bit (msb) was a 1. This flag only has a meaning when treating data as signed numbers. \item[Zero (Z):] Set if all bits of the last operations were 0. -\item[Carry/Borrow (C):] Set if an \emph{unsigned} overflow occurred. Ie: the actual result of the computation exceeded the bounds of the 32-bit register when treated as an unsigned number. -\item[Two's Compliment Overflow (V):] Set if a \emph{signed} overflow occurred. Ie: the actual result of the computation exceeded the bounds of the 32-bit register when treated as a signed number. +\item[Carry/Borrow (C):] Set if an \emph{unsigned} overflow occurred, i.e. the actual result of the computation exceeded the bounds of the 32-bit register when treated as an unsigned number. +\item[Two's Complement Overflow (V):] Set if a \emph{signed} overflow occurred, i.e. the actual result of the computation exceeded the bounds of the 32-bit register when treated as a signed number. \end{description} -Together, these flags provide us with an abundance of information about the result of computations. We are able to ascertain basically any information about the relationship between arbitrary numbers by examining these flags. Not all instructions set the APSR flags. It is necessary to examine the details of the instruction in the Programming Manual in order to see whether the instruction sets the flags. Furthermore it may be necessary to examine the detailed workings of the instruction in the ARMv6-M Reference Manual in order to see which flags are set and how the settings of those flags is determined. However, in general instructions which set the flags have an \texttt{S} at the end of their name. Again (in general) arithmetic operations set/clear all APSR flags while logic operations set/clear only the \texttt{N} or \texttt{Z} flags. +Together, these flags provide us with an abundance of information about the result of computations. We are able to ascertain basically any information about the relationship between arbitrary numbers by examining these flags. Not all instructions set the APSR flags. It is necessary to examine the details of the instruction in the Programming Manual in order to see whether the instruction sets the flags. Furthermore it may be necessary to examine the detailed workings of the instruction in the ARMv6-M Reference Manual in order to see which flags are set and how the setting of those flags is determined. However, in general, instructions which set the flags have an \texttt{S} at the end of their name. Again (in general) arithmetic operations set/clear all APSR flags while logic operations set/clear only the \texttt{N} or \texttt{Z} flags. \section{Overflow Flags} While the Z and N flags are simple to understand, the overflow flags (especially signed but also unsigned) are more tricky. Let's explore them in a bit of detail. @@ -25,34 +25,34 @@ \subsection{Unsigned Numbers and the C Flag} What happens if we attempt to exceed these limits? An overflow occurs.
If we attempt to perform a computation where the true result of the computation is outside of the limits imposed by the finite number of bits in the CPU an overflow occurs. This overflow of the unsigned limits is signalled by the CPU through setting the C flag high. \subsection{Signed Numbers and the V Flag} -We've just seen that when interpreting a sequence of bits as unsigned the minimum value is 0. This is often not sufficient as we may want the capability to represent negative numbers. Enter signed numbers. Here, the weights of the msb is $-(2^{n})$ while all the other bit keep their positive weights. +We've just seen that when interpreting a sequence of bits as unsigned the minimum value is 0. This is often not sufficient as we may want the capability to represent negative numbers. Enter signed numbers. Here, the weight of the msb is $-(2^{n-1})$ for an $n$-bit number, while all the other bits keep their positive weights. -This means we have different limits. The largest value which can be represented by a 32-bit signed number is when all of the positive bits are set and the negative bit is clear: 0x7FFF FFFF or 2 147 483 647. The smallest value which can be represented by a 32-bit signed number is when all of the positive bits are clear and the negative bit is set: 0x8000 0000 or -2 147 483 648. +This means we have different limits. The largest value which can be represented by a 32-bit signed number is when all of the positive bits are set and the negative bit is clear: 0x7FFFFFFF or 2 147 483 647. The smallest value which can be represented by a 32-bit signed number is when all of the positive bits are clear and the negative bit is set: 0x80000000 or -2 147 483 648. Again, what happens if we attempt to execute a computation where the actual result is outside of the limits of what the 32 bits can hold when interpreted as signed numbers? The CPU signals this error to us with the Two's compliment overflow flag: V. The CPU itself has absolutely no idea whether you as the programmer want to treat your data as signed or unsigned numbers. It just takes sequences of bits and performs arithmetic or logic operations on the bits. Hence, to cater for both possible cases (the bits should be treated as signed or the bits should be treated as unsigned) the CPU sets or clears both the C and V flag after computations. If you want your numbers to be treated as unsigned you should be interested in the state of the C flag. If you want your numbers to be treated as signed you should be interested in the V flag. \section{Compare Instruction} -One of the key instructions used in the context of conditional branching is the compare (\texttt{CMP}) instruction. This instruction essentially subtracts two values from each other, disregards the result but updates the flags depending on the result. \texttt{CMP} takes either two registers or a register and an immediate value as operands. The CMP instruction is most often used to set the conditions which the conditional branch will depend on. This is due to the fact that a subtraction tells us a lot about the relationship between two numbers. For example, if the result of a subtraction sets the zero flag we know that the numbers being compared (subtracted) have the same value. Similarly, if the result of the subtraction of B from A clears the V flag it tell us that A is larger than B when viewed as signed numbers. +One of the key instructions used in the context of conditional branching is the compare (\texttt{CMP}) instruction.
This instruction essentially subtracts two values from each other, disregards the result but updates the flags depending on the result. \texttt{CMP} takes either two registers or a register and an immediate value as operands. The \texttt{CMP} instruction is most often used to set the conditions which the conditional branch will depend on. This is due to the fact that a subtraction tells us a lot about the relationship between two numbers. For example, if the result of a subtraction sets the zero flag we know that the numbers being compared (subtracted) have the same value. Similarly, if the subtraction of B from A leaves the N and V flags equal, it tells us that A is greater than or equal to B when viewed as signed numbers. The format of the \texttt{CMP} instruction is one of: \begin{lstlisting}[fontadjust=true,frame=trBL] CMP Rn, Rm CMP Rn, #imm8 \end{lstlisting} -In the first case, the value of Rm is subtracted from Rn. In the seconds case, the 8-bit immediate number is subtracted from Rn. +In the first case, the value of \texttt{Rm} is subtracted from \texttt{Rn}. In the second case, the 8-bit immediate number is subtracted from \texttt{Rn}. \subsection{A note on the implementation of the subtract operation} -In order to minimize the hardware cost of the ALU circuitry, the subtract operations is implemented by adding the bitwise inverse of Rm to Rn, plus 1. You don't really have to worry about this other than to note that this implementation explains why the C or V flag is set when the numbers being compared are equal. For example, the subtraction of the number 42 from the number 42 corresponds to the addition of the numbers 42 and 4294967253 and 1. It should be apparent to you that this result is zero, but sets the carry flag. +In order to minimize the hardware cost of the ALU circuitry, the subtract operation is implemented by adding the bitwise inverse of \texttt{Rm} to \texttt{Rn}, plus 1. You don't really have to worry about this other than to note that this implementation explains why the C flag is set when the numbers being compared are equal. For example, the subtraction of the number 42 from the number 42 corresponds to the addition of the numbers 42 and 4294967253 and 1. It should be apparent to you that this result is zero, but sets the carry flag. \section{Condition Code Suffixes} -The branch (\texttt{B}) instruction is able to take optional condition code suffixes which specify whether or not the instruction will be executed depending on the state of the flags in the APRS. +The branch (\texttt{B}) instruction is able to take optional condition code suffixes which specify whether or not the instruction will be executed depending on the state of the flags in the APSR. These suffixes are shown in \autoref{fig:cc_suff}. A suffix can be appended to the \texttt{B} instruction to turn it into a conditional branch. For example, \texttt{BEQ} will be taken if the result of the last computation produced a zero result. Similarly, \texttt{BNE} will be taken if the result was non-zero. -The mnemonics for the suffixes are closely related to the compare operation. For example, the BGT (branch if greater than when treated as signed numbers) will be taken if the Rn operand of the CMP instruction is greater than the Rm operand when treated as signed numbers. This is why the CMP and B\{cc\} instructions go so well together. -Note that the mnemonic is testing how Rn is related to the immediate number or Rm. So if the condition is some arithmetic relationship, it's asking whether Rn has that property compared to Rm/imm.
+The mnemonics for the suffixes are closely related to the compare operation. For example, the \texttt{BGT} (branch if greater than when treated as signed numbers) will be taken if the \texttt{Rn} operand of the \texttt{CMP} instruction is greater than the \texttt{Rm} operand when treated as signed numbers. This is why the \texttt{CMP} and \texttt{B\{cc\}} instructions go so well together. +Note that the mnemonic is testing how \texttt{Rn} is related to the immediate number or \texttt{Rm}. So if the condition is some arithmetic relationship, it's asking whether \texttt{Rn} has that property compared to \texttt{Rm}/\texttt{imm}. \begin{figure} \centering @@ -63,11 +63,12 @@ \section{Condition Code Suffixes} \end{figure} \section{Branching Based on Individual Bits} + Consider the case where we want to take a branch conditional on the case of a push button being pressed or not pressed. A push button is connected to a single pin which constitutes a single bit in the GPIO\_IDR. Hence, we need a way to make our branch conditional on a single bit being high or low. Put another way, we want to exclude all of the other bits in the IDR from influencing the branch. -In order to achieve this we have to do two steps: +In order to achieve this we have to perform two steps: \begin{enumerate} -\item Mask out the bits which we are not interested in. Specifically, set them all to zero. This is done just as we saw earlier in \autoref{sec:set_clear_individual_bits}. We AND all of the bits with 0 except for the bit which we are interested in which we AND with 1. -\item Compare the result of the mask with 0. If the bit which we are interested in was 0 then the result of the AND will be 0. If the bit that we are interested in was 1 then the result of the AND will be non-zero. Note that this compare does not actually have to be done as the AND instruction sets or clears the zero flag. +\item Mask out the bits which we are not interested in. Specifically, set them all to zero. This is done as we will see later in \autoref{sec:set_clear_individual_bits}. We \texttt{AND} all of the bits with 0 except for the bit which we are interested in which we \texttt{AND} with 1. +\item Compare the result of the mask with 0. If the bit which we are interested in was 0 then the result of the \texttt{AND} will be 0. If the bit that we are interested in was 1 then the result of the \texttt{AND} will be non-zero. Note that this compare does not actually have to be done as the \texttt{AND} instruction sets or clears the zero flag. \end{enumerate} After those two steps (which can actually just be one step) we can take a conditional branch dependant on whether a single bit (a single push button) was set or cleared. diff --git a/tex/cpu.tex b/tex/cpu.tex index 3ccc1d8..1d402e7 100644 --- a/tex/cpu.tex +++ b/tex/cpu.tex @@ -4,17 +4,18 @@ \chapter{The ARM Cortex-M0} The ARM Cortex-M0 CPU is certainly the most interesting block inside the STM32F051C6. This is where all processing happens, hence this is where the instructions which we write will run. It is therefore essential that we have an intricate understanding of the CPU so that we may write useful code for it. This chapter seeks to explore the CPU in some detail. 
\section{Programmer's Model of the CPU} +\label{sec:programmer's_model_of_the_CPU} \begin{figure}[t] \centering \includegraphics[width=0.9\textwidth]{./fig/programmers_model_v1.pdf} - \caption{A view of the internals of the STM32F051 with the ARM Cortex-M0 expanded} + \caption{A view of the internals of the STM32F051 with the ARM Cortex-M0 expanded.} \label{fig:prog_mod_v1} \end{figure} -A programmer's model is a representation of the inner workings of the CPU with sufficient detail to allows us to develop code for the CPU, but no unnecessary detail. The expanded view of the CPU which will now be discussed can be seen in \autoref{fig:prog_mod_v1}. This simple model of a CPU is a set of CPU registers, an Arithmetic and Logic Unit (ALU) and a control Unit. The CPU registers are blocks of storage each 32 bits wide which the CPU has the ability to operate on. Only data which is inside a CPU register can be operated on by the CPU. The ARM Cortex-M0 has 16 such registers. +A programmer's model is a representation of the inner workings of the CPU with sufficient detail to allow us to develop code for the CPU, but no unnecessary detail. The expanded view of the CPU which will now be discussed can be seen in \autoref{fig:prog_mod_v1}. This simple model of a CPU is a set of CPU registers, an Arithmetic and Logic Unit (ALU) and a Control Unit. The CPU registers are blocks of storage each 32 bits wide which the CPU has the ability to operate on. Only data which is inside a CPU register can be operated on by the CPU. The ARM Cortex-M0 has 16 such registers which are numbered R0 to R15. The ALU is that which performs the operations on the registers. It can take data from registers as inputs, do very basic processing and store the result in CPU registers. -The control unit manages execution by telling the ALU what to do. Together, the registers, ALU and control are able to execute instructions. +The Control Unit manages execution by telling the ALU what to do. Together, the registers, ALU and Control Unit are able to execute instructions. Examples of instructions which the CPU is able to execute: \begin{enumerate} \item adding the contents of R0 and R1 and storing the result in R6 @@ -53,9 +54,9 @@ \subsection{Three stage pipeline} \includegraphics[page=1, clip=true, trim=1mm 40mm 1mm 57mm, width=0.8\textwidth]{./fig/pipeline.pdf} \includegraphics[page=2, clip=true, trim=1mm 40mm 1mm 57mm, width=0.8\textwidth]{./fig/pipeline.pdf} \includegraphics[page=3, clip=true, trim=1mm 40mm 1mm 57mm, width=0.8\textwidth]{./fig/pipeline.pdf} -\caption{Showing three instructions being run through a three stage pipeline, as well as where the PC is pointing every cycle} +\caption{Showing three instructions being run through a three stage pipeline, as well as where the PC is pointing every cycle.} \label{fig:pipeline} \end{figure} \section{Reset Vector} -When the CPU starts up, where should it begin execution from? It could have a fixed location, perhaps the first address in flash which is defined to hold the first instruction to execute. This however would limit out flexibility. Very often we want other data to come before out instructions. Exactly what this other data is will be explored in more detail later, but suffice to say that it's useful to have flexibility to define where the first instruction is located. This is done with the reset vector. When it boots up, the CPU fetches number which it must initialise the PC to from the address 0x0800 0004.
This address is known as the reset vector as it points to the first instruction to be executed after reset. +When the CPU starts up, where should it begin execution from? It could have a fixed location, perhaps the first address in flash which is defined to hold the first instruction to execute. This however would limit our flexibility. Very often we want other data to come before our instructions. Exactly what this other data is will be explored in more detail later, but suffice to say that it's useful to have flexibility to define where the first instruction is located. This is done with the reset vector. When it boots up, the CPU fetches a number which it must initialise the PC to from the address 0x0800 0004. This address is known as the reset vector as it points to the first instruction to be executed after reset. diff --git a/tex/exceptions.tex b/tex/exceptions.tex index 9767801..87f3c22 100644 --- a/tex/exceptions.tex +++ b/tex/exceptions.tex @@ -11,11 +11,11 @@ \chapter{Exceptions} \end{figure} When an exception occurs, the CPU performs a few tasks in order to service the exception: -\begin{itemize} - \item Save the current 'system state' to the stack in the form of a stack frame. This is basically just pushing a few important registers to the stack. The exact format of the stack frame is shown in \autoref{fig:stack_frame}. +\begin{enumerate} + \item Save the current "system state" to the stack in the form of a stack frame. This is basically just pushing a few important registers to the stack. The exact format of the stack frame is shown in \autoref{fig:stack_frame}. \item Fetch the data from the vector associated with that exception that occurred and load that data into the PC. \item Start executing the block of instructions pointed to by the vector -\end{itemize} +\end{enumerate} Following is a discussion on some of the key exceptions. @@ -28,24 +28,24 @@ \chapter{Exceptions} \end{figure} \section{Reset} -There are a number of possible causes of a reset all detailed in section 7.1.2 of the Reference Manual. They key ones are a power reset where the power to the micro is cycles or a NRST pin reset where the Negative ReSeT pin is pulled low and then released. +There are a number of possible causes of a reset which are all detailed in section 7.1.2 of the Reference Manual. The key ones are a power reset, where the power to the micro is cycled, or an NRST pin reset, where the Negative ReSeT pin is pulled low and then released. When this exception occurs the microcontroller: -\begin{itemize} +\begin{enumerate} \item aborts execution of code, \item sets all registers to their default values, \item fetches the data from the reset vector, \item places that data into the PC and starts execution. -\end{itemize} +\end{enumerate} -The reset exception is fairly specialised in that is the only exception which does not cause the previous system state to be stacked. Quite the opposite in fact, it clears all state and begins fresh. +The reset exception is fairly specialised in that it is the only exception which does not cause the previous system state to be stacked. Quite the opposite in fact, it clears the system state and begins fresh. \section{HardFault} A HardFault occurs when an instruction attempts to do something illegal or a peripheral attempts to do an illegal memory transfer. This includes attempting to access unimplemented memory addresses or trying to perform unaligned memory access or trying to execute an instruction which has a non-existent opcode.
The full list of events which cause a HardFault exception are detailed in Table B1-6 of the ARMv6-M Architecture Reference Manual. Typically HardFaults are unrecoverable: when a HardFault happens there is generally something broken in the code and we do not want the code to carry on running. Rather we want to be made aware of the issue so that the code can be corrected. -When a HardFault happens the standard exception handling procedure takes place: the current state is stacked, the exception handler vector is fetched an executed. Due to the fact that the state is saved on the stack it is possible to return from the handler and resume execution of the main code but it would be unusual to want to do this due to the severity of a HardFault. +When a HardFault happens the standard exception handling procedure takes place: the current state is stacked, the exception handler vector is fetched and executed. Due to the fact that the state is saved on the stack it is possible to return from the handler and resume execution of the main code but it would be unusual to want to do this due to the severity of a HardFault. %\begin{overpic}[grid,page=23]{./stm32f0xx_programming_manual} diff --git a/tex/functions.tex b/tex/functions.tex index 23a9213..4eae901 100644 --- a/tex/functions.tex +++ b/tex/functions.tex @@ -8,7 +8,7 @@ \chapter{Functions} \begin{figure} \centering \includegraphics[scale=0.5]{./fig/function.pdf} -\caption{Function} +\caption{A block diagram of a function, with its inputs and output.} \label{fig:function} \end{figure} diff --git a/tex/gpio.tex b/tex/gpio.tex index 903fc28..724a154 100644 --- a/tex/gpio.tex +++ b/tex/gpio.tex @@ -1,9 +1,9 @@ \chapter{General Purpose Input/Outputs} -One of the simplest ways to interface the microcontroller with external circuitry is via GPIO. The ability for the microcontroller to communicate with external devices via GPIO pins is one of the defining differences between microcontrollers and microprocessors. +One of the simplest ways to interface the microcontroller with external circuitry is via General Purpose Input/Outputs (GPIO). The ability for the microcontroller to communicate with external devices via GPIO pins is one of the defining differences between microcontrollers and microprocessors. Most pins on the microcontroller are able to operate in GPIO mode. As the name implies, a GPIO pin can be either an input or an output. Additionally, a pin can be placed into an alternate function or analogue mode; these will be discussed later. -The microcontroller's GPIO pins are divided up into groups. Each group is known as a port and each port has a letter associated with it (PortA, PortB etc). Each port contains pins. The maximum number of pins which a port can contain is 16, but some ports contain as few as 2 pins. This means that the name which we assign to a pin is a combination of the port letter and the pin's number in that port. For example: Port A pin 7 refers to a specific pin (shortened to PA7). This naming scheme is useful as the name makes it clear how we interact with that pin. The ports are both a logical and physical division of the pins: all of the pins which belong to a certain port are controlled by a certain block of circuitry which manages that port. The name immediately tells us which block of circuitry our code should interface with in order to control that pin. +The microcontroller's GPIO pins are divided up into groups. Each group is known as a port and each port has a letter associated with it (PortA, PortB, etc). 
Each port contains up to 16 pins, though some ports contain as few as 2 pins. This means that the name which we assign to a pin is a combination of the port letter and the pin's number in that port. For example: Port A pin 7 refers to a specific pin (shortened to PA7). This naming scheme is useful as the name makes it clear how we interact with that pin. The ports are both a logical and physical division of the pins: all of the pins which belong to a certain port are controlled by a certain block of circuitry which manages that port. The name immediately tells us which block of circuitry our code should interface with in order to control that pin. A diagram showing how the pin is structured electrically inside the microcontroller is shown in \autoref{fig:gpio}. \begin{figure} @@ -18,35 +18,35 @@ \chapter{General Purpose Input/Outputs} %\end{overpic} \section{Pin Mode} -As mentioned, the pin can be in one of four possible modes: input, output, alternate function, analogue. There is a register which controlled which mode the pin operates in, known as the GPIOx\_MODER. The 32 bits of the register are divided up into pairs of bits where each pair of pits sets the mode for the associated pin. +As mentioned, the pin can be in one of four possible modes: input, output, alternate function, analogue. There is a register which controls which mode the pin operates in, known as the GPIOx\_MODER. The 32 bits of the register are divided up into pairs of bits where each pair of bits sets the mode for the associated pin. \subsection{Input Mode} -Input mode is the default mode for most pins. In this mode, the pin is measuring the voltage applied to it and ascertaining whether it is a logic 0 or a logic 1. This 'decision' is made by a Schmitt Trigger which has useful characteristics such as well defined high and low levels, hysteresis and high impedance. The logic level of each pin is latched on each clock cycle and written to the Input Data Register (GPIOx\_IDR). As each pin can only be considered to be either a logic high or a logic low, there is only 1 bit necessary to represent the state of a pin. +Input mode is the default mode for most pins. In this mode, the pin is measuring the voltage applied to it and ascertaining whether it is a logic 0 or a logic 1. This "decision" is made by a Schmitt Trigger which has useful characteristics such as well defined high and low levels, hysteresis and high impedance. The logic level of each pin is latched on each clock cycle and written to the Input Data Register (GPIOx\_IDR). As each pin can only be considered to be either a logic high or a logic low, only 1 bit is necessary to represent the state of a pin. \subsection{Output Mode} Here, the pin does not measure a logic level, but rather asserts a logic level. When in output mode, the pin will either assert a logic 0 allowing it to sink current from an external source, or assert a logic 1 allowing it to source current into an external sink. The logic level which is asserted is controlled by the Output Data Register (GPIOx\_ODR). Each bit in this register can be set or cleared by writing to the register. Additionally, the bits in this register can be set via the Bit Set and Reset Register (GPIOx\_BSRR). This register allows atomic (done in a single instruction) setting or clearing of individual bits in the ODR.
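To make the MODER/ODR/BSRR relationship concrete, here is a minimal C sketch which drives PB0 as an output. It assumes the CMSIS-style register definitions from \texttt{stm32f0xx.h} (the \texttt{GPIOB}, \texttt{RCC} and bit-mask names are that header's convention, not something defined in these notes), and it is illustrative rather than a complete program.
\begin{lstlisting}[language=C]
#include "stm32f0xx.h"  // CMSIS-style register definitions (assumed)

// Configure PB0 as a general purpose output and drive it high, low, then high.
void pb0_output_demo(void)
{
    RCC->AHBENR |= RCC_AHBENR_GPIOBEN;   // enable the clock to the Port B block

    // MODER uses a pair of bits per pin; 01 = general purpose output mode.
    GPIOB->MODER &= ~(0x3u << (0 * 2));  // clear the pair of bits for pin 0
    GPIOB->MODER |=  (0x1u << (0 * 2));  // set pin 0 to output mode

    GPIOB->ODR  |= (1u << 0);            // read-modify-write of the ODR: PB0 high
    GPIOB->BSRR  = (1u << 16);           // atomic reset of PB0 (upper half of BSRR)
    GPIOB->BSRR  = (1u << 0);            // atomic set of PB0 (lower half of BSRR)
}
\end{lstlisting}
The two BSRR writes are single store instructions, which is what makes them atomic; the ODR line is a read-modify-write and could in principle be interrupted part way through.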
-\subsection{A note on 'bricking' your micro} +\subsection{A note on "bricking" your micro} If you study your dev board circuit diagram carefully, you'll notice that PA13 and PA14 are connected to the debugger. These are the SWD data and SWD clock pins. By default, these pins are not configured as inputs. Rather, they are configured as Alternate Mode, which allows them to be connected to the SWD circuitry inside the STM32F051 and hence serve the purpose of transferring SWD traffic between the SWD peripheral and the ST-Link. If you look at section 9.4.1 of the reference manual, you'll see that in general the reset state of pins is input. Port A is however an exception. Its reset state is 0x2800 0000. This corresponds to all pins as inputs except PA13 and PA14 which are alternate mode. In order for these pins to be connected to the SWD circuitry, they must remain in Alternate Mode. If you set the pins to inputs, they will no longer serve as an interface for the SWD peripheral to the ST-Link. For this reason, you should under no circumstances modify the values of the bits at GPIOA\_MODER[26..29]. If you do accidentally set these pins to inputs, it becomes difficult to unset them. As soon as the micro boots up, your code will run and break connectivity with the debugger. The only way to fix this is to intercept the micro before it is able to boot up and erase your bad code from it. To do this, OpenOCD will be launched with some extra flags to prevent the micro booting up. The OpenOCD command should be executed while the micro is held in reset to ensure that the pins are back to their default reset state. -\begin{itemize} +\begin{enumerate} \item Hold down the reset button. This will force the micro into its reset state and prevent your code from running. - \item Launch OpenOCD with the extra command line arguments: -c init -c "reset halt" - \item About a quarter of a second after pressing 'enter' on that openocd command, release the reset button. OpenOCD should now manage to establish connection. - \item Connect GDB to OpenOCD. Run the GDB command: monitor flash erase\_sector 0 0 last + \item Launch OpenOCD with the extra command line arguments: \texttt{-c init -c "reset halt"} + \item About a quarter of a second after pressing "enter" on that OpenOCD command, release the reset button. OpenOCD should now manage to establish a connection. + \item Connect GDB to OpenOCD. Run the GDB command: \texttt{monitor flash erase\_sector 0 0 last} \item Your bad code should now be erased. Power cycle the board, and OpenOCD should be able to connect to it with the normal command. -\end{itemize} +\end{enumerate} \section{Pull resistors} -When a pin is set to input mode and there is no logic level applied to it, what value will the bit for that pin in the IDR take on? A logic 1 or a logic 0? Due to the high impedance nature of the pin and the presence of environmental noise, the level which is read from the pin will probably jump randomly between a logic 1 and logic 0. The fact that it's a high impedance input means that even very weak EM signals will cause a voltage to appear on the pin which will cause it to oscillate between logic levels. This is generally bad. In order to define a sort of 'default' level which the pin will read when no external signal is applied to it, internal pull up or pull down resistors are used. These resistors are selectively turned on or off using the +When a pin is set to input mode and there is no logic level applied to it, what value will the bit for that pin in the IDR take on?
A logic 1 or a logic 0? Due to the high impedance nature of the pin and the presence of environmental noise, the level which is read from the pin will probably jump randomly between a logic 1 and a logic 0. The fact that it's a high impedance input means that even very weak EM signals will cause a voltage to appear on the pin which will cause it to oscillate between logic levels. This is generally bad. In order to define a sort of "default" level which the pin will read when no external signal is applied to it, internal pull-up or pull-down resistors are used. These resistors are selectively turned on or off using the Pull-up/Pull-down Register (GPIOx\_PUPDR). \subsubsection{How to set or clear individual bits} \label{sec:set_clear_individual_bits} -There is often a case where you wish to modify only one or two of the bits of a port, leaving the rest of the pins unchanged. If you simply write a pre-defined value to the pins, it will force \emph{all} of them to take on a specific value. The way to modify only a single bit is to do a logic AND or OR of the contents of the register with a pre-defined pattern. An OR has the ability to set specific bits while leaving others unchanged, while and AND has the ability to clear certain bits while leaving the others unchanged. For example, say we wanted to set bits 1 and 2, while clearing bits 0, 3, 4 and 5, leaving the other bits of the port unchanged: +There is often a case where you wish to modify only one or two of the bits of a port, leaving the rest of the pins unchanged. If you simply write a pre-defined value to the pins, it will force \emph{all} of them to take on a specific value. The way to modify only a single bit is to do a logic \texttt{AND} or \texttt{OR} of the contents of the register with a pre-defined pattern. An \texttt{OR} has the ability to set specific bits while leaving others unchanged, while an \texttt{AND} has the ability to clear certain bits while leaving the others unchanged. For example, say we wanted to set bits 1 and 2, while clearing bits 0, 3, 4 and 5, leaving the other bits of the port unchanged: \begin{lstlisting}[fontadjust=true,frame=trBL] @ assuming Rn contains the address of the register to modify: LDR R0, [Rn] diff --git a/tex/instruction_sets.tex b/tex/instruction_sets.tex index c4ac3dd..f5f51b8 100644 --- a/tex/instruction_sets.tex +++ b/tex/instruction_sets.tex @@ -5,15 +5,15 @@ \chapter{Instruction Sets} \begin{figure} \centering \includegraphics[width=\textwidth]{./fig/ISA.png} - \caption{Cortex instruction set architecture} + \caption{Cortex instruction set architecture.} \label{fig:isa} \end{figure} \section{ARM} -The original instruction set used by ARM processors was called the ARM instruction set. This instruction set contains only 32-bit instructions. This is a powerful instruction set as almost all instruction can be conditionally executed. However, seeing as all instructions are fixed to 32 bits wide, the code density is fairly poor. This is due to comparatively simple instructions like a simple add or PC relative branch using wasteful 32 bits of flash. +The original instruction set used by ARM processors was called the ARM instruction set. This instruction set contains only 32-bit instructions. This is a powerful instruction set as almost all instructions can be conditionally executed. However, seeing as all instructions are fixed to 32 bits wide, the code density is fairly poor.
This is due to comparatively simple instructions like a simple add or PC relative branch using a wasteful 32 bits of flash. \section{Thumb} -In 1994 the ARM7TDMI architecture was released which featured the Thumb instruction set. This instruction set was limited to only 16-bit instructions. Obviously these instructions were less powerful as there was less room to specify information about the actions which an instruction should perform. However for simple instructions this was not an issue and resulted in programs being much smaller. For more complicated operations multiple Thumb instructions would be needed to perform the job of a single ARM instruction. +In 1994 the ARM7TDMI architecture was released which featured the Thumb instruction set. This instruction set was limited to only 16-bit instructions. Obviously these instructions were less powerful as there was less room to specify information about the actions which an instruction should perform. However for simple instructions this was not an issue and resulted in programs being much smaller. For more complicated operations, multiple Thumb instructions would be required to perform the job of a single ARM instruction. In order to have a combination of the performance of the 32-bit ARM instruction set and the code density of the 16-bit Thumb instruction set, an ability called \emph{interworking} was provided. Interworking allows the CPU to switch between executing Thumb instructions and ARM instructions. This is a useful ability but introduces some additional complexity into the system. @@ -25,8 +25,8 @@ \section{Thumb-2} \section{Implementation of Interworking} So we understand that Thumb processors can only execute 16-bit instructions. Thumb-2 processors can execute the Thumb instructions as well as 32-bit instructions. ARM processors can execute a totally different set of 32-bit instructions. Some processors can run both the ARM and Thumb instruction sets. So, how do we tell one of these interworking-capable processors whether an instruction is ARM or Thumb? -Firstly, note that data accesses must be aligned. Seeing as the minimum width of an instruction is 2 bytes, all instructions must be placed on addresses which are multiples of 2. As all addresses of instructions are multiples of 2, the lsb of the PC is always a 0. The bit is therefor sort of wasted. Hence, we assign a different purpose to this bit: when it is a 0 it indicates that the instruction pointed to by the PC is an ARM instruction. When it is a 1 it indicates that the instruction pointed to by the PC is a Thumb instruction. +Firstly, note that data accesses must be aligned. Seeing as the minimum width of an instruction is 2 bytes, all instructions must be placed on addresses which are multiples of 2. As all addresses of instructions are multiples of 2, the lsb of the PC is always a 0. That bit is therefore sort of wasted. Hence, we assign a different purpose to this bit: when it is a 0 it indicates that the instruction pointed to by the PC is an ARM instruction. When it is a 1 it indicates that the instruction pointed to by the PC is a Thumb instruction. -Although the Cortex series of CPUs does not support the ARM instruction set, is still requires that this rule of using the lsb of the PC to specify instruction set type is adhered to. Seeing as all instructions for the Cortex series (including our CPU) are Thumb or Thumb-2, this lsb of the PC should always be set to a 1. That is why our reset vector needs to point to the address of \texttt{\_start} +1.
The +1 forces the lsb to a 1 indicating that the instruction at \texttt{\_start} is a Thumb instruction. +Although the Cortex series of CPUs does not support the ARM instruction set, it still requires that this rule of using the lsb of the PC to specify instruction set type is adhered to. Seeing as all instructions for the Cortex series (including our CPU) are Thumb or Thumb-2, this lsb of the PC should always be set to a 1. That is why our reset vector needs to point to the address of \texttt{\_start} +1. The +1 forces the lsb to a 1 indicating that the instruction at \texttt{\_start} is a Thumb instruction. If a vector attempts to set the lsb of the PC to 0, the CPU will HardFault as it would be trying to execute an instruction from an instruction set which is not supported. diff --git a/tex/load-store.tex b/tex/load-store.tex index ebc8384..6bf602f 100644 --- a/tex/load-store.tex +++ b/tex/load-store.tex @@ -3,7 +3,7 @@ \chapter{Loading and Storing} Loading is the process of getting data from somewhere in the memory space into the CPU registers so that it can be used in processing. Storing is the process of getting data which is in the CPU registers into memory. Remember that seeing as flash is read-only memory, we cannot store data to a flash address, but we can store to RAM. -The general format for a load is that a destination register, a register containing a base address, and an offset are supplied. An effective address is then calculated as the base address plus the offset. The contents of memory at the effective address are then copied from memory into the destination CPU register. When we do this we are treating a register as a \emph{pointer}. When we regard the contents of a register as a memory address and use that register to access data in memory we are dereferencing a pointer: accessing the data pointed to by a pointer. This is an important concept! +The general format for a load operation is that a destination register, a register containing a base address, and an offset are supplied. An effective address is then calculated as the base address plus the offset. The contents of memory at the effective address are then copied from memory into the destination CPU register. When we do this we are treating a register as a \emph{pointer}. When we regard the contents of a register as a memory address and use that register to access data in memory we are dereferencing a pointer: accessing the data pointed to by a pointer. This is an important concept! A store operation is very similar. Again, a register containing a base address and an offset are supplied, but this time it is a source register, not a destination register, which is supplied. Again, an effective address of base plus offset is calculated. The contents of the source register are copied to memory at the effective address. @@ -24,7 +24,7 @@ \section{Immediate Offset Loading} \subsection{Offset restrictions} \label{sec:load-store-restrictions} -Remember that all instructions are limited to 16 bits. The format of the LDR instruction in machine code is shown in \autoref{fig:ldr}. We can see that after 5 bits of opcode and $2 \times 3 = 6$ bits of register specifications, we are only left with 5 bits of offset. Normally, these 5 bits would only allow us to provide an offset of $2^5 - 1 = 31$ bytes. This is not very much! In order to extend the range of the 5 offset bits, the actual offset used is equal to the 5 bit immediate number multiplied by four.
This multiplication by four is the same as appending two zeros to the end of the binary value, which you can see is being done in \autoref{fig:ldr}. This means that the amount which we are able to offset a base address by is now $(2^5 - 1) \times 4 = 124$, which is significantly more useful. However, seeing as we are multiplying to immediate number by four to get the actual offset, the implication is that all offsets \emph{must} be a multiple of four. +Remember that all instructions are limited to 16 bits. The format of the LDR instruction in machine code is shown in \autoref{fig:ldr}. We can see that after 5 bits of opcode and $2 \times 3 = 6$ bits of register specifications, we are only left with 5 bits of offset. Normally, these 5 bits would only allow us to provide an offset of $2^5 - 1 = 31$ bytes. This is not very much! In order to extend the range of the 5 offset bits, the actual offset used is equal to the 5 bit immediate number multiplied by four. This multiplication by four is the same as appending two zeros to the end of the binary value, which you can see is being done in \autoref{fig:ldr}. This means that the amount which we are able to offset a base address by is now $(2^5 - 1) \times 4 = 124$, which is significantly more useful. However, seeing as we are multiplying the immediate number by four to get the actual offset, the implication is that all offsets \emph{must} be a multiple of four. The compiler automatically takes care of dividing whatever offset we supply in our assembly instruction by four in order to get it to fit into the 5 bit immediate number, and the CPU then multiplies the immediate number by four to get the offset. For example: if we wanted an offset of 12, the immediate number which would be placed in the instruction by the compiler would be 3. @@ -52,7 +52,7 @@ \section{Register Offset Loading} \end{lstlisting} \section{Storing} -The storing commands are so similar to the loading that they will barely be discussed. One difference is that there is no PC-relative store, as there would be no point trying to store data to read-only memory. The store instruction takes moves the contents of a source register, \texttt{Rt}, and places it at the effective memory address equal to the base address, \texttt{Rn}, plus an offset either supplied as a 5-bit immediate number, \texttt{\#imm5}, or in an offset register, \texttt{Rm}. +The storing commands are so similar to the loading commands that they will barely be discussed. One difference is that there is no PC-relative store, as there would be no point trying to store data to read-only memory. The store instruction takes the contents of a source register, \texttt{Rt}, and places it at the effective memory address equal to the base address, \texttt{Rn}, plus an offset either supplied as a 5-bit immediate number, \texttt{\#imm5}, or in an offset register, \texttt{Rm}. \begin{lstlisting}[fontadjust=true,frame=trBL] STR Rt, [Rn, #imm5] diff --git a/tex/loops.tex b/tex/loops.tex index 118118c..c9031fa 100644 --- a/tex/loops.tex +++ b/tex/loops.tex @@ -33,7 +33,7 @@ \section{while} \section{for} The first example of the \texttt{while} loop shown above is so commonly used that another loop has been designed to make that case easier to write. This is the \texttt{for} loop.
-The case which it is useful for is when you have +The case which it is useful for is when you have: \begin{itemize} \item some sort of counter variable which is initialised to a starting value, \item a condition which is often based on the counter value, and @@ -67,7 +67,7 @@ \section{for} } \end{lstlisting} -Note that as per C99 you can define variables in the initialisation part of the \texttt{for} loop but I prefer to be ANSI C compliant and define my variables before the loop. +Note that as per C99 you can define variables in the initialisation part of the \texttt{for} loop but I prefer to be ANSI C compliant and define my variables before the \texttt{for} loop. It should now make some sort of sense how our infinite loop works; it has three empty parameters. Basically it says: don't do any initialisations, don't be based on any conditions (run forever) and don't do anything specific after every iteration. diff --git a/tex/memory.tex b/tex/memory.tex index 9922217..a7da9b0 100644 --- a/tex/memory.tex +++ b/tex/memory.tex @@ -1,12 +1,12 @@ \chapter{Memory Model} -We will now beging to expand on some of the block is \autoref{fig:prog_mod_v0}. Before starting to explore how the CPU works, it's useful to have an understanding of how memory is laid out. We will start looking at the flash and RAM blocks. Together with another block called peripherals (which we will explore later), these blocks make up memory. +We will now begin to expand on some of the blocks in \autoref{fig:prog_mod_v0}. Before starting to explore how the CPU works, it's useful to have an understanding of how memory is laid out. We will start looking at the flash and RAM blocks. Together with another block called peripherals (which we will explore later), these blocks make up memory. It's important to note that this memory is located \emph{outside} of the CPU, but still inside the microcontroller IC. The memory of a device can be thought of as a very long row of post boxes along a street. Each post box has an address, and each post box can have data put into it or taken out. The amount of data that each post box can hold is 8 bits, or one byte. Therefore, each memory address is said to address one byte. The address of each post box is 32 bits long, meaning that addresses range from 0 (0x00000000) to just over 4.3 billion (0xFFFFFFFF). In actual fact, the \emph{vast} majority of these addresses do not have a post box at them. These addresses are said to be unimplemented. Only very small sections of this address space are implemented and can actually be read from or written to. -Flash and RAM are contiguous blocks of memory, with a start address and an end address. A simplified memory map of the STM32F051 is shown in \autoref{fig:memory_map}. From this, we can see that if we want to use changeable variables in our programs, the variables should be located at addresses between 0x2000 0000 and 0x2000 1FFF. +Flash and RAM are contiguous blocks of memory, with a start address and an end address. A simplified memory map of the STM32F051 is shown in \autoref{fig:memory_map}. From this, we can see that if we want to use changeable variables in our programs, the variables should be located at addresses between 0x2000 0000 and 0x2000 1FFF.
If we want to load code onto the micro which should not be lost when the device loses power, the code should be loaded into the non-volatile memory, flash, which has addresses between 0x0800 0000 and 0x0800 7FFF. If we want the ability to modify data during the execution of our program, the data should be placed in the read/write section of memory, RAM. \begin{figure} @@ -51,7 +51,7 @@ \section{Data Types and Endianness} 1 & 0xBB \\ 0 & 0xAA \\ \end{tabu} -\caption{Layouts of the word 0xAABBCCDD in memory at effective address 0, according to little or big endian format} +\caption{Layouts of the word 0xAABBCCDD in memory at effective address 0, according to little or big endian format.} \label{tab:endianness} \end{table} diff --git a/tex/multi-file-projects.tex b/tex/multi-file-projects.tex index 1372e04..7b590c5 100644 --- a/tex/multi-file-projects.tex +++ b/tex/multi-file-projects.tex @@ -36,7 +36,7 @@ \section{Static Visability} Seeing as the name foo becomes visible to the entire project when not defined as static, it may be helpful to prefix it with some characters to make it clear which file in the project it comes from. I find that calling my globally visible function myFile\_foo() works well. \\ -Exactly the same applies for global variables. As with functions, by default a variable defined in one file can be accessed by all files in the project. Again, as with functions, this can be dangerous if it's not what you expect. You should use the static keyword to make global variables only visible to the function they are defined in unless you really need them to be access from other files (not recommended). +Exactly the same applies for global variables. As with functions, by default a variable defined in one file can be accessed by all files in the project. Again, as with functions, this can be dangerous if it's not what you expect. You should use the static keyword to make global variables only visible to the file they are defined in unless you really need them to be accessible from other files (not recommended). \begin{lstlisting}[language=C] static uint8_t var1; // only visible in this file @@ -47,7 +47,7 @@ \section{Headers and Source} Each source file is compiled separately to an object file with relative addressing. -At link time, the BL instructions have their references resolved. That's basically the only change that the linker makes to the object code: resolving references to other files\footnote{Of course, the other critical thing which the linker does is to combine input sections to output sections and allocated absolute addresses, but that doesn't change the machine code}. +At link time, the \texttt{BL} instructions have their references resolved. That's basically the only change that the linker makes to the object code: resolving references to other files\footnote{Of course, the other critical thing which the linker does is to combine input sections into output sections and allocate absolute addresses, but that doesn't change the machine code}. The linker does not and cannot check that the call you're making to another file is done correctly in terms of arguments and return value. This must be done by the compiler. How? Header files should be written. A header file is a collection of declarations for all of the symbols which the corresponding source file adds to the project namespace. @@ -70,9 +70,9 @@ \subsection{How a \#include works} This is typically a very verbose format which a programmer may not enjoy writing.
The compiler should not have to do any work other than compiling. In order to achieve this, there is first a program called a C \emph{preprocessor} which runs over the code, just removing some of the abstractions used by the programmer. -These abstractions are typically the \# statements. Such as \#include or \#define. +These abstractions are typically the \# statements, such as \texttt{\#include} or \texttt{\#define}. -When the preprocessor runs the \#include line, it opens the file being included and parses the contents of the file being included as if it has been typed out in full right there where it's being included. +When the preprocessor runs the \texttt{\#include} line, it opens the file being included and parses its contents as if they had been typed out in full right there where it is being included. There is nothing magical or hidden about an include. It simply inserts the contents of the file being included wherever the include line is placed. It's perfectly common to have many layers of \#including. A C file includes a header file which includes another header file which includes another header etc. The preprocessor resolves all of these includes.\\ The two types of include lines, whether using the double quotes or angle brackets, determine where the compiler goes looking for the included file. diff --git a/tex/nvic.tex b/tex/nvic.tex index 5851115..575f49e 100644 --- a/tex/nvic.tex +++ b/tex/nvic.tex @@ -3,7 +3,7 @@ \chapter{Nested Vectored Interrupt Controller} In order for a peripheral to cause an interrupt, the interrupt request signal must first pass through the NVIC. This means that there is a central block which is responsible for managing the interrupts in the microcontroller. -The two main aspect of functionality which the NVIC provides in terms of interrupt management are now discussed. +The two main aspects of functionality which the NVIC provides in terms of interrupt management will now be discussed. \section{Interrupt Masking} In this context, masking refers to blocking or preventing interrupts. diff --git a/tex/peripherals.tex b/tex/peripherals.tex index df3d86a..abf6067 100644 --- a/tex/peripherals.tex +++ b/tex/peripherals.tex @@ -9,13 +9,14 @@ \section{Internal Peripherals} \centering \includepdf[pages={11}]{./STM32F051_datasheet.pdf} } + In order for the CPU to interface with them, each peripheral has a block of memory associated with it. Recall the address space of the microcontroller as shown in \autoref{fig:memory_map}. The block called \emph{peripherals} running from address 0x4000 0000 to 0x4800 17FF is the range of addresses which is available to have peripherals associated with it. The full memory map can be seen in Figure 2 of the Reference Manual. Out of that large peripherals block of memory, each peripheral has a specific block of memory associated with it. The starting and ending address for each peripheral in the microcontroller can be seen in Table 1 of the Reference Manual. Note how the vast majority of the peripherals address space is unimplemented (or "reserved"). This allows there to be lots of space for expansion: fancier microcontrollers can have more peripherals and make use of this unimplemented address space. -Inside each block of memory assigned to a specific peripheral is further sub-divisions of the block into \emph{registers}.
Registers are blocks of memory (typically one word big on our processor) which provide a specific, well defined element of functionality, typically configuring how the peripherals works or providing some status information about the peripheral. The CPU is able to write data to a register to configure the peripheral or read data from a register to get information about the peripheral. Sometimes a register simply holds a number (for example: for use in a counter) but more frequently each individual bit in a register as a specific meaning. For example, a bit can be set high to enable some sort of functionality or set low to disable some functionality. +Each block of memory assigned to a specific peripheral is further sub-divided into \emph{registers}. Registers are blocks of memory (typically one word big on our processor) which provide a specific, well defined element of functionality, typically configuring how the peripheral works or providing some status information about the peripheral. The CPU is able to write data to a register to configure the peripheral or read data from a register to get information about the peripheral. Sometimes a register simply holds a number (for example: for use in a counter) but more frequently each individual bit in a register has a specific meaning. For example, a bit can be set high to enable some sort of functionality or set low to disable some functionality. -Each register has an address which must be known when interacting with that register. The way that the address is calculated is using a (base address) plus (offset) system. The base address is the start of the address range for the peripheral as seen in Table 1 of the Reference Manual, and the offset is the number which must be address to the base address to get the effective address of the register. This is a very convenient system as our load and store operations in the CPU also work on a base plus offset system. +Each register has an address which must be known when interacting with that register. The way that the address is calculated is using a "base address" plus "offset" system. The base address is the start of the address range for the peripheral as seen in Table 1 of the Reference Manual, and the offset is the number which must be added to the base address to get the effective address of the register. This is a very convenient system as our load and store operations in the CPU also work on a base plus offset system. A description of what each register does (and indeed what each bit in the register does) as well as the offset for that specific register can be found at the end of the chapter of the Reference Manual which deals with the peripheral (or class of peripherals) which you're trying to interact with. diff --git a/tex/stack.tex b/tex/stack.tex index 91f86da..430b22a 100644 --- a/tex/stack.tex +++ b/tex/stack.tex @@ -1,7 +1,7 @@ \chapter{Stack} -A stack is a concept. The concept is a data structure which implements a Last In / First Out queue. It has two interfaces namely: +A stack is a concept. The concept is a data structure which implements a Last In / First Out (LIFO) queue. It has two interfaces, namely: \begin{description} - \item[PUSH:] Take a value and places it at the top of the stack, on top of whatever already exists in the stack. + \item[PUSH:] Takes a value and places it at the top of the stack, on top of whatever already exists in the stack. \item[POP:] Removes the top element from the stack and puts it into a register.
The next element down then becomes the top of the stack. \end{description} @@ -16,12 +16,12 @@ \chapter{Stack} \section{Stack Pointer} Clearly a well implemented stack is highly beneficial to a system. In order to implement the stack, one of our registers, R13, is assigned the special job of being the stack pointer (SP). The purpose of the stack pointer is to point to (hold the address of) the item most recently placed on the stack. In that way it keeps track of the stack. Typically a stack is implemented starting at the end of RAM (highest address) and working its way down RAM. Hence, the SP should be initialised pointing to the end of RAM. -Well, that's not quite true! As discussed in section 2.1.2 of the Programming Manual, the order of operations for a stack push is to first decrement the pointer and then place the data at the new address pointed to by the SP. That means that if we want to place our first word pushed onto the stack, the SP must be initialised to point to one word AFTER the end of RAM. +Well, that's not quite true! As discussed in section 2.1.2 of the Programming Manual, the order of operations for a stack push is to first decrement the pointer and then place the data at the new address pointed to by the SP. That means that for the first word pushed onto the stack to land at the last word of RAM, the SP must be initialised to point to one word AFTER the end of RAM. For example, if the last address of RAM is 0x2000 1FFF, the SP must be initialised to hold the value 0x2000 2000. -The reason we want to start the stack right at the end of RAM is to allow it as much space as possible to grow. Typically computer systems have another data structure called a heap with starts at the beginning of RAM and grows upwards. These data structures should be as far away from each other as possible to prevent stack overflow, when the stack and the heap collide. Stack/Heap collision is about the worst think that can happen to a program. This is why lots of RAM is good: the more RAM we have the more data we can store before collision happens. +The reason we want to start the stack right at the end of RAM is to allow it as much space as possible to grow. Typically computer systems have another data structure called a heap which starts at the beginning of RAM and grows upwards. These data structures should be as far away from each other as possible to prevent stack overflow, which is when the stack and the heap collide. Stack/Heap collision is about the worst thing that can happen to a program. This is why lots of RAM is good: the more RAM we have the more data we can store before collision happens. \section{Stack Access Instructions} -The two instructions which give direct access to the top of the stack are the PUSH and POP instructions. Both of these instructions take something called a register list as an argument. This is sort of an array of registers, enclosed in curly brackets such as \{R0, R2, R5\}. This is very powerful as it allows us to push or pop multiple registers at one! +The two instructions which give direct access to the top of the stack are the \texttt{PUSH} and \texttt{POP} instructions. Both of these instructions take something called a register list as an argument. This is sort of an array of registers, enclosed in curly brackets such as \texttt{\{R0, R2, R5\}}. This is very powerful as it allows us to push or pop multiple registers at once! Seeing as the SP is a CPU register like any other, you can also use it for load/store operations enabling the random access of any element on the stack.
For example, to increase the \nth{5} last element on the stack by 42 without touching any of the other elements you could do: \begin{lstlisting}[fontadjust=true,frame=trBL] diff --git a/tex/subroutines.tex b/tex/subroutines.tex index 04dceec..a581cb0 100644 --- a/tex/subroutines.tex +++ b/tex/subroutines.tex @@ -1,7 +1,7 @@ \chapter{Subroutines} It would be very useful to have the ability to branch to a label, execute a block of code and then return back to where the branch was taken from. A block of code which is branched to and returned from in this way is called a subroutine. Subroutines are a very useful concept as they allow us to write a single block of code and then re-use it multiple times. -If we did not have subroutines we would have to duplicate code whenever we wanted to make use of the functionality provided by the code. +If we did not have subroutines we would have to duplicate code whenever we wanted to make use of the functionality provided by that code. This causes unnecessary use (wasting) of flash memory. Furthermore, without subroutines, the job of maintaining the code would be very difficult because if you want to adjust something in that block of code then you would have to make the adjustments in multiple places in your source file - wherever the block of code exists. By having subroutines, the code only occupies space in memory once and alterations to it only have to happen in one place. @@ -9,7 +9,7 @@ \chapter{Subroutines} In order to implement this subroutine concept the CPU needs the ability to store the return address somewhere when a subroutine is branched to. Subroutines are so useful that an entire CPU register is dedicated to the purpose of storing return addresses for subroutines. That is R14, otherwise known as the Link Register (LR). Subroutines work by storing the address of the next instruction to be executed in the LR and then branching to the label of the subroutine. -As you'll remember, this is the same as putting the address of the instruction which you want to execute into the PC. As usual, instruction will then be executed sequentially from that point. +As you'll remember, this is the same as putting the address of the instruction which you want to execute into the PC. As usual, instructions will then be executed sequentially from that point. In order to get the branch instruction to store the address of the next instruction in the LR, the following instruction format is used. \begin{lstlisting}[fontadjust=true,frame=trBL] @@ -17,9 +17,9 @@ \chapter{Subroutines} \end{lstlisting} -When you want to "return" from the subroutine back to the location in the code where the subroutine was called you need move the data in the LR into the PC. This causes the PC to go back to pointing to the instruction which follows the one that called the subroutine. +When you want to "return" from the subroutine to the location in the code where the subroutine was called you need to move the data in the LR into the PC. This causes the PC to go back to pointing to the instruction which follows the one that called the subroutine. -In order to load the contents of an arbitrary register into the PC, the Branch Indirect (BX) instruction is used. +In order to load the contents of an arbitrary register into the PC, the Branch Indirect (\texttt{BX}) instruction is used.
The general format of this instruction is: \begin{lstlisting}[fontadjust=true,frame=trBL] BX Rn @ where Rn is some register diff --git a/tex/system_overview.tex b/tex/system_overview.tex index 4cbe035..32a95a9 100644 --- a/tex/system_overview.tex +++ b/tex/system_overview.tex @@ -1,14 +1,7 @@ \chapter{System Overview} -\begin{figure}[t] - \centering - \includegraphics[width=\textwidth]{./fig/programmers_model_v0.pdf} - \caption{The most simplified view of the internals of the STM32F051} - \label{fig:prog_mod_v0} -\end{figure} - \section{What is a Microcontroller?} -The microcontroller can be understood by comparing it to something you are already very familiar with: the computer. Both a microcontroller and a computer can be modeled as a black box which takes in data and instructions, performs processing, and provides output. +The microcontroller can be understood by comparing it to something you are already very familiar with: the computer. Both a microcontroller and a computer can be modelled as a black box which takes in data and instructions, performs processing, and provides output. In order to do this, a micro has some of the same internals as a computer, shown graphically in \autoref{fig:prog_mod_v0} and discussed now: \begin{itemize} \item CPU: The section of the microcontroller which does the processing. It executes instructions which allow it to do arithmetic and logic operations, amongst other operations. @@ -30,13 +23,20 @@ \section{What is a Microcontroller?} \label{table:specs_comp} \end{table} -The terms micro\textit{controller} and micro\textit{processor} are different and should not be used interchangeably. A micro\textit{processor} is a chip which is able to perform computation, but requires external memory and peripherals to function. A micro\textit{controller} has the memory and peripherals built into it, allowing it to be fully independent. Furthermore, the interface in and out of a microprocessor is mainly just an address and data bus. In a microcontroller, these data and address busses are internal to the device. The interfaces in and out of a microcontroller are configurable to be a wide variety of communication standards. This self-contained nature and ability to deal with a wide variety of signals allows a microcontroller to (as the name suggest) be embedded in a larger system and perform control and monitoring functions.\\ +The terms micro\textit{controller} and micro\textit{processor} are different and should not be used interchangeably. A micro\textit{processor} is a chip which is able to perform computation, but requires external memory and peripherals to function. A micro\textit{controller} has the memory and peripherals built into it, allowing it to be fully independent. Furthermore, the interface in and out of a microprocessor is mainly just an address and data bus. In a microcontroller, these data and address busses are internal to the device. The interfaces in and out of a microcontroller are configurable to be a wide variety of communication standards. This self-contained nature and ability to deal with a wide variety of signals allows a microcontroller to (as the name suggests) be embedded in a larger system and perform control and monitoring functions.\\ The micro we will be using is the STM32F051C6. It is manufactured by ST Microelectronics, but has an ARM Cortex-M0 CPU. ARM designed the CPU (specified how the transistors connect together).
ST then takes this CPU design, adds it to their design for all of the other bits of the micro (flash, RAM, ports and much much more) and then produces the chip. +\begin{figure} + \centering + \includegraphics[width=\textwidth]{./fig/programmers_model_v0.pdf} + \caption{The most simplified view of the internals of the STM32F051.} + \label{fig:prog_mod_v0} +\end{figure} + \subsection{Development board block diagram} The development board consists of modules which connect to the microcontroller. Most of these modules are optional in that they are not required for the microcontroller to run. We will develop code later in the course to interface with some of these modules. Those which are not optional are the voltage regulator and the debugger. -Following is a brief discussion of the purpose of each of the dev board modules (peripherals). You are not expected to know what many of these terms mean yet; this exists for you to refer back to later when you do encounter these perihperals. +The following is a brief discussion of the purpose of each of the dev board modules (peripherals). You are not expected to know what many of these terms mean yet; this exists for you to refer to later when you do encounter these peripherals. \begin{figure} \centering @@ -50,8 +50,8 @@ \subsection{Development board block diagram} \item STM32F051C6: This is the target microcontroller. It is connected to everything else on the board and it is where the code which we develop will execute. \item Debugger: this is essentially another microcontroller running special code which allows it to pass information between a computer and the target microcontroller. The interface to the computer is a USB connection, and the interface to the target is a protocol called Serial Wire Debug (SWD) which is similar to JTAG. The specific type of debugger which we have is an ST-Link. \item Regulator: A MCP1702-33/T0 chip. This converts the 5 V provided by the USB port into 3.3 V suitable for running most of the circuitry on the board. - \item LEDs: One byte of LEDs, active high connected to the lower byte of port B. - \item Push buttons: Active low push buttons connected to the lower nibble of port A. + \item LEDs: Eight LEDs used as a binary representation of one byte of data, active high, connected to the lower byte of port B. + \item Push buttons: Active low push buttons connected to the lower nibble (4 bits) of port A. \item Pots: 2 x 10K (or thereabouts) potentiometers connected to PA5 and PA6. \item LCD Screen: A 16x2 screen connected to the micro in 4-bit mode. Used to display text. \item LCD contrast pot: The output of this potentiometer connects to the contrast pin of the LCD screen, hence allowing contrast adjustment. @@ -68,9 +68,9 @@ \subsection{Development board block diagram} For now, we will forget about all of the other modules on the dev board and consider our system to be a computer talking to a debugger talking to a target micro, as shown in \autoref{fig:debugger_to_micro}. This is the most basic system which must be understood to allow us to load code onto the target microcontroller.
-\begin{figure}[t] +\begin{figure}[h] \includegraphics[width=\textwidth]{./fig/debugger_to_micro.pdf} - \caption{Highly simplified diagram showing how micro and computer communicate} + \caption{Highly simplified diagram showing how micro and computer communicate.} \label{fig:debugger_to_micro} \end{figure} diff --git a/tex/timer.tex b/tex/timer.tex index 1b17f4c..ce9562e 100644 --- a/tex/timer.tex +++ b/tex/timer.tex @@ -1,5 +1,5 @@ \chapter{Timers} -Before explaining how timers work, let's try to understand why we would need them. +Before explaining how timers work, let's try to understand why we would need them.\\ Up until now, when we have wanted events to occur some human-scale time (hundreds or thousands of milliseconds) apart, we have been using long but finite loops to create delays. The delay loops have been incrementing or decrementing a number in a CPU register many thousands of times; a task which takes an appreciable amount of time to complete. This can be considered a waste of CPU resources. Instead of getting the CPU to do some useful work (controlling a system or monitoring some external signals) it is simply modifying an internal register for a long time. @@ -40,8 +40,8 @@ \subsection{Clock from RCC} \subsection{Control} The control block configures how the timer will work. This includes such aspects as: \begin{itemize} -\item whether the TIMxCLK line is allowed to pass through the control block to the next blocks (counter enabled) or of the TIMxCLK line is prevented from going further (counter disables). +\item whether the TIMxCLK line is allowed to pass through the control block to the next blocks (counter enabled) or if the TIMxCLK line is prevented from going further (counter disabled). -\item whether the timer will generate an interrupt when an overflow even happens +\item whether the timer will generate an interrupt when an overflow event happens \item whether or not registers are shadowed \item and many more \end{itemize} @@ -51,7 +51,7 @@ \subsection{Control} \subsection{Prescaler} A prescaler is essentially a frequency divider. A clock line with a certain frequency enters the prescaler. The clock line exiting the prescaler has a frequency equal to the input frequency divided by some factor. -Prescalers are very common and useful in digital systems so it is worth discussing them a bit. A prescaler is essentially characterised by the range of values which it is able to divide by. Simple prescalers (such as those contained in the RCC block) are only able to divide by a select few powers of 2 (example, 1, 2, 8, 32, 128). Other prescalers (such as the one in our timer block) are able to divide by any integer in a certain range. Our timer PSC block has 16 bits worth of configurable prescaling so it can divide by anything from 0 to $2^{16} - 1$ = 65535. This prescaler which can divide by arbitrary integers is obviously more powerful than one with a very small number of values, but require more transistors to manufacture (ie: cost more). +Prescalers are very common and useful in digital systems so it is worth discussing them a bit. A prescaler is essentially characterised by the range of values which it is able to divide by. Simple prescalers (such as those contained in the RCC block) are only able to divide by a select few powers of 2 (for example 1, 2, 8, 32, 128). Other prescalers (such as the one in our timer block) are able to divide by any integer in a certain range.
Our timer PSC block has 16 bits worth of configurable prescaling, so it can be programmed with any value from 0 to $2^{16} - 1 = 65535$. This prescaler which can divide by arbitrary integers is obviously more powerful than one with a very small number of possible values, but requires more transistors to manufacture (ie: costs more). There is a slight additional complication: the value which the prescaler actually divides by is equal to the value which it has been programmed with \emph{plus 1}. The reason for this is to avoid the division-by-zero case when the prescaler is programmed with 0. In other words, if you place the value 0 into the prescaler it will actually divide by 1 (leave the signal unchanged). @@ -76,8 +76,8 @@ \subsection{Acknowledging Interrupts} The CPU does not automatically inform the timer when its interrupt is being handled. Instead, this must be done by your code. To acknowledge the interrupt, your interrupt handler code must write a 0 to the UIF bit in the TIMx\_SR. -Only once this happens with the peripheral clear its interrupt request. -If you do not acknowledge the interrupt, the ISR will be run again immediately after it finishes ad infinitum. +Only once this happens will the peripheral clear its interrupt request. +If you do not acknowledge the interrupt, the ISR will be run again immediately after it finishes, \emph{ad infinitum}. It is advised to acknowledge the interrupt as the first action of the ISR.
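As a minimal C sketch of this acknowledgement pattern (assuming the CMSIS-style definitions from \texttt{stm32f0xx.h}, that TIM6 has already been configured with its update interrupt enabled, and that the handler name matches the one used in your startup code):
\begin{lstlisting}[language=C]
#include "stm32f0xx.h"  // CMSIS-style register definitions (assumed)

// Update-event handler for TIM6. The name TIM6_DAC_IRQHandler is the usual
// one in STM32F0 startup files, but check your own startup code.
void TIM6_DAC_IRQHandler(void)
{
    TIM6->SR &= ~TIM_SR_UIF;  // acknowledge first: write a 0 to the UIF bit

    // ... the periodic work of the ISR goes here ...
}
\end{lstlisting}
Leaving UIF set would keep the timer's interrupt request asserted, which is why the ISR would otherwise re-run immediately after returning.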