Syntax
EzASM syntax is fairly straightforward. Every line starts with an instruction, which tells the program what to do. The instruction is normally followed by an output argument (if applicable) and then one or more input arguments. Putting this together, you get something like the following:
- instruction output input1 input2 for a binary operator
- instruction output input for most unary operators
- instruction input-output for certain unary operators
A good example of an input-output type would be a register dereference (see Address Dereference section).
A list of instructions and their arguments can be found by following this link.
An output or input could be, for example, a register. Registers are fast and easy data storage for operating on data; however, they are limited in quantity. There are 40 registers which are meant for you to read from and write to:
- $t0 $t1 $t2 ... $t9 are the 10 temporary registers. These are meant for data which you are currently working with.
- $s0 $s1 $s2 ... $s9 are the 10 saved registers. These are meant for saving data between function calls which modify temporary data.
- $ft0 $ft1 $ft2 ... $ft9 are the 10 temporary floating point registers, similar to their $t0 - $t9 counterparts but for floats.
- $fs0 $fs1 $fs2 ... $fs9 are the 10 saved floating point registers, similar to their $s0 - $s9 counterparts but for floats.
The call of sub $t0 $t1 $t2 would subtract the value stored in $t2 from the value stored in $t1 and store the result into $t0.
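The instruction output input1 input2 shape can be modeled in a few lines of Python. This is only a sketch for illustration, not how EzASM itself is implemented: the execute function, the OPS table, and the use of a dict as a register file are all assumptions made for this example.

```python
# Hypothetical model: the register file is a dict from register name to value.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def execute(line, registers):
    """Run one 'instruction output input1 input2' line on the register dict."""
    instruction, output, input1, input2 = line.split()
    # Both inputs are read first, then the output register is overwritten.
    registers[output] = OPS[instruction](registers[input1], registers[input2])
    return registers


registers = {"$t0": 0, "$t1": 10, "$t2": 3}
execute("sub $t0 $t1 $t2", registers)
print(registers["$t0"])  # 7, i.e. $t1 - $t2
```

Note the operand order: the output comes first, and for sub the second input is subtracted from the first.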
Immediate values are hard-coded numbers which are loaded into the program. They are typed out the same way a number is normally typed out when programming and can only be used as an input to an instruction. It would not make sense to try to store data to a constant in your program.
The call of add $t1 14 50 would add the two immediate values 14 and 50 and store the result into the register $t1.
There are also other ways to denote immediate values: hexadecimal and binary support is included in this language. Hexadecimal numbers are denoted by beginning the immediate with 0x and can contain the digits 0 through 9 and a through f; for example, the hexadecimal immediate 0xFF would be equal to the decimal immediate 255. Likewise, binary immediates begin with 0b and can only contain the digits 0 and 1; for example, the binary immediate 0b1001 is equal to the decimal immediate 9. Sometimes you may find that certain numbers you are working with are best denoted in hexadecimal or binary: you can use this feature to do so.
The call of mul $t2 0xF 0b10 would multiply the two immediate values 15 and 2 and store the result, 30, into the register $t2.
Immediates can also be negative: -10 is a valid immediate, as are -0xFF and -0b1001.
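To check what a decimal, hexadecimal, or binary immediate is worth, the same notation can be evaluated in Python. The helper name parse_int_immediate is hypothetical, and the sketch relies on Python's own literal rules, which may differ from EzASM's parser in edge cases such as leading zeros.

```python
def parse_int_immediate(token):
    """Convert a decimal, 0x-prefixed, or 0b-prefixed token to an integer."""
    # Base 0 asks int() to infer the base from the prefix; a leading '-' is allowed.
    return int(token, 0)


print(parse_int_immediate("0xFF"))      # 255
print(parse_int_immediate("0b1001"))    # 9
print(parse_int_immediate("-0xFF"))     # -255
print(parse_int_immediate("0xF") * parse_int_immediate("0b10"))  # 30
```

The last line mirrors mul $t2 0xF 0b10: the operands are 15 and 2, so the product is 30.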
You can also use immediate floating point values with the other features of immediates: -10.3 is a valid floating point immediate, as are 0x.d5 and 0b101.01. It is important to note, however, that floating point and integer numbers cannot normally be used together in arithmetic. You will need to convert your floating point numbers to longs, or your longs to floating point numbers, to do arithmetic with them. This is because the two types of data are stored fundamentally differently in memory, and interpreting float memory as an integer or vice versa will usually not give a meaningful number.
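The sketch below works out what fractional hexadecimal and binary immediates are worth, and shows why reinterpreting float memory as an integer is not meaningful. The function names are hypothetical, and the 64-bit IEEE 754 layout used for the reinterpretation is an assumption; the text above does not state how EzASM stores floats.

```python
import struct


def parse_float_immediate(token):
    """Evaluate a decimal, 0x, or 0b immediate that may contain a fraction."""
    negative = token.startswith("-")
    if negative:
        token = token[1:]
    base = 10
    if token[:2].lower() == "0x":
        base, token = 16, token[2:]
    elif token[:2].lower() == "0b":
        base, token = 2, token[2:]
    whole, _, fraction = token.partition(".")
    value = float(int(whole, base)) if whole else 0.0
    # Each digit after the point is worth digit / base**position.
    for position, digit in enumerate(fraction, start=1):
        value += int(digit, base) / base ** position
    return -value if negative else value


def float_bits_as_int(number):
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer (assumed layout)."""
    return struct.unpack("<q", struct.pack("<d", number))[0]


print(parse_float_immediate("0x.d5"))     # 0.83203125 (13/16 + 5/256)
print(parse_float_immediate("0b101.01"))  # 5.25
print(float_bits_as_int(1.0))             # 4607182418800017408, not 1
```

Under that assumed layout, the float 1.0 read back as an integer is a huge unrelated number, which is why an explicit conversion is needed before mixing the two types.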
Character literals, which are a single character enclosed in single quotes (e.g., 'c'), are also an input method in EzASM. The character is converted into its equivalent ASCII value, which is then used as an integer.
The call of add $t3 '0' 3 will add the number 3 to the ASCII value of '0', which is 48 in decimal. This results in a value of 51 being stored into $t3. When interpreted as ASCII, 51 represents the character '3'.
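The same arithmetic can be checked in Python, where ord and chr convert between a character and its code. The helper char_literal_value is hypothetical and handles only a single plain character; escape sequences are not covered because the text above does not describe them.

```python
def char_literal_value(literal):
    """Return the integer code of a literal written like '0' (quotes included)."""
    if len(literal) != 3 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("expected exactly one character in single quotes")
    return ord(literal[1])


result = char_literal_value("'0'") + 3   # mirrors: add $t3 '0' 3
print(result)       # 51
print(chr(result))  # 3
```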
String literals, which are any number of characters enclosed in double quotes (e.g., "Hello, world!"), are an input method in EzASM. Upon parsing the string, the language will allocate memory outside of the typical stack and heap space in which to store the array of characters representing the string. The value given to your program will be the address of this null-terminated string.
In the case of the string literal "Hello, World!", a block of memory of 14 words will be allocated like the following:
| 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11  | 12  | 13  | 14   |
| 'H' | 'e' | 'l' | 'l' | 'o' | ',' | ' ' | 'W' | 'o' | 'r' | 'l' | 'd' | '!' | '\0' |
and the immediate value given to the program will be the address of the starting point of the string; in this scenario it would be wherever 'H' is stored in memory. Note that the 14th character of the string is allocated as a null byte (represented as '\0'), which is just a byte full of zeroes.
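The layout in the table can be reproduced with a short Python sketch. The function layout_string is hypothetical and models each memory cell as one list entry; it says nothing about where EzASM actually places the block.

```python
def layout_string(text):
    """Return the cell values for a null-terminated copy of text."""
    # One cell per character, followed by a terminating zero.
    return [ord(character) for character in text] + [0]


cells = layout_string("Hello, World!")
print(len(cells))  # 14: thirteen characters plus the terminator
print(cells[0])    # 72, the code for 'H'
print(cells[-1])   # 0, the '\0' terminator
```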
Dereferencing is the operation performed by the * operator on pointers in the C and C++ programming languages. In the same way that one would use *myIntPointer to retrieve the value stored at the address inside myIntPointer, programmers can use ($register) to find the value stored at the address held in that register (assuming that the address is valid). One can also use immediate offsets to shift the address being dereferenced:
The call of add $t5 1 4($sp) will add 4 to the address stored in $sp and then dereference the resulting address to get a value. The program will then add 1 to that value and store the result into $t5.
Address dereferences can also be used as outputs:
The call of add ($sp) 3 7 will add the values 3 and 7 to get 10 and then store that value at the address pointed to by the $sp register.
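Both uses of a dereference, as an input and as an output, can be modeled in Python with one dict for registers and one for memory. Every name and every starting value here is made up for illustration; in particular, the address 100 and the stored value 41 are arbitrary.

```python
def load(registers, memory, offset, register):
    """Model offset($register) as an input: read memory at register value + offset."""
    return memory[registers[register] + offset]


def store(registers, memory, offset, register, value):
    """Model offset($register) as an output: write memory at register value + offset."""
    memory[registers[register] + offset] = value


registers = {"$sp": 100, "$t5": 0}
memory = {100: 0, 104: 41}

# add $t5 1 4($sp): read the value at address $sp + 4, then add 1 to it
registers["$t5"] = 1 + load(registers, memory, 4, "$sp")
print(registers["$t5"])  # 42

# add ($sp) 3 7: store 3 + 7 at the address held in $sp
store(registers, memory, 0, "$sp", 3 + 7)
print(memory[100])       # 10
```

In both cases $sp itself is unchanged; only the memory cell it points at, or the destination register, is modified.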
Labels are a way to name a part of your code. A single line with an alphanumeric token ending with a colon (:) indicates a label. The line function1: does not cause any code to be executed itself; however, it acts as a way for the program to be able to start executing code at that line.
function1:
add $t0 $t0 1
jump function1
The program above will act as an infinite loop: the first line, function1:, is read as the label, the second line, add $t0 $t0 1, increments the value in $t0 by 1, and then the third line jumps the program to the line following the label function1. It does so by changing the program counter (the number stored in register $pc) to the line number of the label. Upon execution, this code will forever increment $t0 by 1.
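The interaction between labels and the program counter can be sketched as a tiny Python interpreter. This is an illustrative model rather than EzASM's implementation: the run function, the step limit, and the choice to count a label line as one step are assumptions, and only add and jump are handled.

```python
def run(source, max_steps):
    """Execute a program made of labels, add, and jump for at most max_steps lines."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    # Map each label name to the index of the line it sits on.
    labels = {line[:-1]: index for index, line in enumerate(lines) if line.endswith(":")}
    registers = {"$t0": 0, "$pc": 0}

    def value(operand):
        # Registers start with '$'; anything else is treated as an integer immediate.
        return registers[operand] if operand.startswith("$") else int(operand, 0)

    for _ in range(max_steps):
        if registers["$pc"] >= len(lines):
            break
        parts = lines[registers["$pc"]].split()
        registers["$pc"] += 1
        if parts[0].endswith(":"):
            continue  # a label executes nothing
        if parts[0] == "jump":
            registers["$pc"] = labels[parts[1]]
        elif parts[0] == "add":
            registers[parts[1]] = value(parts[2]) + value(parts[3])
    return registers


program = """
function1:
add $t0 $t0 1
jump function1
"""
print(run(program, 9)["$t0"])  # 3: three trips around the loop
```

Without the step limit the loop would never end, which matches the behavior described above.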
Comments are strings of characters in a source file which are not executed as code. Any character after a # delimiter on a line will be treated by the program as if it were not there. One can have entire lines as comments, for example # this is a comment, or have a comment after an instruction, for example add $t0 $t0 3 # adds 3 to the value of $t0. Comments are a great way to document one's code so that, upon revisiting it, the programmer or their colleague can understand what is being done.