Skip to content

Compilers project - CMP2024 - Computer engineering - Cairo University

Notifications You must be signed in to change notification settings

DoniaGameel/Expresso

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

95 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Expresso 💻☕

Expresso Logo

Introduction

It's a C liked programming language

Sample program:

const int a = 5;
string b = "hello";
if (a == 5) {
    print (a);
}
else {
    if (b == "hello") {
        print (b);
    }
}

Run Steps

  • yacc -d parser.y: create y.tab.h and y.tab.c

  • lex lex.l: create lex.yy.c

  • gcc -g lex.yy.c y.tab.c -o main: create main

  • ./main example.txt: run main

    To run another example change "example.txt" with the example path

Tools and Technologies

Lex and Yacc are tools commonly used in the field of computer science for constructing compilers and interpreters. They are used to generate lexical analyzers and parsers, which are critical components in the process of translating high-level programming languages into machine-readable code.

  1. Lex: is a tool for generating lexical analyzers, also known as lexers or scanners. A lexical analyzer processes an input stream of characters (such as source code) and divides it into a sequence of tokens. Tokens are meaningful sequences of characters, such as keywords, operators, identifiers, and literals.
  2. Yacc: Yacc stands for "Yet Another Compiler Compiler." It is a tool used to generate parsers, which analyze the structure of the token sequence produced by the lexer to ensure it conforms to the grammatical rules of the programming language.

Workflow

  1. Define Tokens with Lex : Create a .l file specifying the tokens and their patterns.
  2. Define Grammar with Yacc : Create a .y file specifying the grammar rules and associated actions.
  3. Compile and Link : Use Lex to generate a C source file for the lexer, and Yacc to generate a C source file for the parser. Compile and link these with a C compiler.

Tokens

Token Regex Description
FLOAT_VAL [0-9]*"."[0-9]+ Float Numbers.
INTEGER_VAL 0 | ([1-9][0-9]*) Integer Numbers.
CHAR_VAL '[^\\']' One character inside sinqle quotes.
STRING_VAL \"([^\"\\;]|\\.)*\" string between double quotes.
TRUE_VAL "true" True value.
FALSE_VAL "false" False value.
skip this token \/\*([^*]|\*+[^/])*\*+\/ Multi-line comment.
skip this token \/\/.* In-line comment.
NEW_LINE "\n" new line.
IDENTIFIER [a-zA-Z_][a-zA-Z0-9_]* Identifiers (variables and contants names).

Syntax

Data Types

Tha language supports the following data types:

  • Integer
  • Float
  • Boolean
  • String
  • Character

It supports modifiers like const as well.

const int a = 10;
int b = 20;
float c = 10.5;
bool d = true;
string e = "Hello World";
char f = 'c';

Operators

The language supports the common operators.

// Arithmetic operators
a = b + c;
a = b - c;
a = b * c;
a = b / c;
a = b % c;
a = b++;
a = ++b;
a = b--;
a = --b;
// Logical operators
a = b && c;
a = b || c;
a = !b;
// Relational operators
a = b == c;
a = b != c;
a = b > c;
a = b >= c;
a = b < c;
a = b <= c;

Conditional Statements

The language supports the if-else, and switch-case statements.

int a = 10;
int b = 100;
// if statement
if (a <= 10) {
    print(a);
}
else {
    if (b > 100) {
        print(b);
    }
    else {
        print("else");
    }
}

// switch-case statement
switch (a) {
    case 1: 
        print("1");
        break;
    
    case 2: 
        print("2");
        break;
    
    case 3: 
        print("3");
        break;
    
    default: 
        print("default");
        break;
}

switch (a) {
    case (1): 
        print("1");
        break;
    
    case (2): 
        print("2");
        break;
    
    case (3): 
        print("3");
        break;
}

Loops

The language supports the while, for, and repeat-until loops.

// while loop
int a = 0;
while (a < 20) {
    print(a);
    a = a + 1;
}
print(a);
while (a < 20) {
    if (a == 10) {
        print(a);
    }
    a = a + 1;
}
// for loop
for (int b=2 ; b<10; b++ ) {
    print(b);
    while (b < 10) {
        if (b == 5) {
            print("hi");
            print(b);
        }
    }
}

// repeat-until loop
int a = 0;
repeat {
    print(a);
    a = a + 1;
    print(a);
} until (a == 1);

Functions

The language supports functions with and without parameters. The language also checks for parameters count and types in the function call

void func_1 (){
    print("func_1");
}
void func_2(int a, int b) {
    print("func_2");
    print(a);
    print(b);
}
func_2(1, 2); // function call

Production Rules

  • program → statements

  • statements → stmt | statements stmt

  • stmt → ';' | datatype IDENTIFIER ';' | datatype IDENTIFIER '=' expr ';' | CONST datatype IDENTIFIER '=' expr ';' | IDENTIFIER '=' expr ';' | PRINT '(' expr ')' ';' | IF '(' expr ')' '{' statements '}' else_stmt | WHILE '(' expr ')' '{' statements '}' | FOR '(' assignment ';' expr ';' IDENTIFIER '=' expr ')' '{' statements '}' | REPEAT '{' statements '}' UNTIL '(' expr ')' ';' | SWITCH '(' switch_identifier ')' '{' case_list '}' | '{' statements'}' | VOID IDENTIFIER '(' parameters_list ')' '{' statements '}' | datatype IDENTIFIER'(' parameters_list ')' '{' statements '}' | function_call | return_stmt

  • return_stmt → RETURN ';' | RETURN expr ';'

  • function_call → IDENTIFIER '(' parameters_list_call ')'

  • else_stmt → /*Empty production*/ | ELSE '{' statements '}'

  • datatype → INTEGER | FLOAT | CHAR | STRING | BOOL

  • assignment → datatype IDENTIFIER '=' expr

  • var_declaration → datatype IDENTIFIER

  • parameters_list → /* Empty production */ | var_declaration | parameters_list ',' var_declaration

  • parameters_list_call → /* Empty production */ | IDENTIFIER | parameters_list_call ',' IDENTIFIER

  • case_list → case_stmt | case_list case_stmt | case_list default_case

  • case_stmt → CASE expr':' statements BREAK ';'

  • default_case → DEFAULT ':' statements BREAK ';'

  • switch_identifier → IDENTIFIER

  • terminals → TRUE_VAL | FALSE_VAL | function_call | IDENTIFIER | INTEGER_VAL | FLOAT_VAL | CHAR_VAL | STRING_VAL

  • expr → | '(' expr ')' | '-' expr | INCR expr | expr INCR | DECR expr | expr DECR | expr '+' expr | expr '-' expr | expr '*' expr | expr '/' expr | expr '^' expr | expr '%' expr | expr LOGICAL_AND expr | expr LOGICAL_OR expr | LOGICAL_NOT expr | expr EQUALS expr | expr NOT_EQUALS expr | | expr LESS_THAN_OR_EQUALS expr | expr GREATER_THAN expr | expr GREATER_THAN_OR_EQUALS expr

Quadruples

Procedures

Quadruples Description ARG1 ARG2
START PROC Declare of a procedure procedure name
ENDPROC End of a procedure procedure name
CALL Calls a procedure, handles all the stuff related to the PC procedure name

Variables

Quadruples Description ARG1 ARG2 RES
PUSH Push to the stack frame Identifier/Expr/parameter
POP Pop from the stack frame Identifier/Expr/parameter

Branching & Jumps

Quadruples Description ARG1 ARG2
JMP Unconditional jump to the label label
JZ Jumps to the label if the result of the last relational operation is true label
JNZ Jumps to the label if the result of the last relational operation is falze label

Arithmetic Operations

Quadruples Description ARG1 ARG2
NEG Get the opposite sign of an expression expression
ADD Add two expressions expression expression
SUB Subtract two expressions expression expression
MUL Multiply two expressions expression expression
DIV Divide two expressions expression expression
MOD Modulus two expressions expression expression
POST_INCR Increment One expression after the operation expression
PRE_INCR Increment One expression before the operation expression
POST_DECR Decrement One expression after the operation expression
PRE_DECR Decrement One expression before the operation expression

Logical Operations

Quadruples Description ARG1 ARG2
LOGICAL_OR Get the logical or of two expressions expression expression
LOGICAL_AND Get the logical or of two expressions expression expression
LOGICAL_NOT Get the logical not of 1 expression expression
EQ Check if two expressions are equal expression expression
NEQ Check if two expressions are not equal expression expression
GT Check if the first expression is greater than the second expression expression
GTQ Check if the first expression is greater than or equal the second expression expression
LT Check if the first expression is less than the second expression expression
LTQ Check if the first expression is less than or equal the second expression expression

Semantic Errors

Semantic Errors

TYPE_MISMATCH

UNDECLARED

UNINITIALIZED

UNUSED

REDECLARED

CONSTANT

OUT_OF_SCOPE

Wrong Number of Parameters In Fumction Call

Wrong Parameters Types In Fumction Call

Examples

Variables

Switch case
For loop

Functions

Releases

No releases published

Packages

No packages published

Languages

  • C 86.5%
  • Yacc 12.3%
  • Other 1.2%