Just some General questions regarding the CW #2

Open
paul-vangerow opened this issue Mar 3, 2022 · 3 comments

Comments

@paul-vangerow

Sorry if some of this has already been explained, but I'm just trying to wrap my head around some of the bash stuff and the general gist of how we can do the lab.

Right now I'm just trying to deal with the front end and am using Lab-2 as a reference.

  1. The labs have run the compiler/lexer/parser in a few different ways, but in general, how does the lexer generated from the .flex file get its input? I've done a bit of reading and looked through the lectures again, but all I understand so far is that lex generates a scanner. In the lecture you demonstrate this by typing text while the program waits for input, but in the bash scripts you sometimes pass a whole file in and other times use an input stream, so I'm confused about how we should actually provide our lexer with input.

  2. Much the same question for the yacc files: how is the parser called, or does it start automatically? How does it interact with the lexer? The function that calls the lexer seems to be defined, but I don't see where the parser itself is started.

In the lectures you told us to ignore the generated code, but I don't see how we can accomplish anything without actually understanding what is happening - however documentation on this is a jumbled mess which can make things a lot harder to understand.

@ymherklotz
Member

One thing that might help is to try to understand the main function in lab 1 and lab 2.

For lab 1, you don't have a parser, so you have to manually call the lexer from the main function.

TokenType type=(TokenType)yylex();

This means that at every iteration the lexer will lex a token, and return the token type.

For lab 2, you are using bison. By default, bison assumes it can call the lexer through the yylex function. It therefore plugs directly into flex, and you don't have to do anything else to combine them. However, with bison you have to write the top-level function yourself; it runs the parser and assigns the top node of the AST correctly. In lab 2 this is parseAST, which calls the yyparse function that bison defines for you.

@paul-vangerow
Author

Ahhh okay, I didn't realise that bison automatically calls the lexer.

The only real question now is how input is passed to the lexer: as an argument (a file name) or as an input stream?

@ymherklotz
Member

ymherklotz commented Mar 3, 2022

Documentation for flex and bison is normally quite extensive. The following is how I found the answer to the question.

This issue is about passing input to flex, so you don't need to look at the bison documentation.

Then, you can start by opening:

man flex

This should always be the default way to get a quick manual for any Linux command. In this case it doesn't have the info you want (it's quite short), but at the end it tells you that you can run:

info flex

This shows the whole documentation. It has a "Simple Examples" section, and the last example uses a variable called yyin, which seems to do what you want.

int main( int argc, char **argv ) {
    ++argv, --argc;  /* skip over program name */
    if ( argc > 0 )
        yyin = fopen( argv[0], "r" );
    else
        yyin = stdin;

    yylex();
}

Then, you can use Ctrl-S to search the document for yyin, and you can find that in Section 9, there is a more detailed explanation of what yyin does.

Whenever 'yylex()' is called, it scans tokens from the global input file 'yyin' (which defaults to stdin). It continues until it either reaches an end-of-file (at which point it returns the value 0) or one of its actions executes a 'return' statement.
