This repository has been archived by the owner on Jun 3, 2021. It is now read-only.

Separate parsing into a library #266

Open
jyn514 opened this issue Feb 10, 2020 · 1 comment

Comments


jyn514 commented Feb 10, 2020

Suggested by @pythondude325. Related to #151.

It would be really cool to have a separate library which serializes and deserializes C source code. Most of the serialization is already done in src/data in various impl Displays. The hard part will be factoring apart the lexer/parser from the rest of the compiler. The syntax part of this is already hard as noted in #151, but separating the preprocessor would also be somewhat difficult: I'd have to duplicate a fair bit of code (lexing tokens mostly) and have a way to pass locations around.

Related facts:

My proposed plan is this:

  • Have a lexer/parser combo. Possibly rewrite these from scratch using logos and LALRPOP. This will do absolutely no semantic checking, only parsing. This will be the library.
  • Before the parser runs, have a preprocessor. Serialize tokens to strings before passing them to the lexer (or possibly don't have tokens at all? Is that feasible?). To allow #if, add an expr() API to the serde parser. To allow keeping track of multiple files, add a metadata field to location:
#[derive(Clone, Copy)]
struct Location<T: Copy> {
    span: Span,
    metadata: T,
}

This allows people who don't care to leave it blank (()) and people who do care to pass a FileId or similar.

  • After the preprocessor runs, the compiler proceeds as normal: analysis -> constant folding -> codegen -> linking.
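To make the `Location<T>` idea from the plan concrete, here is a minimal sketch. `Span`'s fields, `FileId`, and the `bare` and `with_metadata` helpers are hypothetical names used only for illustration; the real types in rcc may look different.

```rust
/// Byte range within one source buffer (hypothetical definition).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: u32,
    pub end: u32,
}

/// A location that is generic over caller-supplied metadata.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Location<T: Copy> {
    pub span: Span,
    pub metadata: T,
}

/// Hypothetical file identifier for callers that track multiple files.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FileId(pub u32);

impl Location<()> {
    /// Callers that don't care about files leave the metadata blank.
    pub fn bare(start: u32, end: u32) -> Self {
        Location { span: Span { start, end }, metadata: () }
    }
}

impl<T: Copy> Location<T> {
    /// Keep the span but attach different metadata, e.g. a `FileId`.
    pub fn with_metadata<U: Copy>(self, metadata: U) -> Location<U> {
        Location { span: self.span, metadata }
    }
}
```

Because `()` is zero-sized, `Location<()>` takes no more space than a bare `Span`, so users who ignore the metadata pay nothing for it.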

Open questions:

  • Should the preprocessor be part of the library? If so, how should we deal with #includes? (devsnek on #lang-dev recommended calling a user-defined function - that would need to be aware of local vs. global includes as well as search paths).
  • Should this be a separate crate or the same rcc crate with codegen behind a feature flag?
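One way the user-defined function for #includes could look, sketched under assumptions: the `IncludeResolver` trait, `IncludeKind`, and the in-memory `MapResolver` are all hypothetical names, not an existing rcc API. The point is only that the callback needs the include form, the including file, and some notion of search paths.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// The two `#include` forms the resolver has to distinguish.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IncludeKind {
    /// `#include "foo.h"`: look next to the including file first.
    Local,
    /// `#include <foo.h>`: look only in the search paths.
    Global,
}

/// Hypothetical callback the library would invoke for every `#include`.
pub trait IncludeResolver {
    /// Returns the resolved path and file contents, or `None` if not found.
    fn resolve(
        &mut self,
        kind: IncludeKind,
        name: &str,
        includer: &Path,
    ) -> Option<(PathBuf, String)>;
}

/// In-memory resolver for illustration; a real one would read from disk.
pub struct MapResolver {
    pub files: HashMap<PathBuf, String>,
    pub search_paths: Vec<PathBuf>,
}

impl IncludeResolver for MapResolver {
    fn resolve(
        &mut self,
        kind: IncludeKind,
        name: &str,
        includer: &Path,
    ) -> Option<(PathBuf, String)> {
        let mut candidates = Vec::new();
        // Local includes try the including file's directory first.
        if kind == IncludeKind::Local {
            if let Some(dir) = includer.parent() {
                candidates.push(dir.join(name));
            }
        }
        // Both forms then fall back to the search paths, in order.
        candidates.extend(self.search_paths.iter().map(|p| p.join(name)));
        candidates.into_iter().find_map(|path| {
            let contents = self.files.get(&path)?.clone();
            Some((path, contents))
        })
    }
}
```

Keeping file access behind a trait like this would let the library stay free of filesystem code, at the cost of making every caller supply a resolver.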
jyn514 changed the title from "Separate parsing into a serde library" to "Separate parsing into a library" on Feb 10, 2020

jyn514 commented Mar 26, 2020

If #356 is implemented, the preprocessor could instead run between the lexer and the parser. This would clean up the somewhat hacky way the preprocessor currently consumes characters on the lexer's behalf. It would also allow people to opt in to or out of the preprocessor.
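A rough sketch of what "between the lexer and the parser" could mean: the preprocessor becomes an iterator adapter over tokens, and the parser accepts any token iterator. Everything here (`Token`, `Preprocessor`, `parse_sum`) is a hypothetical toy, not rcc's actual types, and the expansion handles only object-like macros without rescanning.

```rust
use std::collections::HashMap;

/// Toy token type standing in for the lexer's output.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Token {
    Ident(String),
    Int(i64),
    Plus,
}

/// Wraps any token stream and replaces identifiers that name a macro.
pub struct Preprocessor<I: Iterator<Item = Token>> {
    tokens: I,
    defines: HashMap<String, Vec<Token>>,
    /// Replacement tokens waiting to be emitted, stored in reverse order.
    pending: Vec<Token>,
}

impl<I: Iterator<Item = Token>> Preprocessor<I> {
    pub fn new(tokens: I, defines: HashMap<String, Vec<Token>>) -> Self {
        Preprocessor { tokens, defines, pending: Vec::new() }
    }
}

impl<I: Iterator<Item = Token>> Iterator for Preprocessor<I> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        loop {
            // Drain a pending macro body before pulling from the lexer.
            if let Some(tok) = self.pending.pop() {
                return Some(tok);
            }
            let tok = self.tokens.next()?;
            if let Token::Ident(name) = &tok {
                if let Some(body) = self.defines.get(name) {
                    self.pending.extend(body.iter().rev().cloned());
                    continue;
                }
            }
            return Some(tok);
        }
    }
}

/// Stand-in for the parser: sums integers, rejects unexpanded identifiers.
pub fn parse_sum(tokens: impl Iterator<Item = Token>) -> Option<i64> {
    let mut total = 0;
    for tok in tokens {
        match tok {
            Token::Int(n) => total += n,
            Token::Plus => {}
            Token::Ident(_) => return None,
        }
    }
    Some(total)
}
```

Opting out is then just a matter of handing the lexer's iterator to the parser directly instead of wrapping it in `Preprocessor::new`.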
