This repository has been archived by the owner on Jun 3, 2021. It is now read-only.
It would be really cool to have a separate library which serializes and deserializes C source code. Most of the serialization is already done in src/data in various impl Displays. The hard part will be factoring apart the lexer/parser from the rest of the compiler. The syntax part of this is already hard as noted in #151, but separating the preprocessor would also be somewhat difficult: I'd have to duplicate a fair bit of code (lexing tokens mostly) and have a way to pass locations around.
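The round-trip idea can be sketched with a toy token type; the names here are hypothetical stand-ins, not rcc's actual types, but they show how serialization back to C source can stay a plain Display impl like the ones in src/data:

```rust
use std::fmt;

// Hypothetical token type; variants are illustrative only.
#[derive(Clone, Debug, PartialEq)]
enum Token {
    Keyword(&'static str), // e.g. "int", "return"
    Identifier(String),
    IntLiteral(i64),
    Punct(char), // e.g. '{', ';'
}

// Serializing back to source is just Display, mirroring the
// `impl Display`s already living in src/data.
impl fmt::Display for Token {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Token::Keyword(k) => write!(f, "{}", k),
            Token::Identifier(name) => write!(f, "{}", name),
            Token::IntLiteral(n) => write!(f, "{}", n),
            Token::Punct(c) => write!(f, "{}", c),
        }
    }
}
```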
Related facts: the preprocessor accepts arbitrary text after #include as long as it is valid UTF-8 (including things that would normally be lexer errors), and it accepts #if expressions as long as they contain only integer constants (floating point constants are not allowed).
My proposed plan is this:
Have a lexer/parser combo. Possibly rewrite these from scratch using logos and LALRPOP. This will do absolutely no semantic checking, only parsing. This will be the library.
Before the parser runs, have a preprocessor. Serialize tokens to strings before passing to the lexer (or possibly don't have tokens at all? Is that feasible?). To allow #if, add an expr() API to the serde parser. To allow keeping track of multiple files, add a metadata field to Location:
This allows people who don't care to leave it blank (()) and people who do care to pass a FileId or similar.
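A minimal sketch of that metadata field, assuming a Location shaped roughly like rcc's (field names here are illustrative, only the generic parameter is the point):

```rust
// Hypothetical Location with a generic metadata slot; `T = ()` keeps the
// common single-file case zero-cost and zero-ceremony.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Location<T = ()> {
    line: u32,
    column: u32,
    metadata: T,
}

// A caller tracking multiple files might plug in a newtype index.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FileId(u32);
```

A caller who doesn't care writes `Location { line, column, metadata: () }`; a multi-file caller uses `Location<FileId>` with no change to the library.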
After the preprocessor runs, the compiler proceeds as normal: analysis -> constant folding -> codegen -> linking.
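The constant-folding stage can be illustrated with a toy expression tree (this is a generic sketch of the technique, not rcc's real IR or pass):

```rust
// A minimal constant-folding pass over a toy expression tree.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Recursively fold children, then collapse any all-constant node.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Int(a), Expr::Int(b)) => Expr::Int(a + b),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Int(a), Expr::Int(b)) => Expr::Int(a * b),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        leaf => leaf,
    }
}
```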
Open questions:
Should the preprocessor be part of the library? If so, how should we deal with #includes? (devsnek on #lang-dev recommended calling a user-defined function - that would need to be aware of local vs. global includes as well as search paths).
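The callback idea could look something like this: the preprocessor asks a user-supplied resolver for each #include, passing along whether the include was local ("...") or system (<...>). All names here are hypothetical:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug)]
enum IncludeKind {
    Local,  // #include "file.h" - search relative to the including file first
    System, // #include <file.h> - search only the configured include paths
}

// The preprocessor calls this for every #include it encounters.
trait IncludeResolver {
    fn resolve(&mut self, path: &str, kind: IncludeKind) -> Result<String, String>;
}

// An in-memory resolver, handy for tests or sandboxed embedders
// that don't want the library touching the filesystem.
struct MapResolver(HashMap<String, String>);

impl IncludeResolver for MapResolver {
    fn resolve(&mut self, path: &str, _kind: IncludeKind) -> Result<String, String> {
        self.0
            .get(path)
            .cloned()
            .ok_or_else(|| format!("{}: file not found", path))
    }
}
```

A default filesystem-backed resolver could still ship with the crate, so only embedders with unusual needs implement the trait themselves.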
Should this be a separate crate, or the same rcc crate with codegen behind a feature flag?
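The single-crate option might look like the following Cargo.toml fragment (feature and dependency names are illustrative, assuming the codegen backend is the heavy optional dependency):

```toml
# Hypothetical feature layout for a single rcc crate.
[features]
default = ["codegen"]
codegen = ["dep:cranelift"]

[dependencies]
cranelift = { version = "*", optional = true }
```

Library users who only want parsing would then depend on rcc with `default-features = false`.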
jyn514 changed the title from "Separate parsing into a serde library" to "Separate parsing into a library" on Feb 10, 2020.
If #356 is implemented, the preprocessor could instead run between the lexer and the parser. This would clean up the current somewhat hacky way the preprocessor consumes the lexer's characters for it. It would also allow people to opt-in/out of the preprocessor.
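In that design the preprocessor becomes an ordinary adaptor over the token stream, and opting out is just wiring the lexer straight to the parser. A rough sketch, with String standing in for a real token type:

```rust
// Sketch: the preprocessor as an iterator adaptor between lexer and parser.
// `String` stands in for a real Token type; the body is a placeholder that
// only drops directive markers, not a real preprocessor.
fn preprocess<I: Iterator<Item = String>>(tokens: I) -> impl Iterator<Item = String> {
    tokens.filter(|t| !t.starts_with('#'))
}
```

The parser only sees `impl Iterator<Item = Token>`, so it cannot tell whether a preprocessor ran at all.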
Suggested by @pythondude325. Related to #151.