Skip to content

Bind variable declaration to symbol #9

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 22, 2021
Merged

Bind variable declaration to symbol #9

merged 6 commits into from
Dec 22, 2021

Conversation

helixbass
Copy link
Owner

In this PR:

  • create symbol table and populate with a symbol corresponding to a variable declaration during binding

To test:
If you have a file named eg tmp.tsx in the repo root directory whose contents are const a: number = true and run cargo run tmp.tsx you should see debug-logging of the (pre-binding) SourceFile, then a post-binding: debug-logging of the SourceFile where the SourceFile node's locals is a symbol table mapping from the __String "a" to a Symbol with escaped_name: "a" and declarations a Vec containing the VariableDeclaration AST node that declared a, and the symbol on the BaseNode inside that VariableDeclaration should be a populated weak reference (to the Symbol) and then it should panic during type-checking

Based on variable-declaration

@@ -1,12 +1,15 @@
#![allow(non_upper_case_globals)]

use bitflags::bitflags;
use parking_lot::RwLock;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It kept feeling like it'd make sense to be able to "map" the contents of a RwLockReadGuard/RwLockWriteGuard so that a method could return eg a RwLockReadGuard wrapping some nested field of the thing the RwLock is wrapping. Per rust-lang/rust#44189, the standard library's RwLock doesn't support this but the parking_lot crate does. So I swapped in parking_lot's RwLock

self.container.try_read().unwrap().as_ref().unwrap().clone()
}

fn maybe_container(&self) -> Option<Rc<Node>> {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using maybe_*() naming convention for when I'm returning an Option vs returning the unwrapped field


fn add_declaration_to_symbol(&self, symbol: Rc<Symbol>, node: Rc<Node /*Declaration*/>) {
node.set_symbol(symbol.clone());
let declarations = {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block scoping is to ensure that the RwLockReadGuard returned by symbol.maybe_declarations() is dropped before we call .set_declarations() (which will want write access to the RwLock)

This whole pattern is a little wonky from a Rust perspective - appendIfUnique() mutates the passed array if it exists or else returns a new array, but since I can't give it ownership of the underlying existing symbol.declarations Vec it'd have to take an &mut to the Vec in order to avoid creating a new Vec, but that's not compatible with the passed-None-and-creates-new-Vec use case (since it'd have to return a full-fledged new Vec) (is this somewhere I could use Cow?)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't really have any better ideas than what you've done.

}

fn add_declaration_to_symbol(&self, symbol: Rc<Symbol>, node: Rc<Node /*Declaration*/>) {
node.set_symbol(symbol.clone());
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbol's get stored in a symbol table but also as a field on Node's, so I'm wrapping them in Rc's and storing a weak reference in BaseNode's symbol field

Some(name) => {
if true {
symbol = Some(Rc::new(self.create_symbol(SymbolFlags::None, name.clone())));
symbol_table.insert(name, symbol.as_ref().unwrap().clone());
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first time I'd used type aliasing in Rust I think, pretty slick how I can just call methods like .insert() (SymbolTable is aliased to a certain type of HashMap)

cb_nodes: TNodesCallback,
) {
if node.kind() <= SyntaxKind::LastToken {
return;
}
match &*node {
Node::VariableDeclaration(variable_declaration) => {
visit_node(&mut cb_node, Some(variable_declaration.name()));
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler was complaining that this call of visit_node() takes ownership of the cb_node closure argument (and then I try and call visit_node() with it again right after this), which I guess makes sense, so I modified visit_node() to take a reference to a closure instead of a closure


pub fn get_escaped_text_of_identifier_or_literal(node: Rc<Node>) -> __String {
if is_member_name(&*node) {
node.as_member_name().escaped_text()
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this node.as_*() pattern, eg I exposed a .as_member_name() method on Node that panics unless it can unwrap the Node into one of the AST node sub-variants that implements MemberNameInterface (and it returns a &dyn MemberNameInterface). It's a little bit like what I was thinking of doing via a macro (eg unwrap_ast_node!(node, BinaryExpression)) for unwrapping to concrete AST node types but for unwrapping to a certain AST node interface instead

@@ -203,6 +302,8 @@ pub struct BaseNode {
pub parent: RwLock<Option<Weak<Node>>>,
pub pos: AtomicUsize,
pub end: AtomicUsize,
pub symbol: RefCell<Option<Weak<Symbol>>>,
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started wondering why I was using RwLock instead of RefCell (since it seems like you see Rc<RefCell<...>>'s more than Rc<RwLock<...>>'s). I'm not totally sure but I think I can replace a lot of the RwLock's with RefCell's and that per eg this article RefCell's sound cheaper/faster

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, yeah I had no idea that RefCell was just the single-threaded version of RWLock. (The naming doesn't really help there.)

}

fn locals(&self) -> MappedRwLockWriteGuard<SymbolTable> {
RwLockWriteGuard::map(self.locals.try_write().unwrap(), |option| {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a usage of the RwLockWriteGuard::map() that parking_lot provides


fn add_declaration_to_symbol(&self, symbol: Rc<Symbol>, node: Rc<Node /*Declaration*/>) {
node.set_symbol(symbol.clone());
let declarations = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't really have any better ideas than what you've done.

} else {
None
};
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it could be expressed with and_then (flat map) instead of an if-statement, but maybe isn't compatible with whatever will replace unimplemented!.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a return inside a closure body return from the enclosing function (like a Ruby block) or just from the closure (relevant here because this block that would go inside .and_then() needs to return from the function)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just from the closure

array
} else {
vec![to_add]
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be written using one of the map_or* functions.

@@ -203,6 +302,8 @@ pub struct BaseNode {
pub parent: RwLock<Option<Weak<Node>>>,
pub pos: AtomicUsize,
pub end: AtomicUsize,
pub symbol: RefCell<Option<Weak<Symbol>>>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, yeah I had no idea that RefCell was just the single-threaded version of RWLock. (The naming doesn't really help there.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants