Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create <whitespace> and <newline>, (, ), etc nodes rather than throwing out tokens #16

Open
bcoe opened this issue Dec 24, 2020 · 6 comments
Labels
enhancement New feature or request

Comments

@bcoe
Copy link
Member

bcoe commented Dec 24, 2020

So that there's no lost information in the parse tree, we should create <whitespace> and <newline> tokens for spacing characters that fall outside of <text> nodes.

@bcoe bcoe added the enhancement New feature or request label Dec 24, 2020
@wesleytodd
Copy link

I realized after submitting that review that the pr was merged, so just to keep the context organized I figured I would re-post it here to see if we need to follow up:


I might be wrong here, but I think we need a <blankline> token which is <blankline> ::= <newline>, <newline>+ as the separators between the <summary>, <body>, & <footer>. If I read this correctly as it, you would have a valid summary/body as (only one LF):

fix: summary line
this is my body

Am I missing something here? The implementation looks right, but I think it is just miss-represented in the grammar spec.

#26 (review)

@bcoe bcoe changed the title Create <whitespace> and <newline> nodes rather than throwing out tokens Create <whitespace> and <newline>, (, ), etc nodes rather than throwing out tokens Dec 29, 2020
@bcoe
Copy link
Member Author

bcoe commented Dec 29, 2020

Am I missing something here? The implementation looks right, but I think it is just miss-represented in the grammar spec.

@wesleytodd this was intentional on my part, I see no reason that:

fix: summary line
this is my body

shouldn't parse correctly, since it doesn't cause any trouble for the grammar.

@bcoe
Copy link
Member Author

bcoe commented Dec 29, 2020

@byCedric @wesleytodd having been playing around a bit with a tree visitor, it jumps out at me that we'll probably also want to add (, ), and any other tokens I might be missing? to the tree.

This allows us to do something like visit(summary, () => true, (node) => { if (node.value) { text += node.value} }), to put a portion of the tree back together into raw text.

@wesleytodd
Copy link

this was intentional on my part

Not opposed to this, but it is a change (might be breaking even?) from the current spec. What does the current parser do in this case?

This allows us to do something like visit(summary, () => true, (node) => { if (node.value) { text += node.value} }), to put a portion of the tree back together into raw text.

This was my main concern with not having these. If you want to modify and stringify, these markers are very helpful to maintain exact formatting.

@byCedric
Copy link
Contributor

@byCedric @wesleytodd having been playing around a bit with a tree visitor, it jumps out at me that we'll probably also want to add (, ), and any other tokens I might be missing? to the tree.

Yeah, it feels a bit odd that we remove those scopes. In this change I had to mitigate it by doing some stuff differently.

One thing we could add is something similar to nlcst's punctuation. Not sure if we have more than the scope parenthesis, but if we do we might want to consider that.

@bcoe
Copy link
Member Author

bcoe commented Dec 30, 2020

Not opposed to this, but it is a change (might be breaking even?) from the current spec.

I think I've made a few breaking changes as we transcribe the grammar. I'm trying to channel all the things I've seen colleagues accidentally get wrong, and then reach out to me about.

Making the grammar as forgiving as possible, but then having a linter on top of it, I'd say is the ideal. We can have a linter indicate that it's recommended to have \n\n between the summary and body, while still parsing appropriately.

Yeah, it feels a bit odd that we remove those scopes. In this change I had to mitigate it by doing some stuff differently.

I'm happy with a Punctuation node, if it's only parenthesis we don't have in the tree right now, perhaps we could even just call it a Parenthesis node?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants