Skip to content

Conversation

avh4
Copy link
Contributor

@avh4 avh4 commented May 1, 2022

Note there are two competing concerns w/r to laziness: 1) make sure we copy the content from the ByteString buffer before it's deallocated, and 2) try not to do extra work if the result is going to be discarded. I think I did an 80% of ideal here, erring on the side of (1). Notably, in the case of DocComments, I'm pretty sure there's double copying that could be cleaned up later; and also maybe the ByteString buffer is deallocated later than I think, in which case we could probably skip the copying by using P.Snippet instead of Utf8.Utf8 like DocComment does, if we're confident that the original ByteString will still be around until we finish formatting.

The next step after this is #58

@avh4 avh4 self-assigned this May 1, 2022
@avh4 avh4 mentioned this pull request May 1, 2022
31 tasks
@avh4 avh4 force-pushed the avh4/parse-comments branch from 4de0eef to b72a2d2 Compare May 1, 2022 22:16
@avh4 avh4 changed the title [WIP] Whitespace parser parses comments Whitespace parser parses comments May 1, 2022
@avh4 avh4 requested a review from robinheghan May 2, 2022 00:04
This also removes -fno-unused-do-binds from modules that had it inline,
and enables -Wall for the general build.
@avh4 avh4 force-pushed the avh4/parse-comments branch from d9ed4c7 to 809cf42 Compare May 2, 2022 00:26
Copy link
Member

@robinheghan robinheghan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

make sure we copy the content from the ByteString buffer before it's deallocated

By copy you mean the Utf8.fromPtr calls, right?

@@ -329,7 +333,7 @@ chompSubscription =
spaces_em
addLocation (Var.upper E.Effect)

spaces_em :: Parser E.Module ()
spaces_em :: Parser E.Module [Src.Comment]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change () to [Src.Comment] here? Is it for easier composition later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really look where spaces_em is used yet, but I just pushed that up since this is just a wrapper for Space.compAndCheckIndent. My plan to making sure we don't accidentally discard any comments is to keep pushing the [Src.Comment] up at every level until we get to the parse that actually creates the AST node, and rely on the unused-bind warning to tell us when there are still comments that are discarded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense :)

@@ -1,7 +1,6 @@
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE UnboxedTuples #-}
{-# OPTIONS_GHC -Wall -fno-warn-unused-do-bind #-}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great that you clean up stuff like this along the way 💯

chompAndCheckIndent toSpaceError toIndentError =
P.Parser $ \(P.State src pos end indent row col) cok _ cerr _ ->
let (# status, newPos, newRow, newCol #) = eatSpaces pos end row col
let (# status, newPos, newRow, newCol #) = eatSpaces pos end row col []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to your change: what's up with the (# syntax here? I assume it's a special type of tuple. Stack-allocated, maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's an unboxed tuple, so I guess that means all four return values will be allocated on the stack, instead of allocating a Tuple object? I've never really used it before seeing it here in the parser.


data MultiStatus
= MultiGood
= MultiGood !(Utf8.Utf8 Src.GREN_COMMENT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bang operator is a way to opt-out of lazyness, right? Sorry about the noise, just checking my understanding as I'm reading 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. (And also this is something I haven't used much myself apart from seeing it here.) I'm not quite sure if it means that the argument to MultiGood will be evaluated before creating the MultiStatus value, or if it means that the argument will be forced at the same time that the MultiStatus value is evaluated.

@avh4
Copy link
Contributor Author

avh4 commented May 3, 2022

By copy you mean the Utf8.fromPtr calls, right?

Yep.

@avh4 avh4 merged commit 23bb058 into main May 22, 2022
@avh4 avh4 deleted the avh4/parse-comments branch May 22, 2022 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants