-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle leading whitespace on code blocks #134
Handle leading whitespace on code blocks #134
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bobbypriambodo, thanks! It looks good, and the new tests make it easy to review :)
-
Indeed, we should avoid the
Str
module. There are two alternatives.- The better one is probably that odoc depends on astring, which has
Astring.cuts
. As a side note, astring should be listed as a direct dependency of odoc. Right now, we are picking it up transitively. We can save this for a later issue, or you're welcome to tweak theopam
file now :) - We have some quickly-made function for this in the parser. Maybe that should be replaced by astring as well, but there is also the option to factor it out and call it from the lexer.
- The better one is probably that odoc depends on astring, which has
-
I agree that it's better to avoid dealing with mixes of spaces and tabs. We can wait until someone submits an issue showing us what actually needs to be fixed. I suspect the case will be a number of tabs followed by a number of spaces, but such source formatting is often inconsistent in how many leading tabs there will be, so the appearance will depend on the tab width, and basically we will have a mess. It's difficult even to handle pure tabs correctly when it comes to display, since we don't know the width.
src/parser/lexer.mll
Outdated
let least_amount_of_whitespace = | ||
lines | ||
|> List.map count_leading_whitespace | ||
|> List.fold_left min max_int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a comment, that using max_int
is okay because if the list is empty, we will not use the result of the fold
(in the map
below).
Thanks for the review. I went with the astring approach since it touches less files :D I've also added it to the Thanks for catching on Do tell if you have more concerns, would be happy to address them. |
Thanks for all that, it looks good to me :) The |
@bobbypriambodo There is one last thing remaining. I think we need to filter out empty or all-whitespace lines from computing the minimum:
In that example, the blank line is completely empty, so even when I indent the first and last line by two spaces, the blank line, which has no characters, forces the overall amount of leading whitespace to be computed as zero. Blank lines could have whitespace characters in general, though, even though that's sloppy coding :) Of course, after ignoring the blank line, we will get two for the amount of leading space, so we have to be prepared for the blank line having less characters than that to substring out. Let me know if you'd like me to fix this, but a quick follow-on PR is also welcome :) |
Ah, that is true, I already tried it on several examples but none had that characteristics. Sorry for that. I think I'll be able to fix it tomorrow (it's past 11 PM on this part of the world). If that's not quick enough then please handle it as you see fit. Thanks. |
Thanks, that is fine :) and sorry for not catching right away. |
Tries to address #133.
The implementation brings in
Str
module to split strings (sincesplit_on_chars
is available only since 4.04.0). Let me know if you have concerns.I also wondered whether we need to specially handle weird spaces and tabs combo, but decided not to since I don't think there's a use case for that.