-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move line continuation handling to io.c #2215
Move line continuation handling to io.c #2215
Conversation
15dfae8
to
6765c0b
Compare
A test was failing, which did this:
The problem came from a difference in behavior between (kernel) string streams and file streams in To fix this, I changed the string stream code to ignore all This still is not quite perfect, as regular file streams could in border cases also produce a line buffer ending in Another fix is that I change the prompt when we do an explicit line continuation (which is much easier now that we handle them in a single place). As a result, this example (in master)
becomes this:
|
BTW, I am aware that this PR is missing tests, I'll work on that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for the missing tests, I think this is ok. The only thing that irked me a bit was the #\
change in lib/stbcckt.gi
, but after playing around a bit I realised that the change in the scanner that causes this change to be necessary makes it more consistent.
cfcac20
to
05f0a1d
Compare
05f0a1d
to
90ca5bb
Compare
Codecov Report
@@ Coverage Diff @@
## master #2215 +/- ##
==========================================
+ Coverage 73.13% 73.26% +0.12%
==========================================
Files 481 482 +1
Lines 246510 265819 +19309
==========================================
+ Hits 180286 194739 +14453
- Misses 66224 71080 +4856
|
90ca5bb
to
867141c
Compare
I have now changed this, as discussed on issue #2201, to not handle line continuations inside of comments, which thus made it possible to drop the changes to In any case, this PR here still lacks explicit test cases; I'll work on them when I am back from vacation. Also, it contains a ton of refactoring to the scanner code which was enabled by the (re)moved line continuation code. I could put that into a separate PR, to simplify reviewing, too. |
867141c
to
dc177b3
Compare
I have now added tests. Indeed, now the first commit adds some tests for line continuations. Then, as mentioned in #2201, a new helper function The rest then is a ton of refactoring and simplification in the scanner, which is enabled by the code simplification thanks to the moving of the line continuation code. Another improvement is that now the line continuation code always changesthe GAP prompt to the line continuation prompt Thing is, all these extra commits likely make reviewing this harder than it is. I'll look into moving them into one or multiple additional PRs, and reducing this PR here to the bare minimum, but that'll have to wait for another day. In the meantime, feel free to already review parts or all of this PR. |
src/io.c
Outdated
if (!*STATE(In)) | ||
GetLine(); | ||
else if (STATE(In)[1] == '\n') { | ||
STATE(In) += 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment here which indicates that a line continuation was detected?
Also: the term "line continuation" should be documented somewhere (resp. be replaced by a different, documented name, if somebody has a good suggestion?)
src/io.c
Outdated
STATE(Prompt) = ""; | ||
} | ||
else if (STATE(In)[1] == '\r') | ||
STATE(In) += (STATE(In)[2] == '\n') ? 3 : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mention here that this is about handling Windows/DOS style CR+LF?!?
src/io.c
Outdated
@@ -220,6 +253,14 @@ Char PEEK_CURR_CHAR(void) | |||
return *STATE(In); | |||
} | |||
|
|||
void IGNORE_REST_OF_LINE(void) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestioins for a better name? SKIP_TO_END_OF_LINE
? SKIP_LINE
? ...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SKIP_TO_END_OF_LINE sounds good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to SKIP_TO_END_OF_LINE
dc177b3
to
4da519c
Compare
src/scanner.c
Outdated
/* handle escape sequences */ | ||
if ( c == '\\' ) { | ||
c = GET_NEXT_CHAR(); | ||
/* if next is another '\\' followed by '\n' it must be ignored */ | ||
while ( c == '\\' && PEEK_NEXT_CHAR() == '\n' ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR gets rid of two out of three uses of PEEK_NEXT_CHAR
. The only remaining one is in GetNumber
and is used to disambiguate floats from range expressions, i.e. [1.4]
vs. [1..4]
.
This greatly simplifies dealing with CR+LF Windows line ends, and also removes a behavioral difference between string streams and regular streams: the former treated an isolate CR as a line terminator, and stopped filling the buffer after encountering one, so for a CRLF, one would get a buffer ending in just CR; while for a regular file stream, one would see a buffer ending CR followed by LF.
4da519c
to
9c704a9
Compare
I have now reduced this PR to its bare minimum. Further refactoring and fixes will come in separate PRs. |
src/io.c
Outdated
|
||
// handle line continuation, i.e., backslash followed by new line or CRLF; | ||
// and also the end of a line in general | ||
while (*STATE(In) == '\\' || *STATE(In) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of interest, why do we handle null characters like backslashes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't, we just have to keep reading for both: for null chars, we have to read more data via GetLine()
, then start over; and for backslash, we have to see if next comes a lineend, and if so, skip the backslash and the lineend. And in theory, either of these can happen after the other, hence the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that make sense? Would a better comment help? Suggestions welcome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this again, I didn't realise that the second part of 201 and 202 are the same case. Maybe change 202 to if(*STATE(In) == 0)
, which would make it more obvious that is matches the second condition in 201?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do, and will try to write some comments, too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
6a13163
to
40e08f5
Compare
... and not in the scanner. This is much simpler, and also ensures uniform treatment of line continuations everywhere. Several new test cases are added to demonstrate this. This leads to one change in behavior: line continuations inside of triple quoted strings are now handled, while before they would just insert a backslash followed by a newline into the string. This change is intentional. A test case is adjusted accordingly.
40e08f5
to
29bb736
Compare
Resolve #2201.