-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel: simplify scanning of large string literals #2381
Conversation
Instead of going back and forth between the reader and scanner, using states S_PARTIALSTRING or S_PARTIALTRIPSTRING, simply accumulate the whole string into a GAP string object right away. To avoid a potential speed penalty caused by constantly dereferencing a GAP string object (as reading a character could cause a garbage collection), we first accumulate the data we read into a buffer on the stack, and then flush the whole buffer into the GAP string whenever it is full.
Codecov Report
@@ Coverage Diff @@
## master #2381 +/- ##
==========================================
- Coverage 73.69% 73.69% -0.01%
==========================================
Files 484 484
Lines 245655 245668 +13
==========================================
+ Hits 181036 181042 +6
- Misses 64619 64626 +7
|
@@ -986,7 +996,7 @@ void GetSymbol ( void ) | |||
case '^': STATE(Symbol) = S_POW; GET_NEXT_CHAR(); break; | |||
|
|||
case '~': STATE(Symbol) = S_TILDE; GET_NEXT_CHAR(); break; | |||
case '?': STATE(Symbol) = S_HELP; GetHelp(); break; | |||
case '?': GetHelp(); break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why's this gone? I can't see any bug it could have been causing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not gone, I moved it into GetHelp
for symmetry with the other GetFOO
functions in the following lines.
C_NEW_STRING(topic, STATE(ValueLen), (void *)STATE(Value)); | ||
TRY_READ { IntrHelp(topic); }; | ||
TRY_READ { IntrHelp(STATE(ValueObj)); } | ||
STATE(ValueObj) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why clean ValueObj?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paranoia, I guess: The idea is that we already used this string in IntrHelp
, and don't want to accidentally reuse it elsewhere (say, due to a big, where some function expects the scanner to set ValueObj
, but this doesn't happen -- either due to a bug in the scanner, or because the calling code was n fact wrong to assume the scanner would set ValueObj
).
In any case, I have another change coming up in another PR where I reset ValueObj
and Value
at the start of Match
, which would make this change partially obsolete (but not generally, I guess).
Anyway, if you are concerned about it now, I can get rid of it before merging.
This PR simplifies scanning of large string literals. As a side effect, we get rid of
STATE(ValueLen)
(this requires also tweaking the scanner code forS_HELP
). To achieve this, we introduceObj STATE(ValueObj)
which is now used by the string scanning code to store its result; since the scanned data is turned into a string object later on anyway, there is no harm in this.As an optimization, we use an intermediate (stack based) buffer in which we accumulate data, until the buffer is full resp. we are done scanning, when we transfer the data into the final string object. This way, we don't have to worry about garbage collection, and the compiler can emit better code.
Subset of PR #2371