-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parser: make parser single-pass #61
Conversation
|
||
Type type; | ||
if (!GetType(to_parse, to_parse + size, &type)) { | ||
|
||
size_t start_pos = internal::SkipToNextToken(str, end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be technically correct, it should probably be ptrdiff_t
, not size_t
. Though they are the same things on x86/x86_64 and most other platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it is not the case, because SkipToNextToken
returns size_t
, which is fine (there is no pointer subtraction to get result in this function), but generally you are right and there are other occurrences where it can be important. I guess, I will still review and check for any issues that may be caused by this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I investigated this topic and found out that the main difference between size_t
and ptrdiff_t
is that former is unsigned integer, whereas latter is signed.
Also, there are a lot of arguments on this topic and generally it is recommended to use signed types for array indexing as well as pointer arithmetic.
At the same time Google C++ Style Guide says:
When appropriate, you are welcome to use standard types like size_t and ptrdiff_t.
We use int very often, for integers we know are not going to be too big, e.g., loop counters. Use plain old int for such things. You should assume that an int is at least 32 bits, but don't assume that it has more than 32 bits. If you need a 64-bit integer type, use int64_t or uint64_t.
I guess it will be better to substitute almost all size_t
uses with ptrdiff_t
or plain int
, but we need to discuss which one will suit us better.
Also I think we need new issue and corresponding PR for this topic, so that we can talk it over and weigh all the pros and cons.
if (size != parsed_size) { | ||
(kParseFunctions[type])(isolate, str + start_pos, end, &parsed_size); | ||
|
||
parsed_size += internal::SkipToNextToken(str + start_pos + parsed_size, end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not increment str
pointer after each SkipToNextToken
call? We don't need the initial address anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For two reasons: it is const char*
and to enable universal check on line 98.
bool is_single_line; | ||
|
||
switch (str[1]) { | ||
case '/': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for braces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just followed Google C++ Styleguide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the Google C++ Styleguide states:
case blocks in switch statements can have curly braces or not, depending on your preference.
Then we need to decide whether we use them or don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked it more with braces, what is your opinion on this topic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDK, I would not use them and have never used them before in such cases, but I checked it out in my editor and this code snippet indeed seems to look prettier and more readable with braces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's stick with braces then. We need to update it in other switch-statements, should I apply it in this pull request, or it is the case for another one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better raise a new PR.
} | ||
case '*': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
src/jsrs_parser.cc
Outdated
is_single_line = false; | ||
break; | ||
} | ||
default: { // In case it is not a comment start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- No need for braces.
- Two spaces before a single-line comment.
result[j++] = str[i]; | ||
} | ||
} | ||
if (is_single_line) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a ternary operator, though it's up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
@belochub I don't feel well now so I will review the rest of the PR tomorrow. |
a95be15
to
1199fc3
Compare
Rebased on master and fixed indentation in single-line comment. |
@belochub mind changing |
return String::Empty(isolate); | ||
} | ||
strncpy(result + res_index, symb, out_offset); | ||
delete[] symb; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to get rid of dynamic memory allocation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can allocate a buffer on stack and pass it to GetControlChar
by pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR, though.
src/jsrs_parser.cc
Outdated
@@ -524,25 +545,21 @@ Local<String> ParseKeyInObject(Isolate* isolate, | |||
} else { | |||
size_t current_length = 0; | |||
for (size_t i = 0; i < *size; i++) { | |||
if (begin[i] == ':') { | |||
if (begin[i] == '_' || (i != 0 ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe decouple the check into an inline function?
if (current_length != 0) { | ||
result = String::NewFromUtf8(isolate, begin, | ||
NewStringType::kInternalized, | ||
static_cast<int>(current_length)) | ||
.ToLocalChecked(); | ||
break; | ||
} else { | ||
THROW_EXCEPTION(SyntaxError, "Unexpected token :"); | ||
THROW_EXCEPTION(SyntaxError, "Unexpected identifier"); | ||
return Local<String>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like it to be consistent. In some cases we use Local<String>()
, in some cases — String::Empty(Isolate)
. Though it is the matter of a different PR since it was not introduced in this one.
break; | ||
} | ||
current_key = ParseKeyInObject(isolate, | ||
begin + i, | ||
end, | ||
¤t_length); | ||
i += current_length; | ||
i += SkipToNextToken(begin + i, end); | ||
if (begin[i] != ':') { | ||
THROW_EXCEPTION(SyntaxError, "Unexpected token"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change the message to the more meaningful "Colon expected"
?
src/packet_parser.cc
Outdated
size_t parsed_size = 0; | ||
const char* source = PrepareString(isolate, *in, in.length(), &total_size); | ||
// const char* source = PrepareString(isolate, *in, in.length(), &total_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented-out line should be deleted.
src/packet_parser.cc
Outdated
@@ -29,9 +29,10 @@ namespace packet_parser { | |||
Local<String> ParseNetworkPackets(Isolate* isolate, | |||
const String::Utf8Value& in, | |||
Local<Array> out) { | |||
size_t total_size = 0; | |||
size_t total_size = in.length(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can now be const size_t
.
LGTM if the commented line is removed. Other points are discussable. |
* Eliminate `jstp::parser::internal::PrepareString` function thus avoiding extra memory copying. * Fix deleting comment blocks located inside of strings. * Fix key parsing not throwing error when it included spaces but was not enclosed in quotes. Fixes: #60
1199fc3
to
aa8c617
Compare
@aqrln updated to resolve some of the issues you addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@belochub other issues mentioned have to be extracted and posted on the issue tracker. Will you do it or I should take care of it? |
@aqrln, I already have an idea for a new PR fixing some of the issues, so I guess I will also extract other issues as well. |
Landed in 3b2c1d3. |
This is a new and shiny first major release for `metarhia-jstp`. Changes include API refactoring and improvements, implementations of CLI, sessions, and application versions, native addon build optimizations, lots of bug fixes, test coverage increase, and other, less notable changes. This release also denotes the bump of the protocol version to v1.0. The only difference from the previous version of the protocol is that "old" heartbeat messages (`{}`) are now deprecated and `ping`/`pong` messages must be used for this purpose instead. Notable changes: * **src,build:** improve the native module subsystem *(Alexey Orlenko)* [#36](#36) **\[semver-minor\]** * **build:** compile in ISO C++11 mode *(Alexey Orlenko)* [#37](#37) **\[semver-minor\]** * **build:** improve error handling *(Alexey Orlenko)* [#40](#40) **\[semver-minor\]** * **lib:** refactor record-serialization.js *(Alexey Orlenko)* [#41](#41) * **parser:** fix a possible memory leak *(Alexey Orlenko)* [#44](#44) **\[semver-minor\]** * **protocol:** change the format of handshake packets *(Alexey Orlenko)* [#54](#54) **\[semver-major\]** * **parser:** make parser single-pass *(Mykola Bilochub)* [#61](#61) * **parser:** remove special case for '\0' literal *(Mykola Bilochub)* [#68](#68) **\[semver-major\]** * **parser:** fix bug causing node to crash *(Mykola Bilochub)* [#75](#75) * **client:** drop redundant callback argument *(Alexey Orlenko)* [#104](#104) **\[semver-major\]** * **client:** handle errors in connectAndInspect *(Alexey Orlenko)* [#105](#105) **\[semver-major\]** * **socket,ws:** use socket.destroy() properly *(Alexey Orlenko)* [#84](#84) **\[semver-major\]** * **cli:** add basic implementation *(Mykola Bilochub)* [#107](#107) **\[semver-minor\]** * **connection:** fix error handling in optional cbs *(Alexey Orlenko)* [#147](#147) **\[semver-major\]** * **test:** add JSON5 specs test suite *(Alexey Orlenko)* [#158](#158) * **lib:** change event signature *(Denys Otrishko)* [#187](#187) **\[semver-major\]** * **lib:** add address method to Server *(Denys Otrishko)* [#190](#190) **\[semver-minor\]** * **parser:** implement NaN and Infinity parsing *(Mykola Bilochub)* [#201](#201) * **parser:** improve string parsing performance *(Mykola Bilochub)* [#220](#220) * **lib:** optimize connection events *(Denys Otrishko)* [#222](#222) **\[semver-major\]** * **lib:** refactor server and client API *(Denys Otrishko)* [#209](#209) **\[semver-major\]** * **lib,src:** rename term packet usages to message *(Denys Otrishko)* [#270](#270) **\[semver-major\]** * **lib:** emit events about connection messages *(Denys Otrishko)* [#252](#252) **\[semver-minor\]** * **lib:** implement API versioning *(Denys Otrishko)* [#231](#231) **\[semver-minor\]** * **lib:** allow to set event handlers in application *(Denys Otrishko)* [#286](#286) **\[semver-minor\]** * **lib:** allow to broadcast events from server *(Denys Otrishko)* [#287](#287) **\[semver-minor\]** * **connection:** make callback method private *(Alexey Orlenko)* [#306](#306) **\[semver-major\]** * **lib:** implement sessions *(Mykola Bilochub)* [#289](#289) **\[semver-major\]** * **connection:** use ping-pong instead of heartbeat *(Dmytro Nechai)* [#303](#303) **\[semver-major\]**
* Eliminate `jstp::parser::internal::PrepareString` function thus avoiding extra memory copying. * Fix deleting comment blocks located inside of strings. * Fix key parsing not throwing error when it included spaces but was not enclosed in quotes. Fixes: metarhia/jstp#60 PR-URL: metarhia/jstp#61
* Eliminate `jstp::parser::internal::PrepareString` function thus avoiding extra memory copying. * Fix deleting comment blocks located inside of strings. * Fix key parsing not throwing error when it included spaces but was not enclosed in quotes. Fixes: metarhia/jstp#60 PR-URL: metarhia/jstp#61
* Eliminate `jstp::parser::internal::PrepareString` function thus avoiding extra memory copying. * Fix deleting comment blocks located inside of strings. * Fix key parsing not throwing error when it included spaces but was not enclosed in quotes. Fixes: metarhia/jstp#60 PR-URL: metarhia/jstp#61
jstp::parser::internal::PrepareString
function thus avoiding extra memory copying.Fixes: #60