lib,parser: optimize parser and socket transport #254

belochub · 2017-07-12T10:08:03Z

Optimize the parser by using Buffers without converting them to strings where possible, and by avoiding ToString calls when parsing strings.
Change the socket transport to work with Buffers instead of strings to avoid repeated, unnecessary conversions.
Refactor (reimplement) the parseNetworkPackets function to make it more verbose, improving readability, and change it to work on Buffers instead of strings.
Avoid excessive copying when a message is split and received in parts (experimental).

This PR is somewhat experimental in places (mostly the lib/socket.js changes) and is open to comments with better ideas on how to improve it.
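To illustrate what working on Buffers instead of strings can look like, here is a minimal sketch. `splitPackets` is a hypothetical helper written for this example, not the PR's parseNetworkPackets; it only assumes that messages are terminated by a `'\0'` byte, as the review below implies.

```javascript
'use strict';

// Hypothetical helper (not from the PR): splits a Buffer into complete
// '\0'-terminated messages and returns the unterminated tail, if any.
const splitPackets = (chunk) => {
  const messages = [];
  let start = 0;
  let end = chunk.indexOf(0, start);
  while (end !== -1) {
    // subarray() creates a view over the same memory, so nothing is copied
    // and no string conversion happens here.
    messages.push(chunk.subarray(start, end));
    start = end + 1;
    end = chunk.indexOf(0, start);
  }
  const remainder = start < chunk.length ? chunk.subarray(start) : null;
  return { messages, remainder };
};

const { messages, remainder } = splitPackets(
  Buffer.from('{a:1}\0{b:2}\0{c:')
);
console.log(messages.map((m) => m.toString())); // [ '{a:1}', '{b:2}' ]
console.log(remainder.toString()); // {c:
```

The strings are produced here only for printing; a parser that accepts Buffers could consume the views directly.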

lundibundi · 2017-07-12T12:12:23Z

lib/socket.js

+    if (this._buffer) {
+      newPartLength = chunk.indexOf(0);
+      if (newPartLength === -1) {
+        this._buffer = Buffer.concat(


Maybe it's worth trying to allocate a big buffer and manually append data to it if there is enough space? If there is not enough space, copy both of them into a new, bigger buffer, until its size grows to MAX_PACKET_SIZE. (The same approach may be used in _parseRemains to avoid creating a new buffer two more times.) Or were there problems with this approach?

What do you mean by "append"? Isn't it just copying from one buffer into another?

Yes, but AFAIK concat always creates a new buffer and copies the passed buffers into it. I propose to avoid copying both buffers when the first one has enough space to fit the second, and to just copy the second buffer into the first one, hence 'append'.

The size of the buffer will always be equal to the size of the data inside it, because it comes from the 'data' event on the socket, so there is never enough space to copy the second buffer into it. And you can't just resize the buffer, because it is just a view over the underlying ArrayBuffer; you can only create a new one.
I still can't understand how you want this to be implemented.
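A small sketch of the Buffer behaviour referred to here: `Buffer.concat()` copies into a newly allocated Buffer, `subarray()` returns a view over the same memory, and the length of an existing Buffer is fixed. The variable names are made up for the example.

```javascript
'use strict';

const first = Buffer.from('abc');
const second = Buffer.from('def');

// Buffer.concat() allocates a new Buffer and copies both inputs into it.
const joined = Buffer.concat([first, second]);
first[0] = 0x7a; // 'z'
console.log(joined.toString()); // abcdef (unaffected by the write above)

// subarray() returns a view over the same memory, so writes are shared.
const view = joined.subarray(0, 3);
view[0] = 0x7a;
console.log(joined.toString()); // zbcdef

// The length of an existing Buffer cannot be changed afterwards.
console.log(joined.length); // 6
```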

I meant this for the case where we receive consecutive messages without '\0' in them. When the first such message arrives, we copy its contents into a buffer owned by this class that has enough space (with space to spare); if it doesn't, we replace it with a bigger one and copy. Then, on the second and subsequent half-messages, we don't have to copy what we've already received and only append the newly received data to the buffer we have in the class.
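The proposal above could be sketched roughly as follows. `GrowableBuffer` is a hypothetical class written for illustration and is not code from the PR; the value of `MAX_PACKET_SIZE` and the doubling growth policy are assumptions, since the thread names the constant but gives neither.

```javascript
'use strict';

// Assumed value; the thread mentions MAX_PACKET_SIZE but not its size.
const MAX_PACKET_SIZE = 8 * 1024 * 1024;

// Hypothetical accumulator with spare capacity (not from the PR).
class GrowableBuffer {
  constructor(initialCapacity = 1024) {
    this._storage = Buffer.allocUnsafe(Math.max(initialCapacity, 1));
    this._length = 0;
  }

  append(chunk) {
    const required = this._length + chunk.length;
    if (required > MAX_PACKET_SIZE) {
      throw new RangeError('Packet exceeds MAX_PACKET_SIZE');
    }
    if (required > this._storage.length) {
      // Not enough space: grow geometrically (an assumed policy) and copy
      // the accumulated data once into the bigger buffer.
      let capacity = this._storage.length;
      while (capacity < required) capacity *= 2;
      capacity = Math.min(capacity, MAX_PACKET_SIZE);
      const bigger = Buffer.allocUnsafe(capacity);
      this._storage.copy(bigger, 0, 0, this._length);
      this._storage = bigger;
    }
    // Enough space: only the newly received chunk is copied.
    chunk.copy(this._storage, this._length);
    this._length = required;
  }

  // Returns a view over the accumulated bytes without copying them.
  contents() {
    return this._storage.subarray(0, this._length);
  }
}

const acc = new GrowableBuffer(4);
acc.append(Buffer.from('{a:'));
acc.append(Buffer.from('[1,2'));
acc.append(Buffer.from(']}'));
console.log(acc.contents().toString()); // {a:[1,2]}
```

The trade-off raised later in the thread applies: the spare capacity costs memory, and the benefit only shows up when a message arrives in more than two parts.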

@lundibundi

now when we receive first half-message we concat it with null(initial state of class buffer)

Where does this happen exactly? I don't recall this behavior at all.

upon each next half-message we receive

There literally can only be two half-messages :)

then upon receiving consequent half-messages

Same here, can you please explain what you mean by "consequent half-messages"?

This way we will only copy our accumulated data if there is not enough space in the class buffer (and we will grow it by more than just the size of the received data), hence avoiding a lot of copying in case of very fragmented data.

Yes, and I was talking exactly about the growing part: it will lead to almost the same amount of copying we have now (see the first paragraph in #254 (comment)) and to increased memory usage. I don't understand how this will be better for performance overall.

@belochub Oh yeah, we don't concat with null, we just assign it to _buffer.

If we assume that

There literally can only be two half-messages :)

Then yes, there will be the same amount of copying.
If we don't need to handle the case where there are more than 2 half-messages, then my suggestion will only make the code more complex, so we should not implement it.
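A rough way to compare the two approaches is to count the bytes copied under a simplified model. Both functions below are hypothetical and only count bytes; they assume equally sized parts, that the first part is merely assigned in the concat approach (as noted above), and that the growing buffer doubles its capacity when full. Actual performance would still need the benchmark.

```javascript
'use strict';

// Bytes copied when every new part is joined with Buffer.concat():
// each join copies everything accumulated so far plus the new part.
const copiedWithConcat = (parts, partSize) => {
  let accumulated = partSize; // the first part is only assigned
  let copied = 0;
  for (let i = 1; i < parts; i++) {
    accumulated += partSize;
    copied += accumulated;
  }
  return copied;
};

// Bytes copied with a preallocated buffer that doubles when full (assumed
// policy): every part is copied in once, and the accumulated data is
// copied again on each growth.
const copiedWithGrowing = (parts, partSize, capacity) => {
  let accumulated = 0;
  let copied = 0;
  for (let i = 0; i < parts; i++) {
    if (accumulated + partSize > capacity) {
      while (capacity < accumulated + partSize) capacity *= 2;
      copied += accumulated;
    }
    copied += partSize;
    accumulated += partSize;
  }
  return copied;
};

console.log(copiedWithConcat(2, 1024), copiedWithGrowing(2, 1024, 1024));
// 2048 3072
console.log(copiedWithConcat(16, 1024), copiedWithGrowing(16, 1024, 1024));
// 138240 31744
```

Under this model the growing buffer offers no saving for two parts and only pays off once a message is split into many parts.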

@lundibundi, there is nothing to assume, it is a fact that there can't be more than two halves of a whole message (half-messages).

@belochub I think that @lundibundi meant partial-message not half-message. He used the wrong term for it.

@nechaido, okay then. To benchmark this approach we will need an option to set the message size in the benchmark; can you please add this option to the benchmark introduced in #253?

lundibundi · 2017-07-12T12:13:40Z

lib/socket.js

+    }
+  }
+
+  _parseRemains(newPart, newPartLength) {


Maybe name it _parseWith or _parseWithChunk?

What do you think about _parseMessageRemains?

I do not understand the reason for "Remains"; as I see it, this function actually parses the main part, and what remains is the latter part of the chunk after the slice.

"Remains" means message part that remains in the buffer from the previous parsing.

Doesn't that function parse the data that was accumulated (including the next message) until we've received a message with '\0' in it?

Ok, that's up to you anyway. Also maybe _parseRemainsWith.

I don't understand this naming, _parseRemainsWith what? There should be a word after "with".

As I see it, the first argument to the function is what "with" refers to.

Yes, but when you call this function, you can't tell what the names of the arguments were.

Oh, well, I can 😄. Anyway, my thought was that you don't need the names of the arguments; the arguments that you pass are what "with" refers to.
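For readers following the naming discussion, here is a minimal sketch of the role such a function might play. The `Receiver` class is hypothetical and only loosely modelled on the diff fragments quoted above; `_parseRemains` here takes a single argument for brevity, unlike the two-argument signature in the diff, and the real lib/socket.js may differ.

```javascript
'use strict';

// Hypothetical receiver (not the PR's code): messages end with a '\0' byte.
class Receiver {
  constructor(onMessage) {
    this._buffer = null; // unterminated part left over from earlier chunks
    this._onMessage = onMessage;
  }

  onData(chunk) {
    let rest = chunk;
    if (this._buffer) {
      const newPartLength = chunk.indexOf(0);
      if (newPartLength === -1) {
        // Still no terminator: keep accumulating.
        this._buffer = Buffer.concat([this._buffer, chunk]);
        return;
      }
      this._parseRemains(chunk.subarray(0, newPartLength));
      rest = chunk.subarray(newPartLength + 1);
    }
    this._parseChunk(rest);
  }

  // Completes the message whose beginning "remains" in this._buffer.
  _parseRemains(newPart) {
    const message = Buffer.concat([this._buffer, newPart]);
    this._buffer = null;
    this._onMessage(message);
  }

  // Emits every complete message in the chunk and stores the tail.
  _parseChunk(chunk) {
    let start = 0;
    let end = chunk.indexOf(0);
    while (end !== -1) {
      this._onMessage(chunk.subarray(start, end));
      start = end + 1;
      end = chunk.indexOf(0, start);
    }
    if (start < chunk.length) this._buffer = chunk.subarray(start);
  }
}

const received = [];
const receiver = new Receiver((m) => received.push(m.toString()));
receiver.onData(Buffer.from('{a:1}\0{b:'));
receiver.onData(Buffer.from('2}\0'));
console.log(received); // [ '{a:1}', '{b:2}' ]
```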

nechaido

LGTM, but I'd like to compare this to https://github.com/metarhia/jstp/pull/254/files/4f51cef4ba14ea1d4858f46785cb7d6da74d2d17#r127193334 if someone implements it.

nechaido · 2017-07-31T12:28:53Z

@belochub could you rebase this on master, run the benchmark with and without the proposed changes, and publish the results here?

belochub · 2018-08-22T15:13:43Z

Closing this, since the serde implementation was moved from this package into the separate mdsf package.
Some of the optimizations from here should probably be moved into the mdsf package.

* Slightly improve string parsing performance. * Highly improve parsing performance for Buffers by avoiding toString conversion (leading to copying data). Refs: metarhia/jstp#254

Avoid using `strlen()` magic and explicitly check for `'\0'` character. Refs: metarhia/jstp#254

* Slightly improve string parsing performance. * Highly improve parsing performance for Buffers by avoiding toString conversion (leading to copying data). * Replace deprecated Utf8Value constructor usage. Refs: metarhia/jstp#254

* Slightly improve string parsing performance. * Highly improve parsing performance for Buffers by avoiding toString conversion (leading to copying data). Refs: metarhia/jstp#254 PR-URL: #21 Reviewed-By: Denys Otrishko <shishugi@gmail.com> Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>

Avoid using `strlen()` magic and explicitly check for `'\0'` character. Refs: metarhia/jstp#254 PR-URL: #22 Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com> Reviewed-By: Denys Otrishko <shishugi@gmail.com>

belochub added lib optimization parser work-in-progress labels Jul 12, 2017

belochub requested review from aqrln, lundibundi and nechaido July 12, 2017 10:08

belochub force-pushed the parser-optimization branch 3 times, most recently from 65caace to 4f51cef Compare July 12, 2017 11:48

lundibundi reviewed Jul 12, 2017

nechaido reviewed Jul 19, 2017

belochub added 3 commits August 1, 2017 02:40

parser: improve overall parser performance

d46c15e

* Slightly improve string parsing performance. * Highly improve parsing performance for Buffers by avoiding toString conversion (leading to copying data).

lib,parser: improve parseNetworkPackets

75372b3

lib: avoid excessive copying before parsing

07542d3

belochub force-pushed the parser-optimization branch from 4f51cef to 07542d3 Compare August 1, 2017 00:42

belochub closed this Aug 22, 2018

belochub mentioned this pull request Aug 22, 2018

parser: improve overall parser performance metarhia/mdsf#21

Closed

belochub added a commit to metarhia/mdsf that referenced this pull request Aug 22, 2018

parser: refactor parseJSTPMessages

994e4e6

Avoid using `strlen()` magic and explicitly check for `'\0'` character. Refs: metarhia/jstp#254

belochub added a commit to metarhia/mdsf that referenced this pull request Aug 22, 2018

parser: refactor parseJSTPMessages

2089e9f

Avoid using `strlen()` magic and explicitly check for `'\0'` character. Refs: metarhia/jstp#254

belochub mentioned this pull request Aug 22, 2018

parser: refactor native parseJSTPMessages metarhia/mdsf#22

Closed

belochub added a commit to metarhia/mdsf that referenced this pull request Aug 29, 2018

parser: refactor parseJSTPMessages

0a825e2

Avoid using `strlen()` magic and explicitly check for `'\0'` character. Refs: metarhia/jstp#254

belochub added a commit to metarhia/mdsf that referenced this pull request Aug 30, 2018

parser: refactor parseJSTPMessages

6e7eebe

Avoid using `strlen()` magic and explicitly check for `'\0'` character. Refs: metarhia/jstp#254
