Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoting of decoder for better maintainability #76

Merged
merged 40 commits into from
Apr 26, 2019

Conversation

thephoenixofthevoid
Copy link
Contributor

Should fix #64. The idea is to allow do whatever user want if user is ready to do things manually.

@thephoenixofthevoid
Copy link
Contributor Author

thephoenixofthevoid commented Apr 22, 2019

I run into the same issue when using msgpack5 to get object duplex connections over binary one when implementing https://github.com/thephoenixofthevoid/remote-leveldown (it's multileveldown written from scratch using somewhat different approach to reduce complexity to zero or so), managed to pass almost all abstract-leveldown testes except for being demotivated by the only test case (https://github.com/Level/abstract-leveldown/blob/c0abe28ed5cfa1c45d5fc25585ac9c60991d48ec/test/put-get-del-test.js#L136)

@mcollina
Copy link
Owner

Can you also update the README?
It’s probably better to have to two functions, one to encode undefined, and one to encode NaN.

Would they have to be functions? could they be static objects?

@thephoenixofthevoid
Copy link
Contributor Author

thephoenixofthevoid commented Apr 23, 2019

Yes, having an object is reasonable but here is one thing to consider: actually we have to provide a reliable way to detect these encoded entities. Here is what I see reasonable:

interface FallbackCodec {
  encode: (data:any) => Buffer|string;
  decode: (data:Buffer|string) => any;
  type: number;
  isBuffer: boolean; // and so on
}

Here type is an positive integer from 1 to 127, a extension slot that will be taken to implement this fallback behavior.

So, if we see unsupported datatype on our way, we have to delegate encoding to encode function, check out how many bytes result takes and then prepend length and extension slot number type .

On decoding, it will undergo the same logic as if I would register:

registerDecoder(fallbackCodec.type, fallbackCodec.decode)

So we might rather implement it using this kind of API, not littering in options.
I will update README too after all the rest.

@thephoenixofthevoid
Copy link
Contributor Author

thephoenixofthevoid commented Apr 23, 2019

Are there only 2 cases of undefined serialization? Strictly speaking functions can be considered too. If they are implemented in somewhat simple way and do not rely on context we can send them over connection, by getting source code and passing to Function constructor on the opposite side.

It's worth noting that this MUST not be default behavior, because of security things and developer must be aware of what is going on to make use of it. Passing functions to this fallback encoder seems like the very balanced approach.

@mcollina
Copy link
Owner

So we might rather implement it using this kind of API, not littering in options.

Maybe we should just document how to do this? It comes up often as a question.

@thephoenixofthevoid
Copy link
Contributor Author

Module throws instantly when undefined or NaN is encountered. If we simply delete these checks, bugs will be reintroduced, which were the very reason these checks exist. By the way, it seems it's the high time to split the-god-function into more manageable small ones.

Changes to be committed:
  modified:   lib/decoder.js
Well, provided that initialOffset is the value passed as an argument:
console.log(headerLength + totalBytesConsumed === offset - initialOffset)
// true   --- always
1. getSize now is implemented through hash not switch statement.
   More concise and appear to be much faster.

2. Created helpers.js file, and functions that logically unrelated
   to decoding/encoding will go there.

3. Where possible prefer flat code that nested branches

4. In decodeExt dramatically simplify all the things.

Changes to be committed:
	modified:   lib/decoder.js
	new file:   lib/helpers.js
@thephoenixofthevoid
Copy link
Contributor Author

Please take a look at what kind of changes I am applying to the codebase.
If you are ok with them, I will keep on that.

It's caused by not incrementing offset after reaading first byte.
To handle this change I first of all introduce "initialOffset" and
change everywhere offset to "initial offset". This commit does
this part of the job.

Then I will clean 1-by-1 every affected function and branch.
1. Remove "else" statements that go after "return"
2. Where appropriate (another 1-4 bytes have been read) offset is incremented
3. So that expressions like "initialOffset + 2" --- "... + 5" could be replaced by just "offset"

Changes to be committed:
  modified:   lib/decoder.js
1. function decodeSigned (buf, offset, size)
2. function decodeFloat (buf, offset, size)
3. function decodeExt (buf, offset, type, size, headerSize)
Currently:

| RANGE      |  STATUS             | function name (if exists) or type name  |
|------------|---------------------|-----------------|
| 0x00, 0x7f |  MISSING COMPLETELY | positive fixint
| 0x80, 0x8f |  MISSING COMPLETELY | fixmap
| 0x90, 0x9f |  MISSING COMPLETELY | fixarr
| 0xa0, 0xbf |  MISSING COMPLETELY | fixstr
| 0xc0, 0xc3 |  DONE               | decodeConstants(...)
| 0xc4, 0xc6 |  OLD CODE           | decodeBuffers(...)
| 0xc7, 0xc9 |  WIP NOW            | decodeExt(...)
| 0xca, 0xcb |  DONE               | decodeFloat
| 0xcc, 0xcf |  DONE               | decodeUnsignedInt()
| 0xd0, 0xd3 |  DONE               | decodeSigned()
| 0xd4, 0xd8 |  DONE               | decodeFixExt()
| 0xd9, 0xdb |  OLD CODE           | decodeStr(...)
| 0xdc, 0xdd |  OLD CODE           | decodeArray(...)
| 0xde, 0xdf |  OLD CODE           | decodeMap(...)
| 0xe0, 0xff |  MISSING COMPLETELY | negative fixint
@mcollina
Copy link
Owner

Can you remove the yarn.lock file and add it to gitignore?

Copy link
Contributor Author

@thephoenixofthevoid thephoenixofthevoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like I have overlooked fix-things. They are there.

@thephoenixofthevoid
Copy link
Contributor Author

It seems like decodingTypes is better to implement as a Map, (or object). Because there is a key existing --- no need to iterate over arrays or whatever. It might break tests, so I will do it later.

@thephoenixofthevoid
Copy link
Contributor Author

Can you remove the yarn.lock file and add it to gitignore?

What do you think about the changes if you managed to take a look at it?

Copy link
Owner

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you avoid committing the dist  files? I'll update them before release.

.gitignore Outdated Show resolved Hide resolved
lib/decoder.js Outdated Show resolved Hide resolved
lib/decoder.js Outdated Show resolved Hide resolved
@mcollina
Copy link
Owner

Good work! you are almost done, a couple of nits.

@thephoenixofthevoid
Copy link
Contributor Author

I should have probably stated more clearly that I am still working on it and plan to finish with encoding side today or tomorrow. I was just asking if the code transformation applied is ok, because the changes are rather deep. I will highlight once everything is done, what was changed the most. Just to be sure that the change has been noticed.

This was referenced Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error "undefined is not encodable in msgpack"
3 participants