-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize encoding of dictionaries #752
Optimize encoding of dictionaries #752
Conversation
Switch from CBOR map to CBOR array to improve speed and preserve ordering. Remove key strings from encoding. Remove old backwards-compatibility decoding code. Add tests, including round-trip for decoding old format and encoding new format. Closes onflow#743
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 👏
runtime/interpreter/encode.go
Outdated
@@ -755,13 +757,13 @@ func (e *Encoder) prepareDictionaryValue( | |||
if err != nil { | |||
return nil, err | |||
} | |||
entries[key] = prepared | |||
entries = append(entries, prepared) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directly setting the value to the index might be slightly better in terms of performance (I haven't tested though, just a hunch)
entries = append(entries, prepared) | |
entries[index] = prepared | |
index++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VERY nice catch! Your hunch is correct! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In isolation (small example program), this change provides a bigger improvement.
For encoding, the new way required more coding changes. Encoding speed improvement is around 0.5-0.7% vs append
for large data. Thanks again for the feedback, I'll incorporate this change!
New benchmark comparisons won't show this 0.5-0.7% speedup (compared to yesterday's benchmark) because compiling after adding error handling took away this gain.
runtime/interpreter/encode.go
Outdated
@@ -683,7 +684,8 @@ func (e *Encoder) prepareDictionaryValue( | |||
return nil, err | |||
} | |||
|
|||
entries := make(map[string]interface{}, v.Entries.Len()) | |||
// Use CBOR array for entry value to preserve ordering and improve speed. | |||
entries := make([]interface{}, 0, v.Entries.Len()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use cborArray
here and in other places?
entries := make([]interface{}, 0, v.Entries.Len()) | |
entries := make(cborArray, 0, v.Entries.Len()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Nice catch! For this PR, I should do this in other places if it's in functions for dictionaries. For non-dictionaries like composite types, I'd prefer to do it in the PR for composite types but I'm OK with either approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, that sounds good!
I found some code that required more error handling (needed to check array length). I fixed it and will be more careful before opening my next PR (I had gotten too comfortable using coverage+fuzzing tools to alert me because it's feasible for serialization libraries, probably not for interpreters). |
Add error handling to check decoded dictionary array length. Incorporate feedback to replace append with direct assignment for an additional speed gain of 0.5-0.7% for encoding large data.
New benchmarks after incorporating feedback and adding error-handling are similar to the original benchmark comparisons posted with the PR. Encoding vs master:
Decoding vs master:
Benchmark comparisons were done on linux_amd64 with Go 1.15.10. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, nice work @fxamacker! 👏
No worries about opening earlier, feel free to keep doing that, the review and adding missing things is what the PRs are here for 👍
@@ -31,6 +31,8 @@ import ( | |||
"github.com/onflow/cadence/runtime/common" | |||
) | |||
|
|||
type cborArray = []interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
} | ||
} | ||
|
||
return cbor.Tag{ | ||
Number: cborTagDictionaryValue, | ||
Content: cborMap{ | ||
Content: cborArray{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, nice that the slice initialization can stay the same, I didn't know this was possible 👍
if version <= 3 { | ||
decoded, err = DecodeValueV3(encoded, &testOwner, nil, version, nil) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea to still support the old one 👍
Closes #743
Description
Optimize encoding of dictionaries to:
Changes
Encoding comparisons:
Decoding comparisons:
Benchmark comparisons were done on linux_amd64 with Go 1.15.10.
Special thanks to @turbolent and @SupunS for their helpful feedback, explanations, and suggestions during our call!
For contributor use:
feature/storage-optimizations
branchFiles changed
in the Github PR explorer