chore: flattenJson: efficient strings, arrays and objects skip. #119
Conversation
flattener_bench_test.go
Outdated
@@ -0,0 +1,156 @@
package quamina
I added the benchmarks I am presenting; I think it will be good to commit those. (If you agree, I'll move status.json to the testdata directory.) What do you think?
+1 yes please.
Cool! I will make a proper commit for that.
Do we want to run the benchmarks in CI? I am not sure about it, since I don't think GitHub Actions will give us good results, and it will take some time to run the benchmarks.
See issue #30 - one problem is that GitHub runs actions single-threaded on a pathetic tiny instance. So we probably can't routinely run them all. This bothers me a bit because I want to make sure that nothing gets checked in that accidentally causes a noticeable slowdown.
Implement a way to efficiently skip values: we just scan the buffer until we find the end of the string/array/object, and don't check inner values.
```
benchmark                                              old ns/op   new ns/op   delta
Benchmark_JsonFlattener_ContextFields-10               17312       9053        -47.71%
Benchmark_JsonFlattener_MiddleNestedField-10           17426       10713       -38.52%
Benchmark_JsonFlattener_LastField-10                   17060       9553        -44.00%
Benchmark_JsonFlattner_Evaluate_ContextFields-10       17645       9495        -46.19%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10   17935       11152       -37.82%
Benchmark_JsonFlattner_Evaluate_LastField-10           17475       9882        -43.45%

benchmark                                              old allocs  new allocs  delta
Benchmark_JsonFlattener_ContextFields-10               23          3           -86.96%
Benchmark_JsonFlattener_MiddleNestedField-10           24          4           -83.33%
Benchmark_JsonFlattener_LastField-10                   22          2           -90.91%
Benchmark_JsonFlattner_Evaluate_ContextFields-10       33          13          -60.61%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10   33          13          -60.61%
Benchmark_JsonFlattner_Evaluate_LastField-10           29          9           -68.97%

benchmark                                              old bytes   new bytes   delta
Benchmark_JsonFlattener_ContextFields-10               560         48          -91.43%
Benchmark_JsonFlattener_MiddleNestedField-10           576         64          -88.89%
Benchmark_JsonFlattener_LastField-10                   544         32          -94.12%
Benchmark_JsonFlattner_Evaluate_ContextFields-10       992         480         -51.61%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10   1000        488         -51.20%
Benchmark_JsonFlattner_Evaluate_LastField-10           824         312         -62.14%
```
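The skipping idea can be sketched as a standalone function (this is an illustrative sketch, not quamina's actual skipBlock; the name and signature here are assumptions): scan forward counting nesting depth, while tracking whether we are inside a string so that brackets in string values are ignored.

```go
package main

import "fmt"

// skipBlock returns the index just past the matching closeSymbol, or -1 if
// the block is unbalanced. Inner values are never parsed or validated; we
// only track depth and string boundaries.
func skipBlock(data []byte, openSymbol, closeSymbol byte) int {
	depth := 0
	inString := false
	for i := 0; i < len(data); i++ {
		c := data[i]
		if inString {
			switch c {
			case '\\':
				i++ // skip the escaped byte; it can't end the string
			case '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case openSymbol:
			depth++
		case closeSymbol:
			depth--
			if depth == 0 {
				return i + 1
			}
		}
	}
	return -1
}

func main() {
	// the "}" inside the string does not terminate the outer object
	event := []byte(`{"a": {"b": ["}", 2]}, "c": 3}`)
	fmt.Println(skipBlock(event, '{', '}')) // → 30 (length of the block)
}
```

Note that with openSymbol `{` a stray `]` is simply ignored; in valid JSON braces and brackets balance independently, which is why the depth counter suffices for nesting.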
Nice! My only worry is about skipBlock, but maybe I'm missing something obvious?
I notice in the coverage report that the coverage went down a bit: https://app.codecov.io/github/timbray/quamina/commit/9b8031b158a40424d90574f38d4993e140c67b48
I'm not a fanatic about coverage percent but when it goes down that always makes me nervous. I looked at the actual coverage red/green and didn't see anything uncovered that looked very worrying, but please have a glance at the coverage and see if there's anything that surprises you.
@@ -475,6 +496,60 @@ func (fj *flattenJSON) readLiteral(literal []byte) ([]byte, error) {
	return literal, nil
}

func (fj *flattenJSON) skipBlock(openSymbol byte, closeSymbol byte) error {
This func makes me nervous. Normally when we're going through the buffer we use fj.step() and fj.ch(); is there a reason why this is different?
Also, nesting: suppose you have an array (arguments [, ]) and one of the elements is an object, possibly deeply nested? Also, how about an object ({, }) where a member value is an array?
Also, nesting; Suppose you have an array (arguments [, ]) and one of the elements is an object, possibly deeply nested? Also, how about an object ({, }) and a member value is an array.
Nesting is covered here with the levels; you can see that we are skipping most of status.json, which contains nested objects with arrays.
I'll add specific tests for that, to validate skipping of objects and arrays.
Normally when we're going through the buffer we use fj.step() and fj.ch(), is there a reason why this is different?
This is actually an interesting case! I wrote the code initially without step and ch, and when I switched to those methods the performance (time taken, not allocations) got worse. I mean something like:
for {
    ch := fj.ch()
    // process
    if err := fj.step(); err != nil {
        return err
    }
}
So I tried to investigate this; it didn't make any sense, since these methods are inlined and cheap, so I did manual inlining and got no improvements.
Finally, what I found is that iterating over fj.event with fj.eventIndex is slower than:
data = fj.event[fj.eventIndex:]
i = 0
// iterate from i until len(data)
I am not sure why iterating from the middle is slower than iterating from the start. I was thinking about checking it with Compiler Explorer to see what the difference is. I'll try that, but I am not an assembly expert.
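The two loop shapes being compared can be sketched as a minimal runnable example (hypothetical names; this is the quotes-counting shape from the discussion, not the flattener itself). Both walk the same buffer, one through struct fields like flattenJSON does, the other against a slice rebased once at the start:

```go
package main

import "fmt"

// walker mimics flattenJSON's state: the event buffer plus a cursor field.
type walker struct {
	event      []byte
	eventIndex int
}

// countViaFields iterates through the struct fields, the way the rest of
// the flattener traverses the buffer.
func (w *walker) countViaFields() int {
	n := 0
	for w.eventIndex < len(w.event) {
		if w.event[w.eventIndex] == '"' {
			n++
		}
		w.eventIndex++
	}
	return n
}

// countViaRebasedSlice rebases the slice once, then indexes a local slice
// from zero; this is the form reported as faster in the thread.
func (w *walker) countViaRebasedSlice() int {
	data := w.event[w.eventIndex:] // new slice header, same backing array
	n := 0
	for i := 0; i < len(data); i++ {
		if data[i] == '"' {
			n++
		}
	}
	w.eventIndex = len(w.event)
	return n
}

func main() {
	ev := []byte(`{"a": "b", "c": "d"}`)
	fmt.Println((&walker{event: ev}).countViaFields())       // → 8
	fmt.Println((&walker{event: ev}).countViaRebasedSlice()) // → 8
}
```

Both produce the same result; only the addressing pattern differs, which is the part that showed up in the benchmarks.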
Huh... agree, makes no sense to me. data is now a new []byte against the same backing memory; maybe Go has some magic about indexing into []byte when the index is small? Anyhow, if this is repeatable, maybe we should have a fancier version of step() and ch() that rebases the slice, and then all the parts of flattenJSON get the benefit?
Also, unless the improvement is dramatic, I'm a little reluctant to have one piece of the big complicated function that arbitrarily has its own process for traversing the buffer.
Anyhow, if this is repeatable, maybe we should have a fancier version of step() and ch() that rebases the slice and then all the parts of flattenJSON get the benefit?
I was thinking about offering an iterator over the event that hands you the current character, and it might do what you are saying, but I didn't want to introduce a big change like that in this PR.
Also, unless the improvement is dramatic I'm a little reluctant to have one piece of the big complicated function that arbitrarily has its own process for traversing the buffer.
The improvement was dramatic. I'll post the benchmark differences tomorrow and let's make a decision based on that.
The improvement was dramatic, I'll post benchmarks differences tomorrow and lets get a decision based on that.
Wow, given that ch() and step() are inlined, that is really weird (also, they didn't show up in profiling output). Anyhow, there's a lesson hiding in here somewhere, and once we figure it out we should give all of the flattener the benefit. I agree, probably another PR for that.
Update on benchmarks: I rewrote skipStringValue and skipBlock with step & ch:
benchmark old ns/op new ns/op delta
Benchmark_JsonFlattener_ContextFields-10 9062 12211 +34.75%
Benchmark_JsonFlattener_MiddleNestedField-10 10730 13619 +26.92%
Benchmark_JsonFlattener_LastField-10 9570 12598 +31.64%
Benchmark_JsonFlattner_Evaluate_ContextFields-10 9487 12607 +32.89%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10 11122 14021 +26.07%
Benchmark_JsonFlattner_Evaluate_LastField-10 9873 12893 +30.59%
benchmark old allocs new allocs delta
Benchmark_JsonFlattener_ContextFields-10 3 3 +0.00%
Benchmark_JsonFlattener_MiddleNestedField-10 4 4 +0.00%
Benchmark_JsonFlattener_LastField-10 2 2 +0.00%
Benchmark_JsonFlattner_Evaluate_ContextFields-10 13 13 +0.00%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10 13 13 +0.00%
Benchmark_JsonFlattner_Evaluate_LastField-10 9 9 +0.00%
benchmark old bytes new bytes delta
Benchmark_JsonFlattener_ContextFields-10 48 48 +0.00%
Benchmark_JsonFlattener_MiddleNestedField-10 64 64 +0.00%
Benchmark_JsonFlattener_LastField-10 32 32 +0.00%
Benchmark_JsonFlattner_Evaluate_ContextFields-10 480 480 +0.00%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10 488 488 +0.00%
Benchmark_JsonFlattner_Evaluate_LastField-10 312 312 +0.00%
I tried to write a representative benchmark (https://gist.github.com/yosiat/930426f4baea92b5f188203c6acb716b) and it doesn't make sense to me. I checked with https://godbolt.org/ and it looks like the step instructions are the same as the for code in its relevant parts 🤔
for i < len(data) {
	c := data[i]

	if c == '\\' && i+1 < len(data) && data[i+1] == '"' {
I'd put a short comment here explaining why you can skip all the other \\ escapes, because the only thing you care about is an unescaped ".
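The reasoning behind that comment can be shown in a small sketch (illustrative, not quamina's code): while skipping a string value we never need to interpret \n, \t, \uXXXX and friends; the only job is to avoid mistaking an escaped quote (\") for the terminator, so any byte following a backslash can be skipped blindly.

```go
package main

import "fmt"

// skipString returns the index just past the unescaped closing quote,
// or -1 if the string is unterminated. It assumes the opening quote has
// already been consumed.
func skipString(data []byte) int {
	for i := 0; i < len(data); i++ {
		switch data[i] {
		case '\\':
			i++ // the escaped byte, whatever it is, cannot end the string
		case '"':
			return i + 1 // unescaped closing quote
		}
	}
	return -1
}

func main() {
	// the \" in the middle is escaped and must not terminate the string
	fmt.Println(skipString([]byte(`abc\"def" tail`))) // → 9
}
```

Skipping the byte after every backslash (rather than only checking for `\"`) also handles sequences like `\\` correctly, since the second backslash is consumed and cannot start a new escape.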
Will add!
Thanks for the speedy review!
This is actually interesting! I checked the coverage, and it actually increased when I ran only. I'll give it a look; it's important that we have (almost) perfect coverage for parsing JSON. I'll push commits to address the changes you are raising, and before merging I'll do some rebases with squashes (so it will follow the commit conventions and be concentrated together).
Regarding benchmarks: yeah, it's tricky, but we could take a look at benchmark-action/github-action-benchmark and see if it provides value for us?
This looks really awesome!! It will be interesting to use it here and get alerts on PRs when we have a regression. I'll do it in a PR immediately after this PR is merged, and let's see what we think about the results? By the way, I think it would be interesting to have the CityLots benchmark as a Go benchmark; I tried doing so, but the results included reading the city lots file for some reason. I'll give it another look.
Codecov Report
Base: 94.48% // Head: 94.08% // Decreases project coverage by 0.41%.
Additional details and impacted files:
```
@@            Coverage Diff             @@
##             main     #119      +/-   ##
==========================================
- Coverage   94.48%   94.08%   -0.41%
==========================================
  Files          16       16
  Lines        1849     1909      +60
==========================================
+ Hits         1747     1796      +49
- Misses         71       78       +7
- Partials       31       35       +4
```
View full report at Codecov.
@timbray the PR is ready for another round of review. Regarding code coverage, the comment says there is a decrease, but when I open https://app.codecov.io/gh/timbray/quamina/pull/119 there is an increase. About the commits: once you approve I'll merge, but before merging I'll squash the commits (manually, with an interactive rebase) and do a force-push.
When I look at the codecov report, it says the coverage is up: https://app.codecov.io/github/timbray/quamina/commit/76952717c23cb3f1cc04d1f9f9a61dd23876d876
OK, will review today.
@@ -475,6 +489,65 @@ func (fj *flattenJSON) readLiteral(literal []byte) ([]byte, error) {
	return literal, nil
}

func (fj *flattenJSON) skipBlock(openSymbol byte, closeSymbol byte) error {
So skipBlock sort of works, but I think its behavior is a little troubling. Try this unit test:
```
func TestSkipBlock(t *testing.T) {
	blocks := []string{
		`[ { foo ] ]`,
		`{ "foo": }, "bar": 1 }"`,
	}
	for _, block := range blocks {
		f := newJSONFlattener()
		fj := f.(*flattenJSON)
		fj.event = []byte(block)
		fj.eventIndex = 0
		var err error
		if block[0] == '[' {
			err = fj.skipBlock('[', ']')
		} else if block[0] == '{' {
			err = fj.skipBlock('{', '}')
		}
		if err == nil {
			fmt.Printf("for %s block=%s\n", block, string(fj.event[fj.eventIndex:]))
		} else {
			fmt.Printf("for %s err=%s\n", block, err.Error())
		}
	}
}
```
Is this a problem?
skipBlock comes with a trade-off: we won't validate the JSON during parsing. So if we get a block (which we are not interested in for matching) that contains invalid data, we won't return an error.
To me, it makes sense; it would only be problematic if it were a block that we are interested in and it's not valid.
Another perspective is about the early skipping that we want to implement next.
Consider an event which is { "context": ..., "payload": ... } and we have patterns only on context fields; in that case, once we fetch them all, we won't read payload.
Now in that case, payload can be invalid JSON, for example: { "context": { "id": 1 }, "payload": { "a": }.
OK, I'll be honest, the notion that we would fail to detect broken JSON was sort of shocking to me, and remember that, because of Hyrum's Law, this becomes part of our contract with the world. So:
We need a change to README.md describing this behavior. I just checked, and the discussion of error-checking in MatchesForJSONEvent() is already weak, so I'll give myself an action item to take care of that.
I'll approve the PR, but I'd like to request that you do a little investigation and figure out why ch() and step() seem to be slowing things down, i.e. why the code in skipBlock looks different from the rest of flatten_json.go. OK? If you can't, I will, but at this moment you have all this swapped into your head.
I did an investigation. It would be nice if you could enable the "Discussions" feature so we can talk about it over there.
I ran some benchmarks, playing around, and published the results and the code here: https://github.com/yosiat/quotes-count
The reason that step & ch are slower is essentially that they operate on state on the heap, while the other example keeps its state on the stack. For obvious reasons we can't move flattenJson to the stack.
The interesting stuff I found is:
- _QuotesCount_Heap_For_EventIndex is doing for eventIndex < len(event) and incrementing eventIndex; on M1 it shows a performance benefit, while on Intel it doesn't.
- _QuotesCount_Heap_Next: instead of doing a for without conditions, I added a next method which just checks the eventIndex; on M1 it's faster, but on Intel it's slower, so we can't use this trick.
I checked other JSON parsers to see how they manage their state, and found that they split into two approaches:
- Most of them do the same as flattenJson: referencing their state with heap pointers.
- Others have each method accept the rest of the payload to process and return an offset; so, for example, reading an array would be offset, value, err := fj.readArray(fj.event[i:]) and then upon success do i += offset, so both the index and the event live on the stack.
I can do a POC of the second option to see how much it improves performance, but it will have worse DX and be more error-prone, so I am not sure the cost (performance vs DX) will be worth it.
What do you think?
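The second approach described in the list above can be sketched like this (a minimal sketch with a hypothetical readDigits reader, not quamina's API): each reader takes the remaining payload and returns how many bytes it consumed, so both the slice and the index stay on the caller's stack instead of behind a heap pointer.

```go
package main

import (
	"errors"
	"fmt"
)

// readDigits consumes a run of ASCII digits from the front of rest and
// reports the offset consumed, standing in for readers like readArray.
func readDigits(rest []byte) (offset int, value []byte, err error) {
	i := 0
	for i < len(rest) && rest[i] >= '0' && rest[i] <= '9' {
		i++
	}
	if i == 0 {
		return 0, nil, errors.New("not a number")
	}
	return i, rest[:i], nil
}

func main() {
	event := []byte(`42,17`)
	i := 0
	off, val, _ := readDigits(event[i:]) // caller rebases, reader indexes from 0
	i += off                             // advance the stack-resident index
	fmt.Println(string(val), i)          // → 42 2
}
```

The cost mentioned above is visible even in this sketch: every call site has to remember to rebase and to add the offset back, which is the DX and error-proneness concern.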
OK, Discussions now enabled and I launched one. I have some questions about your most recent comment; want to replicate this over in the discussion, and I'll follow up there?
Cool! Replicated to #122, with some context added for new folks.
As suggested here: timbray#119 (comment)
Implement a way to efficiently skip values: we just scan the buffer until we find the end of the string/array/object, and don't check inner values.
See discussion here: #113 (comment)