
chore: flattenJson: efficient strings, arrays and objects skip. #119

Merged · 9 commits merged into timbray:main from yosiat:skipper · Oct 5, 2022

Conversation

@yosiat (Collaborator) commented Oct 4, 2022

Implement a way to efficiently skip values: we just scan the buffer until we find the end of the string/array/object, without inspecting the inner values.
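
For illustration, here is a minimal standalone sketch of the string case (hypothetical code with made-up names, not the actual diff in this PR): the scanner just hunts for the closing quote, hopping over escape sequences instead of decoding them. The same shape works for objects and arrays, with a nesting counter instead of an escape check.

```go
package main

import (
	"errors"
	"fmt"
)

// skipString scans forward for the closing quote of a JSON string without
// decoding anything. i points at the first byte after the opening '"'.
// It returns the index just past the closing quote.
func skipString(buf []byte, i int) (int, error) {
	for i < len(buf) {
		switch buf[i] {
		case '\\':
			i += 2 // hop over the backslash and the byte it escapes
		case '"':
			return i + 1, nil
		default:
			i++
		}
	}
	return 0, errors.New("unterminated string")
}

func main() {
	buf := []byte(`he said \"skip me\"", "next": 1}`)
	end, err := skipString(buf, 0)
	fmt.Println(string(buf[end:]), err) // => , "next": 1} <nil>
}
```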

```
benchmark                                                old ns/op     new ns/op     delta
Benchmark_JsonFlattener_ContextFields-10                 17312         9053          -47.71%
Benchmark_JsonFlattener_MiddleNestedField-10             17426         10713         -38.52%
Benchmark_JsonFlattener_LastField-10                     17060         9553          -44.00%
Benchmark_JsonFlattner_Evaluate_ContextFields-10         17645         9495          -46.19%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10     17935         11152         -37.82%
Benchmark_JsonFlattner_Evaluate_LastField-10             17475         9882          -43.45%

benchmark                                                old allocs     new allocs     delta
Benchmark_JsonFlattener_ContextFields-10                 23             3              -86.96%
Benchmark_JsonFlattener_MiddleNestedField-10             24             4              -83.33%
Benchmark_JsonFlattener_LastField-10                     22             2              -90.91%
Benchmark_JsonFlattner_Evaluate_ContextFields-10         33             13             -60.61%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10     33             13             -60.61%
Benchmark_JsonFlattner_Evaluate_LastField-10             29             9              -68.97%

benchmark                                                old bytes     new bytes     delta
Benchmark_JsonFlattener_ContextFields-10                 560           48            -91.43%
Benchmark_JsonFlattener_MiddleNestedField-10             576           64            -88.89%
Benchmark_JsonFlattener_LastField-10                     544           32            -94.12%
Benchmark_JsonFlattner_Evaluate_ContextFields-10         992           480           -51.61%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10     1000          488           -51.20%
Benchmark_JsonFlattner_Evaluate_LastField-10             824           312           -62.14%
```

See discussion here: #113 (comment)

@yosiat yosiat marked this pull request as ready for review October 4, 2022 15:00
@@ -0,0 +1,156 @@
package quamina
Collaborator Author

I added the benchmarks I am presenting here; I think it would be good to commit them (if you agree, I'll move status.json to the testdata directory).

What do you think?

Owner

+1 yes please.

Collaborator Author (@yosiat, Oct 4, 2022)

Cool! I'll make a proper commit for that.

Do we want to run the benchmarks in CI? I'm not sure about it, since I don't think GitHub Actions will give us good results, and it will take some time to run the benchmarks.

Owner

See issue #30 - one problem is that GitHub runs actions single-threaded on a pathetic tiny instance. So we probably can't routinely run them all. This bothers me a bit because I want to make sure that nothing gets checked in that accidentally causes a noticeable slowdown.

@timbray (Owner) left a comment

Nice! My only worry is about skipBlock but maybe I'm missing something obvious?

I notice in the coverage report that the coverage went down a bit: https://app.codecov.io/github/timbray/quamina/commit/9b8031b158a40424d90574f38d4993e140c67b48

I'm not a fanatic about coverage percent but when it goes down that always makes me nervous. I looked at the actual coverage red/green and didn't see anything uncovered that looked very worrying, but please have a glance at the coverage and see if there's anything that surprises you.

@@ -475,6 +496,60 @@ func (fj *flattenJSON) readLiteral(literal []byte) ([]byte, error) {
return literal, nil
}

func (fj *flattenJSON) skipBlock(openSymbol byte, closeSymbol byte) error {
Owner

This func makes me nervous. Normally when we're going through the buffer we use fj.step() and fj.ch(); is there a reason why this is different?

Also, nesting: suppose you have an array (arguments [, ]) and one of the elements is an object, possibly deeply nested? And how about an object ({, }) where a member value is an array?

Collaborator Author

> Also, nesting: suppose you have an array (arguments [, ]) and one of the elements is an object, possibly deeply nested? And how about an object ({, }) where a member value is an array?

Nesting is covered here with the levels; you can see that we are skipping most of status.json, which contains nested objects with arrays.

I'll add specific tests for that, to validate the skipping of objects & arrays.
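
For reference, a minimal sketch of the level idea (hypothetical code, not the exact skipBlock in this diff). String handling is elided here for brevity; a real skipper must also step over strings so that brackets inside them don't disturb the count.

```go
package main

import (
	"errors"
	"fmt"
)

// skipBlock skips one object or array: nested open symbols bump a counter,
// and the scan stops when the matching close symbol brings it back to zero.
// It is called with i pointing just past the opening symbol and returns the
// index just past the matching close.
func skipBlock(buf []byte, i int, openSymbol, closeSymbol byte) (int, error) {
	level := 1
	for ; i < len(buf); i++ {
		switch buf[i] {
		case openSymbol:
			level++
		case closeSymbol:
			level--
			if level == 0 {
				return i + 1, nil
			}
		}
	}
	return 0, errors.New("unterminated block")
}

func main() {
	payload := []byte(`[1, [2, [3, 4]], 5], "after": true}`)
	end, err := skipBlock(payload, 1, '[', ']') // skip the nested array value
	fmt.Println(string(payload[end:]), err)     // => , "after": true} <nil>
}
```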

> Normally when we're going through the buffer we use fj.step() and fj.ch(); is there a reason why this is different?

This is actually an interesting case! I initially wrote the code without step and ch, and when I switched to those methods the time taken (not the allocations) increased. I'm talking about something like:

```
for {
	ch := fj.ch()

	// process ch

	if err := fj.step(); err != nil {
		return err
	}
}
```

So I tried to investigate this. It didn't make any sense, since these methods are inlined and cheap, so I did manual inlining and got no improvement.

What I finally found is that iterating over fj.event with fj.eventIndex is slower than:

```
data := fj.event[fj.eventIndex:]
i := 0
// iterate with i until len(data)
```

I am not sure why iterating from the middle is slower than iterating from the start. I was thinking about checking it with Compiler Explorer to see what the difference is. I'll try that, but I am not an assembly expert.
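
To make the two styles easy to compare in isolation, here is a hypothetical micro-benchmark sketch (not the gist linked later in this thread, and not code from this PR): one loop advances an index stored as a field on a deliberately heap-allocated struct, the other rebases into a local slice and indexes from zero. No claim is made about the numbers; run it with go test -bench=. and compare.

```go
package perf

import "testing"

// cursor mimics a flattener that keeps its buffer and position as fields.
type cursor struct {
	event      []byte
	eventIndex int
}

//go:noinline
func newCursor(event []byte) *cursor {
	// noinline so the returned pointer escapes and the cursor lives on the
	// heap, like a long-lived flattener would.
	return &cursor{event: event}
}

var doc = func() []byte {
	b := make([]byte, 0, 1<<15)
	for len(b) < 1<<15 {
		b = append(b, `{"k":"value","n":[1,2,3]},`...)
	}
	return b
}()

var sink int // keeps the compiler from optimizing the loops away

func BenchmarkStructIndex(b *testing.B) {
	for n := 0; n < b.N; n++ {
		c := newCursor(doc)
		quotes := 0
		for c.eventIndex < len(c.event) {
			if c.event[c.eventIndex] == '"' {
				quotes++
			}
			c.eventIndex++
		}
		sink = quotes
	}
}

func BenchmarkLocalSlice(b *testing.B) {
	for n := 0; n < b.N; n++ {
		c := newCursor(doc)
		data := c.event[c.eventIndex:] // rebase: local slice header, local index
		quotes := 0
		for i := 0; i < len(data); i++ {
			if data[i] == '"' {
				quotes++
			}
		}
		sink = quotes
	}
}
```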

Owner

Huh... agree, makes no sense to me. data is now a new []byte against the same backing memory; maybe Go has some magic about indexing into []byte when the index is small? Anyhow, if this is repeatable, maybe we should have a fancier version of step() and ch() that rebases the slice and then all the parts of flattenJSON get the benefit?

Also, unless the improvement is dramatic I'm a little reluctant to have one piece of the big complicated function that arbitrarily has its own process for traversing the buffer.

Collaborator Author

> Anyhow, if this is repeatable, maybe we should have a fancier version of step() and ch() that rebases the slice and then all the parts of flattenJSON get the benefit?

I was thinking about adding an iterator over the event that hands you the current character; it might do what you are saying, but I didn't want to introduce a big change like that in this PR.

> Also, unless the improvement is dramatic I'm a little reluctant to have one piece of the big complicated function that arbitrarily has its own process for traversing the buffer.

The improvement was dramatic. I'll post the benchmark differences tomorrow and let's make a decision based on that.

Owner

> The improvement was dramatic. I'll post the benchmark differences tomorrow and let's make a decision based on that.

Wow, given that ch() and step() are inlined, that is really weird (also they didn’t show up in profiling output). Anyhow, there's a lesson hiding in here somewhere and once we figure it out we should give all of flattener the benefit. I agree, probably another PR for that.

Collaborator Author

Update on benchmarks: I rewrote skipStringValue and skipBlock with step & ch:

```
benchmark                                                old ns/op     new ns/op     delta
Benchmark_JsonFlattener_ContextFields-10                 9062          12211         +34.75%
Benchmark_JsonFlattener_MiddleNestedField-10             10730         13619         +26.92%
Benchmark_JsonFlattener_LastField-10                     9570          12598         +31.64%
Benchmark_JsonFlattner_Evaluate_ContextFields-10         9487          12607         +32.89%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10     11122         14021         +26.07%
Benchmark_JsonFlattner_Evaluate_LastField-10             9873          12893         +30.59%

benchmark                                                old allocs     new allocs     delta
Benchmark_JsonFlattener_ContextFields-10                 3              3              +0.00%
Benchmark_JsonFlattener_MiddleNestedField-10             4              4              +0.00%
Benchmark_JsonFlattener_LastField-10                     2              2              +0.00%
Benchmark_JsonFlattner_Evaluate_ContextFields-10         13             13             +0.00%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10     13             13             +0.00%
Benchmark_JsonFlattner_Evaluate_LastField-10             9              9              +0.00%

benchmark                                                old bytes     new bytes     delta
Benchmark_JsonFlattener_ContextFields-10                 48            48            +0.00%
Benchmark_JsonFlattener_MiddleNestedField-10             64            64            +0.00%
Benchmark_JsonFlattener_LastField-10                     32            32            +0.00%
Benchmark_JsonFlattner_Evaluate_ContextFields-10         480           480           +0.00%
Benchmark_JsonFlattner_Evaluate_MiddleNestedField-10     488           488           +0.00%
Benchmark_JsonFlattner_Evaluate_LastField-10             312           312           +0.00%
```

I tried to write a representative benchmark - https://gist.github.com/yosiat/930426f4baea92b5f188203c6acb716b - and it doesn't make sense to me. I checked with https://godbolt.org/ and it looks like the instructions for step are the same as those for the inline code, in the relevant parts 🤔

for i < len(data) {
	c := data[i]

	if c == '\\' && i+1 < len(data) && data[i+1] == '"' {
Owner

I'd put a short comment here explaining why you can skip all the other \\ escapes, because the only thing you care about is an unescaped "

Collaborator Author

Will add!

@yosiat (Collaborator Author) commented Oct 4, 2022

Thanks for the speedy review!

> I'm not a fanatic about coverage percent but when it goes down that always makes me nervous. I looked at the actual coverage red/green and didn't see anything uncovered that looked very worrying, but please have a glance at the coverage and see if there's anything that surprises you.

This is actually interesting! I checked the coverage, and when I ran only the TestFJ tests the coverage actually increased from ~24% to ~30%.

I'll give it a look; it's important that we have (almost) perfect coverage for JSON parsing.

I'll push commits to address the changes you are raising, and before merging I'll do some rebasing with squashes (so it follows the commit conventions and is consolidated together).

@embano1 (Collaborator) commented Oct 5, 2022

Regarding benchmarks: yeah it's tricky, but we could take a look at benchmark-action/github-action-benchmark and see if it provides value for us?

@yosiat (Collaborator Author) commented Oct 5, 2022

> Regarding benchmarks: yeah it's tricky, but we could take a look at benchmark-action/github-action-benchmark and see if it provides value for us?

This looks really awesome!! It will be interesting to use it here and get alerts on PRs when we have a regression. I'll do it in a PR immediately after this one is merged, and let's see what we think about the results.

By the way, I think it would be interesting to have the CityLots benchmark as a Go benchmark. I tried doing so, but the results included reading the city-lots file for some reason; I'll take another look.

@codecov-commenter commented

Codecov Report

Base: 94.48% // Head: 94.08% // Decreases project coverage by -0.40% ⚠️

Coverage data is based on head (f6b7e9c) compared to base (4b37ecc).
Patch coverage: 82.81% of modified lines in pull request are covered.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main     #119      +/-   ##
==========================================
- Coverage   94.48%   94.08%   -0.41%     
==========================================
  Files          16       16              
  Lines        1849     1909      +60     
==========================================
+ Hits         1747     1796      +49     
- Misses         71       78       +7     
- Partials       31       35       +4     
```

Impacted file: flatten_json.go, coverage 94.09% <82.81%> (-1.36%) ⬇️

☔ View full report at Codecov.

@yosiat (Collaborator Author) commented Oct 5, 2022

@timbray the PR is ready for another round of review.

Regarding code coverage, the comment says there is a decrease, but when I open https://app.codecov.io/gh/timbray/quamina/pull/119 there is an increase of 0.17% and for flatten_json.go the increase is 0.42%.

About the commits: once you approve, I'll merge, but before merging I'll squash the commits (manually, with an interactive rebase) and do a force-push.

@yosiat yosiat requested a review from timbray October 5, 2022 10:58
@timbray (Owner) commented Oct 5, 2022

When I look at the codecov report, it says the coverage is up: https://app.codecov.io/github/timbray/quamina/commit/76952717c23cb3f1cc04d1f9f9a61dd23876d876

OK, will review today.

@@ -475,6 +489,65 @@ func (fj *flattenJSON) readLiteral(literal []byte) ([]byte, error) {
return literal, nil
}

func (fj *flattenJSON) skipBlock(openSymbol byte, closeSymbol byte) error {
Owner

So skipBlock sort of works, but I think its behavior is a little troubling. Try this unit test:

func TestSkipBlock(t *testing.T) {
	blocks := []string{
		`[ { foo ] ]`,
		`{ "foo": }, "bar": 1 }"`,
	}

	for _, block := range blocks {
		f := newJSONFlattener()
		fj := f.(*flattenJSON)
		fj.event = []byte(block)
		fj.eventIndex = 0
		var err error
		if block[0] == '[' {
			err = fj.skipBlock('[', ']')
		} else if block[0] == '{' {
			err = fj.skipBlock('{', '}')
		}
		if err == nil {
			fmt.Printf("for %s block=%s\n", block, string(fj.event[fj.eventIndex:]))
		} else {
			fmt.Printf("for %s err=%s\n", block, err.Error())
		}
	}
}

Is this a problem?

Collaborator Author

skipBlock comes with a trade-off: we won't validate the JSON during parsing. So if we get a block (which we are not interested in for matching) that contains invalid data, we won't return an error.

To me, it makes sense; it would only be problematic if it were a block that we are interested in and it's not valid.

Collaborator Author

Another perspective is the early skipping that we want to implement next.
Consider an event like { "context": ..., "payload": ... } where we have patterns only on context fields; in that case, once we fetch them all, we won't read payload at all.

In that case, payload can be invalid JSON, for example: { "context": { "id": 1 }, "payload": { "a": }.

Owner

OK, I'll be honest, the notion that we would fail to detect broken JSON was sort of shocking to me, and remember that, because of Hyrum's Law, this becomes part of our contract with the world. So:

We need a change to README.md describing this behavior. I just checked and the discussion of error-checking in MatchesForJSONEvent() is already weak, so I'll give myself an action item to take care of that.

I'll approve the PR but I'd like to request that you do a little investigation and figure out why ch() and step() seem to be slowing things down, i.e. why the code in skipBlock looks different from the rest of flatten_json.go. OK? If you can't I will, but at this moment you have all this swapped into your head.

Collaborator Author

I did an investigation. It would be nice if you could enable the "Discussions" feature so we can talk about it over there.

I ran some benchmarks while playing around, and published the results and the code here: https://github.com/yosiat/quotes-count

The reason that step & ch are slower is essentially that they operate on heap-allocated state, while the other example keeps its state on the stack. For obvious reasons we can't just move flattenJSON onto the stack.

The interesting things I found are:

  • _QuotesCount_Heap_For_EventIndex does for eventIndex < len(event) and increments eventIndex; on M1 it shows a performance benefit, while on Intel it doesn't.
  • _QuotesCount_Heap_Next: instead of a for loop without a condition, I added a next method which just checks the eventIndex; on M1 it's faster, but on Intel it's slower, so we can't use this trick.

I checked how other JSON parsers manage their state, and found that it splits:

  • Most of them do the same as flattenJSON: they reference their state through heap pointers.
  • Others have each method accept the rest of the payload to process and return an offset, so, for example, reading an array would be offset, value, err := fj.readArray(fj.event[i:]) and then, upon success, i += offset; that way both the index and the event stay on the stack (see the sketch below).

I can do a POC of the second option to see how much it improves performance, but it will have worse DX and be more error-prone, so I am not sure the trade-off (performance vs DX) will be worth it.
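
For concreteness, here is a rough sketch of what that second, offset-returning style could look like (hypothetical names and signature, string handling omitted for brevity; not quamina's actual API):

```go
package main

import (
	"errors"
	"fmt"
)

// readArray takes the remaining bytes, skips one array, and reports how far
// it advanced; the caller keeps the running index as a plain local variable.
func readArray(rest []byte) (int, error) {
	if len(rest) == 0 || rest[0] != '[' {
		return 0, errors.New("not an array")
	}
	level := 0
	for i, c := range rest {
		switch c {
		case '[':
			level++
		case ']':
			level--
			if level == 0 {
				return i + 1, nil
			}
		}
	}
	return 0, errors.New("unterminated array")
}

func main() {
	event := []byte(`{"a": [[1, 2], 3], "b": 4}`)
	i := 6 // index of the '[' that starts the value of "a"
	offset, err := readArray(event[i:])
	if err == nil {
		i += offset // both i and the slice header stay in this stack frame
	}
	fmt.Println(i, string(event[i:])) // => 17 , "b": 4}
}
```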

What do you think?

Owner

OK, Discussions are now enabled and I launched one. I have some questions about your most recent comment - want to replicate this over in the discussion, and I'll follow up there?

Collaborator Author

Cool! Replicated to #122, with some context added for new folks.

@timbray timbray merged commit 08c5f54 into timbray:main Oct 5, 2022
@yosiat yosiat deleted the skipper branch October 6, 2022 21:25
@yosiat yosiat mentioned this pull request Oct 11, 2022
yosiat added commits to yosiat/quamina referencing this pull request (Oct 12-30, 2022)