
Improve performance of mapstructure in HTTP API #6147

Closed
pearkes opened this issue Jul 16, 2019 · 3 comments · Fixed by #6680
Labels: theme/performance (Performance benchmarking or potential improvement), type/enhancement (Proposed improvement or new feature)
@pearkes (Contributor) commented Jul 16, 2019

We should either cache the mapstructure conversion results in the API (see below), or do the conversion once before any field processing happens and reuse those results everywhere. The decodeBody function is used by more than just the TXN API, but the cost is most noticeable there given the types of requests typically made.

Quoting @mkeeler:

Every time DecodeHookExec is called it first runs typedDecodeHook on the ComposeDecodeHookFunc. Then, after conversion etc., it executes the composed function, which recursively invokes DecodeHookExec on each of those two functions. That in turn does the same thing, calling typedDecodeHook again. So for every field we call typedDecodeHook 3 times. As you can see in the flamegraph, calling that function accounts for most of the time spent in DecodeHookExec, and overall it accounts for a very large portion of the total CPU time.

So what does typedDecodeHook do? Very little, in fact. In essence it just checks whether the passed-in value is convertible to one of two interfaces, converts it, and returns the converted value. This "conversion" is just checking that a provided function pointer has the right signature and "converting" it to the named function type.

The answer here is that the reflection-based checking of whether the fn pointers are convertible, and then converting them, is not cheap, and we are doing it far more often than we need to. For every field in every request being decoded by the HTTP API we rerun the exact same conversion routines. They always produce the same result, yet we keep rerunning them. In some cases a Txn request could contain many operations to be performed, and each of those operations has at least 4 fields internally. So we will do the 3 conversions 5X + 1 times, where X is the number of operations in the txn (so we are calling typedDecodeHook at least 16 times for every Txn op). While the HTTP API enforces a max of 64 ops in a Txn request, that limit is enforced AFTER decoding, so it is possible to send huge requests that have no chance of actually being applied but can easily consume very large amounts of CPU.
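
To make the hot path concrete, here is a minimal, hypothetical sketch of the kind of decode helper described above. The package name, helper name, and hook choices are illustrative, not Consul's actual decodeBody; the point is that every field of every decoded request goes through the composed hook, which is where the repeated typedDecodeHook work happens:

```go
// Package agent is a hypothetical home for this sketch.
package agent

import (
	"encoding/json"
	"io"
	"time"

	"github.com/mitchellh/mapstructure"
)

// decodeBody is a hypothetical stand-in for the HTTP API's decode helper.
// mapstructure runs the composed DecodeHook for every field of the request,
// and unpacking that composed hook (typedDecodeHook) is the cost described above.
func decodeBody(req io.Reader, out interface{}) error {
	var raw map[string]interface{}
	if err := json.NewDecoder(req).Decode(&raw); err != nil {
		return err
	}

	decoder, err := mapstructure.NewDecoder(&mapstructure.DecoderConfig{
		// Two hooks composed together: each decoded field pays for
		// re-checking and re-converting both of them.
		DecodeHook: mapstructure.ComposeDecodeHookFunc(
			mapstructure.StringToTimeDurationHookFunc(),
			mapstructure.StringToTimeHookFunc(time.RFC3339),
		),
		Result: out,
	})
	if err != nil {
		return err
	}
	return decoder.Decode(raw)
}
```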

@pearkes added the type/enhancement (Proposed improvement or new feature) and theme/performance (Performance benchmarking or potential improvement) labels on Jul 16, 2019
@pearkes added this to the Upcoming milestone on Jul 16, 2019
@s-mang removed their assignment on Aug 1, 2019
@s-mang self-assigned this on Aug 22, 2019
@s-mang (Contributor) commented Aug 22, 2019

I'm going to paste some of the context I gain on this issue as it comes to me, in case I drop off or someone else has ideas and wants to chime in.

Wrote a more strenuous benchmark for mapstructure, which decodes a semi-complex, 75KB JSON blob into a Go struct. The new benchmark alone, especially compared to the output of the existing benchmarks, seemed noteworthy (the new, more strenuous benchmark is at the bottom; a rough sketch of its shape follows the output) 🏒 :

Sarahs-MacBook-Pro:mapstructure sadams$ go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile cpuprofile.out
goos: darwin
goarch: amd64
pkg: github.com/mitchellh/mapstructure
Benchmark_Decode-8                   	  300000	      4342 ns/op	    1993 B/op	      51 allocs/op
Benchmark_DecodeViaJSON-8            	  500000	      3998 ns/op	    1232 B/op	      34 allocs/op
Benchmark_JSONUnmarshal-8            	 1000000	      1881 ns/op	     288 B/op	      13 allocs/op
Benchmark_DecodeBasic-8              	  200000	      9671 ns/op	    6327 B/op	      76 allocs/op
Benchmark_DecodeEmbedded-8           	  200000	      6396 ns/op	    5827 B/op	      75 allocs/op
Benchmark_DecodeTypeConversion-8     	   50000	     31449 ns/op	   16402 B/op	     240 allocs/op
Benchmark_DecodeMap-8                	  500000	      3057 ns/op	    1208 B/op	      34 allocs/op
Benchmark_DecodeMapOfStruct-8        	  100000	     12206 ns/op	   11031 B/op	     139 allocs/op
Benchmark_DecodeSlice-8              	 1000000	      2333 ns/op	     944 B/op	      29 allocs/op
Benchmark_DecodeSliceOfStruct-8      	  200000	     11264 ns/op	   10430 B/op	     128 allocs/op
Benchmark_DecodeWeaklyTypedInput-8   	  500000	      2728 ns/op	    1464 B/op	      28 allocs/op
Benchmark_DecodeMetadata-8           	  500000	      2762 ns/op	    1707 B/op	      26 allocs/op
Benchmark_DecodeMetadataEmbedded-8   	  300000	      6242 ns/op	    5276 B/op	      62 allocs/op
Benchmark_DecodeTagged-8             	 1000000	      1601 ns/op	     840 B/op	      18 allocs/op
Benchmark_DecodeLargeJson-8          	     500	   2893028 ns/op	 1373138 B/op	   33971 allocs/op

(gist of the benchmark is here just in case: https://gist.github.com/adams-sarah/44cb61871a1057da8fde05b059c0478a)
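
For context, this is roughly the shape such a benchmark takes; the target type and fixture path below are placeholders, and the actual benchmark code is in the gist above:

```go
package mapstructure_test

import (
	"encoding/json"
	"os"
	"testing"

	"github.com/mitchellh/mapstructure"
)

// largeStruct stands in for the semi-complex target type; the real benchmark
// (see the gist above) decodes a ~75KB JSON fixture.
type largeStruct struct {
	Name  string
	Items []struct {
		ID    int
		Attrs map[string]string
	}
}

func Benchmark_DecodeLargeJson(b *testing.B) {
	blob, err := os.ReadFile("testdata/large.json") // hypothetical fixture path
	if err != nil {
		b.Fatal(err)
	}

	// Unmarshal once into a generic map, then time the mapstructure decode.
	var raw map[string]interface{}
	if err := json.Unmarshal(blob, &raw); err != nil {
		b.Fatal(err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var out largeStruct
		if err := mapstructure.Decode(raw, &out); err != nil {
			b.Fatal(err)
		}
	}
}
```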

@s-mang (Contributor) commented Aug 23, 2019

Alright I've got some more benchmarks and some charts. I decided to compare the growth of these benchmarks against similar benchmarks that I wrote for encoding/json.

(RED = mitchellh/mapstructure, BLUE = encoding/json)
[Chart: benchmark growth comparison — Screen Shot 2019-08-23 at 12 29 47 PM]

The last two benchmarks, where you see the growth, decode 75KB and 1MB JSON blobs respectively.

This is pretty much what I expected given that mapstructure uses reflect, and uses it recursively. Using reflect recursively does not scale well in my experience, especially in terms of memory usage.

I'm worried about our use of this package; these benchmarks are an initial red flag in my head. But I'm going to dive into some application profiles next. Perhaps this is not the underlying issue. We shall see.
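
For reference, a minimal sketch of the two decode paths these benchmarks compare, assuming the mapstructure path goes through a generic map the way the package's own DecodeViaJSON-style benchmarks do (names and types here are illustrative):

```go
package mapstructure_test

import (
	"encoding/json"

	"github.com/mitchellh/mapstructure"
)

// decodeBoth contrasts the two paths: encoding/json decodes the bytes straight
// into the target struct, while the mapstructure path first unmarshals into a
// generic map and then walks that map reflectively into the struct.
func decodeBoth(blob []byte, out interface{}) error {
	// encoding/json path: one pass, straight into the struct.
	if err := json.Unmarshal(blob, out); err != nil {
		return err
	}

	// mapstructure path: unmarshal into map[string]interface{}, then a
	// recursive, reflection-heavy walk of that map into the struct.
	var raw map[string]interface{}
	if err := json.Unmarshal(blob, &raw); err != nil {
		return err
	}
	return mapstructure.Decode(raw, out)
}
```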

Benchmark values pasted here:

encoding/json

Benchmark_Sarah_Decode-8                	 1000000	      2213 ns/op	     320 B/op	      14 allocs/op
Benchmark_Sarah_DecodeViaJSON-8         	 1000000	      1781 ns/op	     656 B/op	       5 allocs/op
Benchmark_Sarah_DecodeBasic-8           	  500000	      3259 ns/op	     368 B/op	      13 allocs/op
Benchmark_Sarah_DecodeMapOfStruct-8     	 1000000	      2111 ns/op	     648 B/op	      16 allocs/op
Benchmark_Sarah_DecodeSlice-8           	 1000000	      1122 ns/op	     232 B/op	       8 allocs/op
Benchmark_Sarah_DecodeSliceOfStruct-8   	 1000000	      1388 ns/op	     288 B/op	       8 allocs/op
Benchmark_Sarah_DecodeTagged-8          	 3000000	       471 ns/op	     184 B/op	       2 allocs/op
Benchmark_Sarah_DecodeLargeJson75kb-8   	    2000	   1026076 ns/op	  104284 B/op	    7335 allocs/op
Benchmark_Sarah_DecodeLargeJson1mb-8    	     100	  14790997 ns/op	 2039177 B/op	  135597 allocs/op


mitchellh/mapstructure

Benchmark_Decode-8                   	  300000	      4670 ns/op	    1993 B/op	      51 allocs/op
Benchmark_DecodeViaJSON-8            	  300000	      4222 ns/op	    1232 B/op	      34 allocs/op
Benchmark_DecodeBasic-8              	  200000	     10160 ns/op	    6327 B/op	      76 allocs/op
Benchmark_DecodeMapOfStruct-8        	  100000	     12954 ns/op	   11031 B/op	     139 allocs/op
Benchmark_DecodeSlice-8              	  500000	      2432 ns/op	     944 B/op	      29 allocs/op
Benchmark_DecodeSliceOfStruct-8      	  100000	     12137 ns/op	   10431 B/op	     128 allocs/op
Benchmark_DecodeTagged-8             	 1000000	      1655 ns/op	     840 B/op	      18 allocs/op
Benchmark_DecodeLargeJson75kb-8      	     500	   2983173 ns/op	 1373150 B/op	   33971 allocs/op
Benchmark_DecodeLargeJson1mb-8       	     100	  37533688 ns/op	19567845 B/op	  460321 allocs/op

@s-mang (Contributor) commented Aug 23, 2019

Attaching a screenshot of the CPU flame graph mentioned in the issue description (or, rather, an instance of one).
[Screenshot: CPU flame graph — Screen Shot 2019-08-23 at 1 58 04 PM]
