[5.0] [Performance] Improve large line performance. Improve line smoothing. #13314
Conversation
Limit the control points to the range of the two connected points, instead of the global bounding box used in the previous implementation.
Thanks for your contribution! The pull request is marked to be
wow, this is super exciting. can't wait to test this improvement on the uPlot benchmark! https://github.com/leeoniya/uPlot#performance a 10x improvement would move echarts to second place. would you mind reviewing my implementation to make sure i'm using the library correctly, and also let me know when this will be available to test in a linkable build? https://github.com/leeoniya/uPlot/blob/master/bench/ECharts.html cheers!
also, your smoothing does not represent the data well. take a look at using the Catmull-Rom Spline (Centripetal) algo (not fully optimized): https://leeoniya.github.io/uPlot/demos/line-smoothing.html in general, the smoothing changes are probably best for a separate PR from the performance improvements.
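For reference, here is a minimal sketch of the centripetal Catmull-Rom (alpha = 0.5) to cubic Bezier conversion mentioned above. Function and variable names are illustrative, not uPlot's or echarts' internals:

```javascript
// Given four consecutive data points, compute the two Bezier control
// points for the segment from p1 to p2 under centripetal (alpha = 0.5)
// Catmull-Rom parameterization.
function centripetalCR(p0, p1, p2, p3, alpha = 0.5) {
  const chord = (a, b) => Math.pow(Math.hypot(b.x - a.x, b.y - a.y), alpha);
  const d1 = chord(p0, p1), d2 = chord(p1, p2), d3 = chord(p2, p3);
  // Fall back to the endpoints when a chord degenerates (coincident points).
  const c1 = d1 < 1e-6 ? { x: p1.x, y: p1.y } : {
    x: (d1 * d1 * p2.x - d2 * d2 * p0.x + (2 * d1 * d1 + 3 * d1 * d2 + d2 * d2) * p1.x) / (3 * d1 * (d1 + d2)),
    y: (d1 * d1 * p2.y - d2 * d2 * p0.y + (2 * d1 * d1 + 3 * d1 * d2 + d2 * d2) * p1.y) / (3 * d1 * (d1 + d2)),
  };
  const c2 = d3 < 1e-6 ? { x: p2.x, y: p2.y } : {
    x: (d3 * d3 * p1.x - d2 * d2 * p3.x + (2 * d3 * d3 + 3 * d3 * d2 + d2 * d2) * p2.x) / (3 * d3 * (d3 + d2)),
    y: (d3 * d3 * p1.y - d2 * d2 * p3.y + (2 * d3 * d3 + 3 * d3 * d2 + d2 * d2) * p2.y) / (3 * d3 * (d3 + d2)),
  };
  return [c1, c2]; // feed to ctx.bezierCurveTo(c1.x, c1.y, c2.x, c2.y, p2.x, p2.y)
}
```

For equally spaced collinear points the control points land at the one-third and two-thirds marks of the segment, so straight runs stay straight.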
@leeoniya Thanks for the benchmark. I'm really impressed by the performance of uPlot! I added an optimized version that takes a Float64Array as input data. It's a bit more complex to write but has better performance and takes less heap memory. Optimized version: Original version: Tested the optimized version on my 15-inch Mac. The initialization time is about 170ms, and the heap memory is about 10~12MB (peak), 5MB (final). About the smoothing algorithm: I have tried the Catmull-Rom spline before, but it has similar issues and may look quite weird. I'm not sure if you made any modifications to the original Catmull-Rom spline algorithm.
PS: I did both the performance improvement and the smoothing algorithm improvement while refactoring the polyline builder, so I didn't separate them into two PRs.
i simply adjusted this implementation for the uPlot demo from this gist: https://github.com/leeoniya/uPlot/blob/master/demos/line-smoothing.html#L49 i don't think the version i'm using would behave like your posted image given the very few and sparse datapoints which are visible. can you try your dataset in the uPlot demo and see how that looks? (sorry, i'm on holiday and on my phone, so cannot test it for a while).
https://echarts-bench-from-uplot.glitch.me/uplot-smoothing.html I tried it and the result seems a bit different from my Catmull-Rom spline implementation. I haven't dug into it yet, and I'm not sure if it's because I drew the spline with interpolated points while uPlot uses Bezier curves. But it looks similar to our original Bezier curve implementation without limiting to the min/max extremes. The problem is that the curve will exceed the extremes, for example going under the 0 baseline or exceeding the max value (which matters if the max value has some meaning, like 100 percent). If we limit the curve to the global min/max extremes, it becomes almost the same as the result before the change in this PR.
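The core idea this PR describes, keeping each control point inside the range of the two points its segment connects so the curve cannot overshoot local extremes, can be sketched like this (names are illustrative, not the actual echarts internals):

```javascript
// Clamp a Bezier control point into the bounding box spanned by the two
// data points `a` and `b` that the curve segment connects. This is what
// prevents the smoothed curve from dipping below a 0 baseline or rising
// above a meaningful maximum between two samples.
function clampControlPoint(cp, a, b) {
  const clamp = (v, lo, hi) => Math.min(Math.max(v, lo), hi);
  return {
    x: clamp(cp.x, Math.min(a.x, b.x), Math.max(a.x, b.x)),
    y: clamp(cp.y, Math.min(a.y, b.y), Math.max(a.y, b.y)),
  };
}
```

Compared with clamping to the global bounding box, this local clamp still allows the curve to follow steep local features while staying monotone-safe within each segment.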
Congratulations! Your PR has been merged. Thanks for your contribution! 👍
Merged this PR. @leeoniya We can still discuss the smoothing algorithm here. I will open another PR to improve it if necessary.
thanks for the detailed reply. i agree that exceeding the local extremes is a worse outcome than a more "S" shaped creative interpolation between datapoints. overall, i hate smoothed charts since they trade data clarity for a stylistic choice. similar to stacked area/trendlines. i wish people would simply stop using these types of misleading charts. hopefully by not supporting these in uPlot i help a little to accelerate their demise :) i've tried feeding typed arrays to uPlot in the past and surprisingly did not measure a difference in performance. i'll take a closer look in a few days at your improvement to the uPlot bench, but i was surprised that typed arrays made a significant difference for you. maybe i should look into it again. out of curiosity, does echarts handle data gaps by default or is this an additional setting? in uPlot the null-testing per-datapoint adds ~20% overhead to the bench in both dataset min/max finding as well as path drawing. i decided it was an okay trade-off instead of bloating the code and maintaining slightly different dedicated loops for gap-less data.
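The trade-off described above, null-testing every datapoint to support gaps rather than keeping separate loops for gap-less data, looks roughly like this (an illustrative sketch, not uPlot or echarts code):

```javascript
// Min/max scan that tolerates gaps: the per-datapoint null test is the
// overhead being discussed (~20% on the bench per the comment above).
// A dedicated gap-free loop would drop the `== null` check entirely.
function minMaxWithGaps(ys) {
  let min = Infinity, max = -Infinity;
  for (let i = 0; i < ys.length; i++) {
    const y = ys[i];
    if (y == null) continue; // skip gaps (null or undefined)
    if (y < min) min = y;
    if (y > max) max = y;
  }
  return [min, max];
}
```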
Perhaps it's because echarts uses a different data structure that performs worse than uPlot's when using normal arrays. I noticed the data structure used in uPlot is

```js
[
  [........], // x dimension
  [........], // y dimension
  [........]  // z dimension
]
```

It's much flatter than the structure echarts uses:

```js
[
  [x, y, z],
  [x, y, z],
  [x, y, z]
]
```

The flatter structure needs less heap memory and produces less GC pressure. Using Float64Array in echarts is mainly about avoiding long GC pauses, because a TypedArray's data doesn't live on the garbage-collected object heap. I tested read/write on Float64Array; it's not much different from a normal array. Also, echarts provides a faster path for processing Float64Array data. Meanwhile, when processing a normal array we need to check the structure of each item in the array, because the structures we support are quite flexible. That may be convenient for developers, but it's not good for performance :(

echarts handles null data by default. I haven't tested how much time could be saved by removing the null testing, but I guess it's not the bottleneck for echarts currently.
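The layout difference discussed above can be shown concretely. This is an illustrative sketch (the helper name `toColumns` is mine, not an echarts API):

```javascript
// Row-oriented layout: one small array allocated per datum,
// so millions of points mean millions of heap objects for the GC to trace.
const rows = [[0, 10], [1, 20], [2, 30]]; // [x, y] per point

// Columnar typed layout: one flat Float64Array per dimension. The numeric
// data lives in a backing ArrayBuffer, leaving almost nothing for the GC.
function toColumns(rows, dimCount) {
  const cols = [];
  for (let d = 0; d < dimCount; d++) cols.push(new Float64Array(rows.length));
  for (let i = 0; i < rows.length; i++) {
    for (let d = 0; d < dimCount; d++) cols[d][i] = rows[i][d];
  }
  return cols;
}

const [xs, ys] = toColumns(rows, 2);
```

Reading `cols[d][i]` is also a monomorphic, integer-indexed access, which JS engines optimize well compared with probing the shape of each flexible row item.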
Hi @leeoniya I did a bit more optimization over these two days. Now the time has been reduced to about 140ms. With LTTB downsampling enabled, it can be less than 100ms on my 15-inch MacBook, and the displayed result is almost the same. https://echarts-bench-from-uplot.glitch.me/index-lttb.html I also added two test cases with 10x larger data. https://echarts-bench-from-uplot.glitch.me/index-large.html Just want to thank you; you and uPlot motivated me to push the performance of echarts to the limit.
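For readers unfamiliar with the LTTB (Largest-Triangle-Three-Buckets) downsampling mentioned above, here is a compact sketch of the algorithm; it is not the echarts implementation:

```javascript
// Downsample [[x, y], ...] data to `threshold` points. LTTB keeps the
// first and last points, splits the rest into buckets, and from each
// bucket keeps the point forming the largest triangle with the previously
// selected point and the average of the next bucket.
function lttb(data, threshold) {
  const n = data.length;
  if (threshold >= n || threshold < 3) return data.slice();
  const sampled = [data[0]];
  const every = (n - 2) / (threshold - 2); // bucket width
  let a = 0; // index of the previously selected point
  for (let i = 0; i < threshold - 2; i++) {
    // Average of the next bucket: third vertex of the triangle.
    const nextStart = Math.floor((i + 1) * every) + 1;
    const nextEnd = Math.min(Math.floor((i + 2) * every) + 1, n);
    let avgX = 0, avgY = 0;
    for (let j = nextStart; j < nextEnd; j++) { avgX += data[j][0]; avgY += data[j][1]; }
    avgX /= nextEnd - nextStart;
    avgY /= nextEnd - nextStart;
    // Scan the current bucket for the largest triangle area.
    const start = Math.floor(i * every) + 1;
    const end = Math.floor((i + 1) * every) + 1;
    let maxArea = -1, maxIdx = start;
    for (let j = start; j < end; j++) {
      const area = Math.abs(
        (data[a][0] - avgX) * (data[j][1] - data[a][1]) -
        (data[a][0] - data[j][0]) * (avgY - data[a][1])
      );
      if (area > maxArea) { maxArea = area; maxIdx = j; }
    }
    sampled.push(data[maxIdx]);
    a = maxIdx;
  }
  sampled.push(data[n - 1]);
  return sampled;
}
```

Because each bucket keeps its most "visually significant" point rather than an average, the downsampled line preserves peaks and valleys, which is why the displayed result stays almost identical.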
thanks @pissang would you like to submit a PR to the uPlot repo for the implementation improvements as well as bumping the echarts lib when the next alpha or beta of 5.0 is tagged and available via jsDelivr CDN? |
@leeight Sure.
This pull request includes two improvements to the line series.
Details about the performance improvement.
1. Use TypedArray.
The most important change to improve the performance is to use TypedArray instead of normal arrays as much as possible. It reduces heap memory and GC cost significantly.
In this PR we use TypedArray to store the points of polylines and polygons. Also, no `setItemLayout` anymore.

2. Bake the transform of dataToPoint into a matrix in Cartesian2D

This simplifies the transform to a handful of very fast multiply and add operations with a six-element affine matrix. But it only works for affine transforms, when the axis type is `time` or `value`. So it's mainly for boosting time series data.

3. Remove unnecessary date parsing when the value is already a timestamp.
It avoids creating millions of temporary `Date` objects when parsing time series data.

5. Speed up storage access in the List component.
This is a trick for modern JS engines. I found that access to `storage` is slow when initializing millions of data items in https://github.com/apache/incubator-echarts/blob/99ab1d7a2f1413ce38dd55d3808c4e434fe49ba8/src/data/List.ts#L628 I profiled it and saw `KeyedLoadIC_Megamorphic` in the profiling result, which means it's not a fast property access. So we simplified the case to the following code, and the test result shows `fastObjectAccess` can be 10x faster than `slowObjectAccess`. I can't figure out why, or how to avoid it, given my limited knowledge of JS engines. So I added an extra `_storageArr` array and read from that array in hotspot code like `List#_initDataFromProvider`. It works well, and the cost of `_initDataFromProvider` was reduced by half.

Performance Result
Render time comparison between 4.x and 5.0
Heap memory comparison between 4.x and 5.0
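The storage-access trick in point 5 above can be sketched like this. `ListSketch` is a simplified illustration, not the actual `List` component; the key idea is keeping the same chunks reachable both by dimension name and by integer index:

```javascript
// Hoist per-dimension storage chunks out of a name-keyed object (whose
// keyed loads can go megamorphic) into a plain array that hot loops
// index by integer, mirroring the _storageArr workaround described above.
class ListSketch {
  constructor(dims, capacity = 1000) {
    this._storage = {};    // { dimName: Float64Array }, for named lookups
    this._storageArr = []; // the same chunks, indexed by dimension order
    for (let i = 0; i < dims.length; i++) {
      const chunk = new Float64Array(capacity);
      this._storage[dims[i]] = chunk;
      this._storageArr[i] = chunk; // hot loops read this, not _storage
    }
  }
  // Analogue of the hotspot: fill storage from row-oriented input.
  initFromProvider(rows) {
    const storageArr = this._storageArr; // local alias, integer-indexed
    for (let i = 0; i < rows.length; i++) {
      for (let d = 0; d < storageArr.length; d++) {
        storageArr[d][i] = rows[i][d];
      }
    }
  }
}
```

Since both containers hold references to the same Float64Array chunks, the array-indexed fast path and the name-keyed slow path always see identical data.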
Details about the smoothing improvement.
The original smoothing algorithm constrains the control points to the global bounding box, which may lead to two issues, as shown in the following picture.
So in this PR I made two changes.
And the result after the change:
Fixed issues
#12249
#10200
#4556