Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(partition): dictionary encoding and prop elimination #1079

Merged
merged 1 commit into from
Mar 16, 2021

Conversation

monfera
Copy link
Contributor

@monfera monfera commented Mar 15, 2021

Summary

  • file size went down from the earlier 2.6MB to 229kB, because
  • switched to proper arrays in layers
  • elements in layers got dictionary encoded, string count goes from 21061 to 1396
  • dictionary encoding is frequency aware, ie. more frequently used strings get shorter index numbers (reverse frequency sort)
  • removed name and depth properties that the story didn't use

No visual change.

Steps

Just documenting the console steps, I don't think it's worth making a utility out of it, as it's a userland optimization w.r.t. elastic-charts.


temp1 = data;

// let's see count of strings: 21061
temp1.reduce((p,n) => p + Object.keys(n.layers).length ,0)

// let's see the count of distinct strings: 1396, more than 1 OOM fewer, naturally
Object.keys(temp1.reduce((p,n) => Object.assign(p, ...Object.values(n.layers).map(v => ({[v]: true}))), {})).length

// collecting names as keys with values as occurrence count here
a = {}

// populate a
temp1.forEach(n => Object.values(n.layers).forEach(name => a[name] ? a[name]++ : a[name] = 1))

// inverse frequency order of entries
ordered = Object.entries(a).sort(([_, a], [__, b]) => b - a)

// we just ultimately need the simple name array ie. string[]
dictionary = ordered.map(([name]) => name)

// let's make an object with keys as names and values as indices into dictionary
lookup = Object.fromEntries(Object.entries(dictionary).map(([numberString, name]) => [name, Number(numberString)]))

// let's now switch layers from string[] to integer[]
newData = temp1.map(n => ({...n, layers: Object.values(n.layers).map(name => lookup[name])}))

Then I put them in an object: { dictionary, facts: newData } and that's what's in the .json now.

Checklist

  • Proper documentation or storybook story was added for features that require explanation or tutorials

@monfera monfera added the :partition Partition/PieChart/Donut/Sunburst/Treemap chart related label Mar 15, 2021
@codecov-io
Copy link

Codecov Report

Merging #1079 (3887cc8) into master (ca07fdb) will increase coverage by 0.41%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1079      +/-   ##
==========================================
+ Coverage   72.67%   73.09%   +0.41%     
==========================================
  Files         367      383      +16     
  Lines       11405    11720     +315     
  Branches     2478     2524      +46     
==========================================
+ Hits         8289     8567     +278     
- Misses       3101     3130      +29     
- Partials       15       23       +8     
Flag Coverage Δ
unittests 72.67% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/mocks/series/series.ts 91.22% <0.00%> (ø)
src/mocks/series/index.ts 100.00% <0.00%> (ø)
src/mocks/scale/index.ts 100.00% <0.00%> (ø)
src/mocks/index.ts 100.00% <0.00%> (ø)
src/mocks/specs/specs.ts 83.33% <0.00%> (ø)
src/mocks/series/series_identifiers.ts 100.00% <0.00%> (ø)
src/mocks/series/utils.ts 100.00% <0.00%> (ø)
src/mocks/store/store.ts 92.59% <0.00%> (ø)
src/mocks/geometries.ts 92.85% <0.00%> (ø)
src/mocks/store/index.ts 100.00% <0.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca07fdb...3887cc8. Read the comment docs.

@monfera monfera changed the title test(partition): dictionary encoding and redundant prop elimination test(partition): dictionary encoding and prop elimination Mar 15, 2021
@monfera
Copy link
Contributor Author

monfera commented Mar 15, 2021

jenkins test this please

Copy link
Collaborator

@nickofthyme nickofthyme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@monfera monfera merged commit 588bdfa into elastic:master Mar 16, 2021
@nickofthyme
Copy link
Collaborator

🎉 This PR is included in version 25.4.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@nickofthyme nickofthyme added the released Issue released publicly label Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:partition Partition/PieChart/Donut/Sunburst/Treemap chart related released Issue released publicly
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants