Loading the hits dataset into Databend generates ~16 segments.
Each segment's metadata is stored in JSON format. A segment contains so many fields that the file becomes very large.
The hits dataset is small (~90 million rows, 20 GB of compressed data), yet the metadata alone takes 16 * 12 MB ≈ 192 MB. We need to reduce the metadata size (store it in a binary format and compress it).
❯ du -sh _data/1/208469/_sg/f48d4df4462144a3a234fef0d8cd28bf_v2.json
12M _data/1/208469/_sg/f48d4df4462144a3a234fef0d8cd28bf_v2.json
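A rough sketch of the direction, not Databend's actual segment format: the SegmentInfo/BlockMeta structs below are simplified stand-ins, and bincode + zstd (assumed Cargo deps: serde with the derive feature, serde_json, bincode 1.x, zstd) are just one possible choice of binary encoding and compression. Serializing the same struct both ways shows why JSON is bloated here, since it repeats every field name per block and encodes numbers as text.

```rust
use serde::{Deserialize, Serialize};

// Hypothetical, much-simplified stand-in for a segment's per-block metadata;
// the real format carries per-column statistics, bloom filter locations, etc.
#[derive(Serialize, Deserialize)]
struct BlockMeta {
    row_count: u64,
    block_size: u64,
    location: String,
}

#[derive(Serialize, Deserialize)]
struct SegmentInfo {
    format_version: u64,
    blocks: Vec<BlockMeta>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let segment = SegmentInfo {
        format_version: 2,
        blocks: (0..1000)
            .map(|i| BlockMeta {
                row_count: 100_000,
                block_size: 64 << 20,
                // illustrative location string only
                location: format!("_b/{i:032x}_v0.parquet"),
            })
            .collect(),
    };

    // Today: human-readable JSON, field names repeated for every block.
    let json = serde_json::to_vec(&segment)?;

    // Proposed direction: a compact binary encoding plus generic compression.
    // bincode + zstd are stand-ins; any binary format + codec would do.
    let binary = bincode::serialize(&segment)?;
    let compressed = zstd::encode_all(&binary[..], 3)?;

    println!("json:         {} bytes", json.len());
    println!("binary:       {} bytes", binary.len());
    println!("binary+zstd:  {} bytes", compressed.len());
    Ok(())
}
```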
If we set table_meta_segment_count to zero, each query loads the segment metadata over and over, which is very slow!
Although we already cache this metadata, it would still be better if it were smaller.
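For illustration only, since the issue does not show Databend's cache internals: a minimal bounded cache keyed by segment location, where a capacity of 0 (as with table_meta_segment_count = 0) means nothing is retained and every query falls through to a fresh read and parse of the ~12 MB JSON file. SegmentMetaCache and get_or_load are hypothetical names.

```rust
use std::collections::{HashMap, VecDeque};

// Minimal bounded cache keyed by segment location. Capacity 0 disables
// caching entirely, so every lookup re-reads and re-parses the segment file.
struct SegmentMetaCache {
    capacity: usize,
    entries: HashMap<String, Vec<u8>>, // parsed metadata, simplified to bytes
    order: VecDeque<String>,           // insertion order used for eviction
}

impl SegmentMetaCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    fn get_or_load(
        &mut self,
        location: &str,
        load: impl FnOnce(&str) -> Vec<u8>,
    ) -> Vec<u8> {
        if let Some(meta) = self.entries.get(location) {
            return meta.clone(); // cache hit: no I/O, no JSON parsing
        }
        let meta = load(location); // cache miss: read + deserialize the file
        if self.capacity > 0 {
            if self.entries.len() >= self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.entries.insert(location.to_string(), meta.clone());
            self.order.push_back(location.to_string());
        }
        meta
    }
}

fn main() {
    let mut cache = SegmentMetaCache::new(0); // capacity 0: every call reloads
    for _ in 0..3 {
        cache.get_or_load("_sg/f48d4df4462144a3a234fef0d8cd28bf_v2.json", |loc| {
            println!("loading {loc} from storage"); // printed on every query
            vec![0u8; 12 << 20] // pretend the 12 MB JSON was read
        });
    }
}
```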