Commit

update
XiangpengHao committed Oct 28, 2024
1 parent d233dbf commit c645271
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions posts/caching-datafusion/index.qmd
@@ -106,8 +106,8 @@ The figure below compares query latencies between two caching strategies using t
![](cache-arrow.png)

#### Takeaways
-- Caching Arrow consistently outperforms or matches caching Parquet across all queries, demonstrating its effectiveness as an optimization strategy.
-- The performance gains vary significantly based on query characteristics:
+- Caching Arrow consistently outperforms or matches caching Parquet across all queries.
+- The performance gains vary significantly:
- Scan-intensive queries (Q20-Q23) show the largest improvements, with up to 3x speedup, since they benefit directly from avoiding Parquet decoding
- Aggregation-heavy queries (Q8-Q18) see more modest gains, as their execution time is dominated by computation rather than data access
- Memory usage can be a concern - Q23 triggered an out-of-memory error when caching Arrow data, highlighting its excessive memory usage.