Add FixedWidthRows for the arrow-row #4524

yahoNanJing · 2023-07-14T11:15:08Z

Which issue does this PR close?

Closes #4523.

Rationale for this change

There are mainly two reasons to introduce the FixedWidthRows:

For group value of fixed length, we don't need to store the offsets vector to get the row value, which makes the operation of getting the row value much more efficient. It's also proved by the benchmark results, especially for tpch q17.
By introducing the row_width to the RowConverter, it will be much easier for the Datafusion side to decide whether to make the group value embedded or not. The benchmark results also prove that when the group values are of small enough fixed length, to embed the value to the RawTable will bring great benefits.

What changes are included in this PR?

Are there any user-facing changes?

yahoNanJing · 2023-07-14T14:42:14Z

With the dependency of this PR, the tpch 1G benchmark for the datafusion, https://github.com/yahoNanJing/arrow-datafusion/tree/EBAY-KYLIN-4003-5, is as follows:

single partition

the default partitions

yahoNanJing · 2023-07-14T14:47:54Z

Next step, I will do some more optimization at the datafusion side to push the fixed row value to the RawTable value if the row width is less than a threshold.

tustvold · 2023-07-14T15:03:44Z

I think I would like to hold off on merging this for a bit, I want to do some other experiments first before committing to this approach

alamb · 2023-07-14T15:36:28Z

@yahoNanJing can you please explain the rationale for the introduction of a fixed width row to a variable length one?

yahoNanJing · 2023-07-17T05:47:36Z

After embedding the value to the RawTable, for tpch q17, there's around 30% performance gain compared to the main branch.

single partition
default partition

alamb · 2023-07-17T17:52:10Z

More context for anyone else following along here can be found in apache/datafusion#6969

I would be very interested in trying to specialize the row_hash grouping operator t avoid the Row format entirely (and use a native Vec<T> for primitive types) and see how that compares with the fixed width row

yahoNanJing · 2023-07-17T21:50:55Z

I would be very interested in trying to specialize the row_hash grouping operator t avoid the Row format entirely (and use a native Vec for primitive types) and see how that compares with the fixed width row

Hi @alamb, there are three reasons that I use the fixed width row:

It covers the null value
It may contain multiple columns not just for one column
For single column case, to embed a variable length column value, like String, to the RawTable may not be good.

alamb · 2023-07-17T22:01:22Z

For single column case, to embed a variable length column value, like String, to the RawTable may not be good.

Just to be clear, what I was imagining for the group storage is not to change the contents of the RawTable (it will continue to contain group_indexes).

But instead of storing group_values using the arrow Row format

 stores "group       stores group values, 
    indexes"          in arrow_row format 
                                          
 ┌─────────────┐      ┌────────────┐      
 │   ┌─────┐   │      │ ┌────────┐ │      
 │   │  5  │   │ ┌────┼▶│  "A"   │ │      
 │   ├─────┤   │ │    │ ├────────┤ │      
 │   │  9  │   │ │    │ │  "Z"   │ │      
 │   └─────┘   │ │    │ └────────┘ │      
 │     ...     │ │    │            │      
 │   ┌─────┐   │ │    │    ...     │      
 │   │  1  │───┼─┘    │            │      
 │   ├─────┤   │      │            │      
 │   │ 13  │───┼─┐    │ ┌────────┐ │      
 │   └─────┘   │ └────┼▶│  "Q"   │ │      
 └─────────────┘      │ └────────┘ │      
                      │            │      
                      └────────────┘      
                                          
                                          
                                          
       map            group_values        
  (Hash Table)

We would instead store the group values using a native type like Vec<T> like this

 stores "group               stored in a      
    indexes"                native Vec<T>     
                                              
 ┌─────────────┐            ┌──────────┐      
 │   ┌─────┐   │            │ ┌──────┐ │      
 │   │  5  │   │    ┌───────┼▶│  1   │ │      
 │   ├─────┤   │    │       │ ├──────┤ │      
 │   │  9  │   │    │       │ │  3   │ │      
 │   └─────┘   │    │       │ └──────┘ │      
 │     ...     │    │       │          │      
 │   ┌─────┐   │    │       │    ...   │      
 │   │  1  │───┼────┘       │          │      
 │   ├─────┤   │            │          │      
 │   │ 13  │───┼────┐       │ ┌──────┐ │      
 │   └─────┘   │    └───────┼▶│  5   │ │      
 └─────────────┘            │ └──────┘ │      
                            │          │      
                            └──────────┘      
                                              
                                              
                            group_values      
       map                                    
  (Hash Table)

I agree the null value would need some special handling, but since this would only be for single columns (where there can be at most one null value) I think we could figure out some way to handle it

yahoNanJing · 2023-07-17T22:24:18Z

At my side, the bottleneck is the get_mut() of the RawTable. And the biggest improvement of this PR brings is to embed the value to the RawTable, which speed up the get_mut() very much.

I don't know whether it will bring much benefit by just changing the Row to the Vec and I'm looking forward to your benchmark results 😄

alamb · 2023-07-18T09:31:29Z

At my side, the bottleneck is the get_mut() of the RawTable. And the biggest improvement of this PR brings is to embed the value to the RawTable, which speed up the get_mut() very much.

Ah, that is a good insight -- the change yahoNanJing/arrow-datafusion@a18ac07 embeds the group value directly into the table 🤔

Another experiment we could try would be to use unsafe accesses to the group_values table / rows 🤔

Update: I am running this experiment with apache/datafusion#7010

I will also try and compare it to @yahoNanJing 's approach

alamb · 2023-07-18T20:42:03Z

Here are my benchmark results: apache/datafusion#6969 (comment) (TLDR they are mixed)

yahoNanJing · 2023-07-20T03:18:41Z

Hi @alamb and @tustvold, how's your testing going on? And what do you think of this proposal?

tustvold · 2023-07-20T03:26:20Z

I hope to have a PR up tomorrow that will special case single columns, and will be able to get some numbers for you. I am very keen to avoid this complexity if we can get the same or better performance in a simpler way

yahoNanJing · 2023-07-20T06:13:59Z

@tustvold, the simpler is the better. I agree to adopt a simpler way if we can achieve the same or similar performance gain.

alamb · 2023-07-20T15:35:38Z

As @tustvold mentions, using native types for single columns should be significantly faster than any row format (thought we need to prove that). I do wonder if fixed width formats would be useful for multi-column equality comparisons (again we would need to show this via benchmarks, etc)

I wonder if one of the hesitations with the "FixedWidthRows" describes how this is implemented rather than its major usecase. Maybe we could name it something like "EqualityRows" or something given it focuses just on equality rather than ordering

tustvold · 2023-07-20T16:16:46Z

I do wonder if fixed width formats would be useful for multi-column equality comparisons

Do we have any benchmarks that are grouping by multiple primitive columns? Whilst I appreciate that optimising for benchmarks is a form of observability bias, it can be a useful way to focus our efforts where they will have the most impact?

using native types for single columns should be significantly faster than any row format

My initial experiments have

--------------------
Benchmark tpch_mem.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ specialize-primitive-group-values ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 385.21ms │                          387.08ms │     no change │
│ QQuery 2     │ 102.04ms │                           84.20ms │ +1.21x faster │
│ QQuery 3     │ 104.50ms │                          103.45ms │     no change │
│ QQuery 4     │  69.37ms │                           71.72ms │     no change │
│ QQuery 5     │ 234.38ms │                          236.49ms │     no change │
│ QQuery 6     │  28.62ms │                           28.05ms │     no change │
│ QQuery 7     │ 550.32ms │                          583.28ms │  1.06x slower │
│ QQuery 8     │ 160.42ms │                          152.87ms │     no change │
│ QQuery 9     │ 348.10ms │                          345.23ms │     no change │
│ QQuery 10    │ 201.04ms │                          204.01ms │     no change │
│ QQuery 11    │  95.55ms │                           94.25ms │     no change │
│ QQuery 12    │ 112.20ms │                          110.78ms │     no change │
│ QQuery 13    │ 199.70ms │                          156.56ms │ +1.28x faster │
│ QQuery 14    │  30.56ms │                           32.08ms │     no change │
│ QQuery 15    │  34.63ms │                           31.02ms │ +1.12x faster │
│ QQuery 16    │ 104.27ms │                          105.36ms │     no change │
│ QQuery 17    │ 582.36ms │                          483.59ms │ +1.20x faster │
│ QQuery 18    │ 994.82ms │                          906.17ms │ +1.10x faster │
│ QQuery 19    │ 112.26ms │                          109.24ms │     no change │
│ QQuery 20    │ 204.93ms │                          218.78ms │  1.07x slower │
│ QQuery 21    │ 695.60ms │                          687.18ms │     no change │
│ QQuery 22    │  55.44ms │                           55.00ms │     no change │
└──────────────┴──────────┴───────────────────────────────────┴───────────────┘

there's around 30% performance gain compared to the main branch

I have not been able to reproduce these results using EBAY-KYLIN-4003-5

Comparing main and EBAY-KYLIN-4003-5
--------------------
Benchmark tpch_mem.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ EBAY-KYLIN-4003-5 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 385.21ms │          379.92ms │     no change │
│ QQuery 2     │ 102.04ms │          103.16ms │     no change │
│ QQuery 3     │ 104.50ms │          110.41ms │  1.06x slower │
│ QQuery 4     │  69.37ms │           74.71ms │  1.08x slower │
│ QQuery 5     │ 234.38ms │          249.58ms │  1.06x slower │
│ QQuery 6     │  28.62ms │           28.92ms │     no change │
│ QQuery 7     │ 550.32ms │          589.97ms │  1.07x slower │
│ QQuery 8     │ 160.42ms │          157.83ms │     no change │
│ QQuery 9     │ 348.10ms │          364.37ms │     no change │
│ QQuery 10    │ 201.04ms │          204.18ms │     no change │
│ QQuery 11    │  95.55ms │          103.85ms │  1.09x slower │
│ QQuery 12    │ 112.20ms │          113.52ms │     no change │
│ QQuery 13    │ 199.70ms │          178.66ms │ +1.12x faster │
│ QQuery 14    │  30.56ms │           31.45ms │     no change │
│ QQuery 15    │  34.63ms │           34.25ms │     no change │
│ QQuery 16    │ 104.27ms │          106.39ms │     no change │
│ QQuery 17    │ 582.36ms │          520.88ms │ +1.12x faster │
│ QQuery 18    │ 994.82ms │         1030.34ms │     no change │
│ QQuery 19    │ 112.26ms │          113.70ms │     no change │
│ QQuery 20    │ 204.93ms │          193.52ms │ +1.06x faster │
│ QQuery 21    │ 695.60ms │          670.74ms │     no change │
│ QQuery 22    │  55.44ms │           55.78ms │     no change │
└──────────────┴──────────┴───────────────────┴───────────────┘

tustvold · 2023-07-30T20:58:24Z

Marking as a draft whilst we work out how best to proceed here

tustvold · 2023-12-23T12:52:10Z

Closing this as I believe the consensus reached was not to proceed with this, feel free to reopen if I am mistaken

github-actions bot added the arrow Changes to the arrow crate label Jul 14, 2023

yahoNanJing marked this pull request as draft July 14, 2023 12:17

yahoNanJing force-pushed the issue-4523 branch 2 times, most recently from fe13b27 to 68085f7 Compare July 14, 2023 13:29

Add FixedWidthRows

8c93d35

yahoNanJing force-pushed the issue-4523 branch from 68085f7 to 8c93d35 Compare July 14, 2023 13:37

yahoNanJing marked this pull request as ready for review July 14, 2023 14:46

yahoNanJing requested review from alamb and tustvold July 14, 2023 14:46

alamb mentioned this pull request Jul 14, 2023

Improve aggregate performance by special casing single group keys apache/datafusion#6969

Closed

Add row_data() for IRows

5b2ae31

This was referenced Jul 18, 2023

Test: inline everything and add row_unchecked #4539

Closed

Test: Use unchecked group_value access in group hash apache/datafusion#7010

Closed

alamb mentioned this pull request Jul 19, 2023

Extract GroupValues (#6969) apache/datafusion#7016

Merged

yahoNanJing mentioned this pull request Jul 20, 2023

Remove RowAccumulators and datafusion-row apache/datafusion#6968

Merged

tustvold marked this pull request as draft July 30, 2023 20:58

tustvold closed this Dec 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add FixedWidthRows for the arrow-row #4524

Add FixedWidthRows for the arrow-row #4524

yahoNanJing commented Jul 14, 2023 •

edited

Loading

yahoNanJing commented Jul 14, 2023 •

edited

Loading

yahoNanJing commented Jul 14, 2023

tustvold commented Jul 14, 2023

alamb commented Jul 14, 2023

yahoNanJing commented Jul 17, 2023 •

edited

Loading

alamb commented Jul 17, 2023

yahoNanJing commented Jul 17, 2023

alamb commented Jul 17, 2023

yahoNanJing commented Jul 17, 2023 •

edited

Loading

alamb commented Jul 18, 2023 •

edited

Loading

alamb commented Jul 18, 2023

yahoNanJing commented Jul 20, 2023

tustvold commented Jul 20, 2023

yahoNanJing commented Jul 20, 2023

alamb commented Jul 20, 2023

tustvold commented Jul 20, 2023

tustvold commented Jul 30, 2023

tustvold commented Dec 23, 2023

Add FixedWidthRows for the arrow-row #4524

Add FixedWidthRows for the arrow-row #4524

Conversation

yahoNanJing commented Jul 14, 2023 • edited Loading

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

yahoNanJing commented Jul 14, 2023 • edited Loading

yahoNanJing commented Jul 14, 2023

tustvold commented Jul 14, 2023

alamb commented Jul 14, 2023

yahoNanJing commented Jul 17, 2023 • edited Loading

alamb commented Jul 17, 2023

yahoNanJing commented Jul 17, 2023

alamb commented Jul 17, 2023

yahoNanJing commented Jul 17, 2023 • edited Loading

alamb commented Jul 18, 2023 • edited Loading

alamb commented Jul 18, 2023

yahoNanJing commented Jul 20, 2023

tustvold commented Jul 20, 2023

yahoNanJing commented Jul 20, 2023

alamb commented Jul 20, 2023

tustvold commented Jul 20, 2023

tustvold commented Jul 30, 2023

tustvold commented Dec 23, 2023

yahoNanJing commented Jul 14, 2023 •

edited

Loading

yahoNanJing commented Jul 14, 2023 •

edited

Loading

yahoNanJing commented Jul 17, 2023 •

edited

Loading

yahoNanJing commented Jul 17, 2023 •

edited

Loading

alamb commented Jul 18, 2023 •

edited

Loading