POC: Use khash sets instead of maps for isin #53059
Conversation
Just found out about the `perf` tool for measuring branch prediction misses, and ran it on a benchmark.

On main I get the following:

```
 Performance counter stats for 'python -c import pandas as pd; import numpy as np; N = 1_000_000; ser = pd.Series(np.arange(N)); vals = np.arange(N); ser.isin(vals)':

          1,965.47 msec task-clock                       #    2.357 CPUs utilized
               147      context-switches                 #   74.791 /sec
                 0      cpu-migrations                   #    0.000 /sec
            31,457      page-faults                      #   16.005 K/sec
     3,936,570,319      cpu_core/cycles/                 #    2.003 G/sec
     3,802,321,801      cpu_atom/cycles/                 #    1.935 G/sec    (52.15%)
     7,214,481,236      cpu_core/instructions/           #    3.671 G/sec
     8,924,412,911      cpu_atom/instructions/           #    4.541 G/sec    (52.15%)
     1,376,744,013      cpu_core/branches/               #  700.464 M/sec
     1,457,321,116      cpu_atom/branches/               #  741.460 M/sec    (52.15%)
        23,755,435      cpu_core/branch-misses/          #   12.086 M/sec
            76,473      cpu_atom/branch-misses/          #   38.908 K/sec    (52.15%)
    19,950,214,260      cpu_core/slots/                  #   10.150 G/sec
     6,988,756,509      cpu_core/topdown-retiring/       #     34.9% Retiring
     3,077,912,645      cpu_core/topdown-bad-spec/       #     15.4% Bad Speculation
     5,841,369,682      cpu_core/topdown-fe-bound/       #     29.2% Frontend Bound
     4,090,935,951      cpu_core/topdown-be-bound/       #     20.5% Backend Bound
       607,579,968      cpu_core/topdown-heavy-ops/      #      3.0% Heavy Operations   #  31.9% Light Operations
     2,935,493,211      cpu_core/topdown-br-mispredict/  #     14.7% Branch Mispredict  #   0.7% Machine Clears
     2,838,977,426      cpu_core/topdown-fetch-lat/      #     14.2% Fetch Latency      #  15.0% Fetch Bandwidth
     2,956,225,784      cpu_core/topdown-mem-bound/      #     14.8% Memory Bound       #   5.7% Core Bound

       0.833991099 seconds time elapsed

       1.010541000 seconds user
       0.958761000 seconds sys
```

Versus this PR:

```
 Performance counter stats for 'python -c import pandas as pd; import numpy as np; N = 1_000_000; ser = pd.Series(np.arange(N)); vals = np.arange(N); ser.isin(vals)':

          1,954.42 msec task-clock                       #    2.370 CPUs utilized
               215      context-switches                 #  110.007 /sec
                 2      cpu-migrations                   #    1.023 /sec
            26,302      page-faults                      #   13.458 K/sec
     3,927,968,672      cpu_core/cycles/                 #    2.010 G/sec
     3,786,117,128      cpu_atom/cycles/                 #    1.937 G/sec    (52.56%)
     7,217,333,673      cpu_core/instructions/           #    3.693 G/sec
     8,849,613,601      cpu_atom/instructions/           #    4.528 G/sec    (52.56%)
     1,378,602,285      cpu_core/branches/               #  705.376 M/sec
     1,443,771,308      cpu_atom/branches/               #  738.720 M/sec    (52.56%)
        23,818,420      cpu_core/branch-misses/          #   12.187 M/sec
           299,996      cpu_atom/branch-misses/          #  153.496 K/sec    (52.56%)
    19,880,779,164      cpu_core/slots/                  #   10.172 G/sec
     6,961,062,372      cpu_core/topdown-retiring/       #     35.0% Retiring
     3,064,425,117      cpu_core/topdown-bad-spec/       #     15.4% Bad Speculation
     5,844,723,817      cpu_core/topdown-fe-bound/       #     29.4% Frontend Bound
     4,026,716,906      cpu_core/topdown-be-bound/       #     20.2% Backend Bound
       600,833,950      cpu_core/topdown-heavy-ops/      #      3.0% Heavy Operations   #  32.0% Light Operations
     2,926,332,561      cpu_core/topdown-br-mispredict/  #     14.7% Branch Mispredict  #   0.7% Machine Clears
     2,818,938,163      cpu_core/topdown-fetch-lat/      #     14.2% Fetch Latency      #  15.2% Fetch Bandwidth
     2,943,103,050      cpu_core/topdown-mem-bound/      #     14.8% Memory Bound       #   5.4% Core Bound

       0.824527840 seconds time elapsed

       1.034003000 seconds user
       0.924299000 seconds sys
```

So I'm not sure this really makes a big difference for branch prediction. Will have to research more.
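For reference, the workload inside the `perf stat` invocation above can be reproduced directly in Python. This is a sketch: the explicit `time.perf_counter` wrapping is my addition for convenience and is not part of the original command.

```python
import time

import numpy as np
import pandas as pd

# Same workload as the benchmarked one-liner.
N = 1_000_000
ser = pd.Series(np.arange(N))
vals = np.arange(N)

start = time.perf_counter()
result = ser.isin(vals)
elapsed = time.perf_counter() - start

# Every element of ser is present in vals, so the mask is all True.
print(f"isin over {N:,} rows took {elapsed:.3f}s; all matched: {result.all()}")
```

Wall-clock numbers from a wrapper like this are noisier than `perf stat`'s counters, but it is a quick way to confirm the workload behaves the same on both branches.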
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
Closing for now but would be nice to pick up again in the future |
Expanded this to other primitives aside from int64 / uint64. Across the board this seems to help, although there are some regressions that need more investigation.
Haven't done the PyObject case yet, as the naming conventions there don't follow the same conventions as described here. Might need to tackle that in a follow-up.
OK, got everything set up. Some of the prior regressions were due to improper macro use / declarations. Surprised those didn't throw compiler errors... but there are many layers of indirection between tempita and khash. Something to investigate another day. Here are the results for a full run of the isin benchmarks - looks like this does help with scalability, as larger datasets show a 20-50% improvement. @mroeschke @jbrockmendel @realead for review
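A rough way to spot-check the scalability claim outside of the asv suite is to time `isin` across increasing sizes. This is my sketch, not the actual benchmark harness; the sizes and the single warm-up call are arbitrary choices.

```python
import time

import numpy as np
import pandas as pd

for n in (10_000, 100_000, 1_000_000):
    ser = pd.Series(np.arange(n))
    vals = np.arange(n)
    ser.isin(vals)  # warm-up call so one-time setup cost is excluded
    start = time.perf_counter()
    mask = ser.isin(vals)
    print(f"N={n:>9,}: {time.perf_counter() - start:.4f}s, all matched: {mask.all()}")
```

Running this on main and on the PR branch should make the claimed 20-50% gap on the larger sizes visible, although single-shot timings like these are noisy compared to asv's repeated runs.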
I don't really have much background to comment here, but it's nice that tests are passing.
This is a POC towards what @realead described in #39799
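khash itself lives in C, but the map-versus-set overhead the POC targets shows up in pure Python too: a dict must reserve a value slot per entry that a set does not. This is an illustrative analogy only; CPython container sizes are implementation details, and the numbers below say nothing about khash directly.

```python
import sys

n = 100_000
keys = range(n)

as_map = {k: 1 for k in keys}  # key -> dummy value, analogous to a khash map
as_set = set(keys)             # keys only, analogous to a khash set

# The dict carries an extra value slot per entry, so it is strictly larger.
print(sys.getsizeof(as_map) > sys.getsizeof(as_set))
```

For `isin`, the values stored in the map are never needed - only membership is queried - so dropping them is pure savings.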
The IsIn benchmarks overall seemed a bit unreliable, but I could consistently get results for `algos.isin.IsinWithArangeSorted`. The performance improvement on the largest dataset might be in line with @realead's expectation that:

> For big datasets, the running time of the above algorithms is dominated by cache-misses. Thus having twice as many cache-misses, because also values are touched, could mean a factor 2 slowdown.