use multithreading in basic operations #2647

bkamins · 2021-03-08T14:00:40Z

With this PR, operations that subset a data frame will use multithreading.
This should be non-breaking.

I am not sure whether it is worth using threading in operations on short columns (or how costly it is when the Julia session has only one thread). I will ask for advice on Slack.

quinnj · 2021-03-08T15:35:14Z

It's my understanding that the Threads.@threads pattern is discouraged in favor of:

@sync for i = 1:Threads.nthreads()
    Threads.@spawn expr
end

Spawning individual tasks lets nested multithreaded code cooperate, whereas Threads.@threads called in a nested context will only ever use one thread.
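As a rough illustration of the task-per-chunk pattern described above, here is a minimal sketch using only Base. The function name `chunked_sum` and the choice of one chunk per thread are assumptions made for this example. They are not taken from the PR, and it assumes Julia 1.3 or later, where `Threads.@spawn` is available.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper (not part of DataFrames.jl): sum a vector by
# spawning one task per chunk and waiting for all of them with @sync.
function chunked_sum(v::AbstractVector{Float64})
    isempty(v) && return 0.0
    # Roughly one chunk per available thread.
    chunk_len = cld(length(v), nthreads())
    chunks = collect(Iterators.partition(v, chunk_len))
    partials = Vector{Float64}(undef, length(chunks))
    # @sync blocks until every task spawned inside the loop has finished.
    @sync for (i, chunk) in enumerate(chunks)
        @spawn partials[i] = sum(chunk)
    end
    return sum(partials)
end

total = chunked_sum(collect(1.0:1000.0))
println(total)
```

Because each chunk is an independent task, the scheduler is free to interleave these tasks with tasks spawned by callers or callees.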

bkamins · 2021-03-08T15:39:17Z

@quinnj - thank you for commenting - just pushed this in parallel after discussing the PR with @nalimilan.

bkamins · 2021-03-08T16:16:57Z

Some benchmarks:

Machine 1 (faster, less memory)

julia> df = DataFrame(x=1:10^6-1, y=1:10^6-1);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  20
  --------------
  minimum time:     1.236 ms (0.00% GC)
  median time:      1.279 ms (0.00% GC)
  mean time:        1.327 ms (3.13% GC)
  maximum time:     1.774 ms (15.46% GC)
  --------------
  samples:          3766
  evals/sample:     1

julia> df = DataFrame(x=1:10^6, y=1:10^6);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  44
  --------------
  minimum time:     1.044 ms (0.00% GC)
  median time:      1.136 ms (0.00% GC)
  mean time:        1.686 ms (18.08% GC)
  maximum time:     71.967 ms (94.49% GC)
  --------------
  samples:          2977
  evals/sample:     1

Machine 2 (more memory but slower)

julia> df = DataFrame(x=1:10^6-1, y=1:10^6-1);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  20
  --------------
  minimum time:     1.178 ms (0.00% GC)
  median time:      1.392 ms (0.00% GC)
  mean time:        1.527 ms (7.76% GC)
  maximum time:     93.650 ms (98.05% GC)
  --------------
  samples:          3273
  evals/sample:     1

julia> df = DataFrame(x=1:10^6, y=1:10^6);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  53
  --------------
  minimum time:     632.105 μs (0.00% GC)
  median time:      1.446 ms (0.00% GC)
  mean time:        3.442 ms (6.28% GC)
  maximum time:     70.976 ms (28.92% GC)
  --------------
  samples:          1453
  evals/sample:     1

So the minimum time is OK, but - probably as expected - GC hurts the mean performance of the threaded code.
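To see how garbage collection can separate the minimum from the mean, one can record per-run timings by hand. The sketch below uses only Base and the Statistics standard library. `sample_timings` is a hypothetical helper written for this illustration, not the benchmark code used above, and it assumes Julia 1.5 or later, where `@timed` returns a named tuple with `time` and `gctime` fields.

```julia
using Statistics: mean

# Hypothetical helper: run `f()` `n` times and record, for each run,
# the elapsed time and the time spent in GC (both in seconds).
function sample_timings(f, n::Integer)
    times = Float64[]
    gctimes = Float64[]
    f()  # warm-up run so that compilation is not measured
    for _ in 1:n
        stats = @timed f()
        push!(times, stats.time)
        push!(gctimes, stats.gctime)
    end
    return times, gctimes
end

x = collect(1:10^6)
times, gctimes = sample_timings(() -> copy(x), 200)

println("minimum time:  ", minimum(times), " s")
println("mean time:     ", mean(times), " s")
println("total GC time: ", sum(gctimes), " s")
```

Runs that trigger a collection show up only in the mean and the maximum, which is why the minimum alone can look unaffected.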

quinnj

Looks like NEWS needs a conflict resolution

bkamins · 2021-03-17T07:14:27Z

Here are the current benchmarks (code generating them is in /benchmarks/constructor_and_indexing):

                                  |  DataFrames.jl 0.22.5                                   |  PR 1 thread                                            |  PR 2 threads                                           |  PR 4 threads
 Row │ cols   type     op         |  10           999999       1000000      10000000        |  10           999999       1000000      10000000        |  10           999999       1000000      10000000        |  10           999999       1000000      10000000
     │ Int64  String   String     |  Float64?     Float64?     Float64?     Float64?        |  Float64?     Float64?     Float64?     Float64?        |  Float64?     Float64?     Float64?     Float64?        |  Float64?     Float64?     Float64?     Float64?
─────┼──────────────────────────  |  ─────────────────────────────────────────────────────  |  ─────────────────────────────────────────────────────  |  ─────────────────────────────────────────────────────  |  ─────────────────────────────────────────────────────
   1 │     1  integer  copy       |  0.00049914   0.434676     0.395179      36.8135        |  0.000294115  0.399052     0.390589      36.5039        |  0.000345766  0.426377     0.407724      36.6562        |  0.000299443  0.431602     0.430272      36.6989
   2 │     1  string   copy       |  0.000495287  0.443214     0.410268      38.2015        |  0.000335267  0.481872     0.389656      37.4125        |  0.000364807  0.416603     0.397744      37.6125        |  0.000304614  0.434871     0.414354      37.5988
   3 │     1  pooled   copy       |  0.000542687  0.987327     0.977238      23.5596        |  0.000389587  0.996564     0.986212       9.89233       |  0.000386684  0.989482     1.00886        9.75336       |  0.00039128   0.972749     0.985939       9.59844
   4 │     1  integer  :          |  0.000509426  0.432285     0.394343      36.8382        |  0.000552656  0.397713     0.397515      36.6271        |  0.000538832  0.437621     0.394144      36.7785        |  0.000534271  0.421483     0.400633      36.7182
   5 │     1  string   :          |  0.000505482  0.444545     0.411501      38.0902        |  0.00055344   0.484911     0.395183      37.3788        |  0.000550658  0.422138     0.402602      37.6229        |  0.000542283  0.44978      0.422908      37.5575
   6 │     1  pooled   :          |  0.000538759  0.976056     1.0065        23.5789        |  0.000575596  0.998331     0.980433      23.9711        |  0.000582745  0.99482      0.998427       9.75982       |  0.000577386  0.981078     0.973107       9.59767
   7 │     1  integer  1:end-5    |  0.000401286  0.436428     0.367318      36.494         |  0.000425735  0.391743     0.366959      35.6719        |  0.00042145   0.421541     0.362561      35.6557        |  0.000428135  0.423249     0.366974      35.8491
   8 │     1  string   1:end-5    |  0.000390529  1.08254      1.18421       40.3608        |  0.000438739  1.07858      1.06941       39.6897        |  0.0004352    1.06275      0.994047      39.7655        |  0.000438126  1.08354      1.05836       39.7997
   9 │     1  pooled   1:end-5    |  0.00041404   1.01687      1.0972        23.9615        |  0.000454487  0.996458     0.937732      23.7671        |  0.000452657  1.00102      0.941015       9.84778       |  0.000454533  0.921068     0.925264       9.93004
  10 │     1  integer  1:5        |  0.000376495  0.00037991   0.000394724    0.000361663   |  0.000408975  0.000408671  0.000398946    0.000391752   |  0.000401337  0.000410713  0.000408605    0.000384876   |  0.000404695  0.000401329  0.000394566    0.000388118
  11 │     1  string   1:5        |  0.000378382  0.000385529  0.000386164    0.000376354   |  0.000418884  0.00041121   0.000412785    0.000403215   |  0.00041007   0.000423749  0.000422261    0.00040251    |  0.000426365  0.000412881  0.000417495    0.00040021
  12 │     1  pooled   1:5        |  0.000395104  0.00041703   0.000456031    0.00038907    |  0.000494031  0.000493621  0.000498262    0.000473349   |  0.000451525  0.000454116  0.000543836    0.000441071   |  0.000443899  0.000451302  0.00044793     0.000430975
  13 │     2  integer  copy       |  0.000679323  1.19632      1.25001       78.7207        |  0.000494301  1.13333      1.17807       80.0601        |  0.000489908  1.14812      0.99602       45.0066        |  0.000496378  1.19035      0.968695      45.6165
  14 │     2  string   copy       |  0.000674352  1.19876      1.25141      101.261         |  0.000494077  1.154        1.14309       99.0222        |  0.000494658  1.1594       0.996508      44.802         |  0.000488858  1.15069      1.00949       43.6705
  15 │     2  pooled   copy       |  0.000746924  2.23861      2.33095       48.7821        |  0.000545032  2.18363      2.18021       47.8164        |  0.000547377  2.17354      1.56341       13.56          |  0.00054634   2.16493      1.55624       13.4453
  16 │     2  integer  :          |  0.000693057  1.15253      1.26886       73.9197        |  0.000720606  1.12544      1.14131       72.177         |  0.000718427  1.12235      1.00497       37.6585        |  0.000730937  1.1944       1.00433       37.72
  17 │     2  string   :          |  0.000658335  1.20573      1.22101      100.088         |  0.000729825  1.15251      1.15922       99.2691        |  0.000720953  1.18998      1.00154       44.7358        |  0.000711929  1.17151      0.999476      42.3758
  18 │     2  pooled   :          |  0.000759523  2.22298      2.28386       49.0871        |  0.00080247   2.18727      2.19562       48.46          |  0.00079107   2.16733      1.57781       13.5757        |  0.000799646  2.17522      1.58026       15.0985
  19 │     2  integer  1:end-5    |  0.000564213  1.19778      1.17763       70.3685        |  0.000601478  1.13102      1.15087       71.041         |  0.000603538  1.14052      1.1889        73.219         |  0.000617028  1.11685      1.16078       72.2842
  20 │     2  string   1:end-5    |  0.000554858  2.23064      2.24962      108.902         |  0.00060979   2.07214      2.05828      107.68          |  0.000611665  2.00157      2.09059       55.3175        |  0.000611588  2.13818      2.17458       55.5889
  21 │     2  pooled   1:end-5    |  0.000611194  2.12524      2.18084       48.0822        |  0.000659343  2.06895      1.94141       47.9947        |  0.000683734  2.06182      1.95271       13.6609        |  0.000699373  2.05002      2.11956       13.8211
  22 │     2  integer  1:5        |  0.000517598  0.000532129  0.000527154    0.000496299   |  0.000570904  0.000585482  0.000559219    0.000548856   |  0.000573995  0.000580995  0.000579087    0.000561497   |  0.000585151  0.000606278  0.000581359    0.000552296
  23 │     2  string   1:5        |  0.000542005  0.000548079  0.000549989    0.00052801    |  0.000599615  0.000610193  0.000595383    0.000559726   |  0.000600419  0.000618287  0.000619211    0.000572221   |  0.000595847  0.000592152  0.000620763    0.000567183
  24 │     2  pooled   1:5        |  0.000562869  0.000594667  0.000604243    0.000570366   |  0.000733021  0.000744812  0.0007409      0.000725439   |  0.000656928  0.0006644    0.000680158    0.000639155   |  0.000667877  0.000654401  0.000651361    0.000638283
  25 │     3  integer  copy       |  0.000859329  1.97082      1.95491      114.567         |  0.000650855  1.888        1.8943       112.674         |  0.00065903   1.89387      1.82394       81.0544        |  0.000665323  1.89764      1.72183       44.9112
  26 │     3  string   copy       |  0.000825655  1.99185      2.0429       193.773         |  0.000625632  1.90546      1.93181      189.987         |  0.000636364  1.92841      1.81345      131.977         |  0.000630434  1.90891      1.72212       46.0159
  27 │     3  pooled   copy       |  0.000956091  3.55017      3.72087       70.9617        |  0.0007245    3.39532      3.39599       69.1517        |  0.000727739  3.42417      2.79859       26.7378        |  0.000739414  3.40451      2.04515       27.0857
  28 │     3  integer  :          |  0.000820425  1.92865      2.04525      106.17          |  0.00092219   1.88534      1.89839      103.764         |  0.000919854  1.90638      1.80056       71.3591        |  0.000923781  1.89959      1.72614       38.6843
  29 │     3  string   :          |  0.00083474   1.94814      2.04826      194.834         |  0.000895942  1.89318      1.93004      190.013         |  0.000899512  1.92614      1.84079      131.629         |  0.000879516  1.89305      1.72763       47.5389
  30 │     3  pooled   :          |  0.00092332   3.39207      3.4687        70.8639        |  0.000977154  3.42869      3.38522       68.2833        |  0.000967688  3.41824      2.70307       27.0025        |  0.000964636  3.42019      2.06374       28.5991
  31 │     3  integer  1:end-5    |  0.000722382  1.96921      2.0154       103.007         |  0.000791356  1.91271      1.9321       100.212         |  0.000810111  1.90702      2.08992       69.529         |  0.000804184  1.87525      2.0189        38.6179
  32 │     3  string   1:end-5    |  0.000732045  3.11024      3.41474      208.695         |  0.00078456   3.08402      3.16887      204.848         |  0.00078829   3.08848      3.6555       150.572         |  0.000760458  3.35707      3.26024       70.3561
  33 │     3  pooled   1:end-5    |  0.000804563  3.11316      3.1891        70.382         |  0.000875879  3.11359      3.05418       67.0261        |  0.000862971  3.18369      3.13776       27.476         |  0.000863725  3.19809      3.0784        27.4594
  34 │     3  integer  1:5        |  0.000680263  0.000671059  0.000689962    0.00062154    |  0.000744617  0.000767285  0.00074281     0.000692272   |  0.000766342  0.000750638  0.00075643     0.00069304    |  0.000763721  0.000768847  0.00075368     0.000685779
  35 │     3  string   1:5        |  0.000697318  0.000725678  0.000714741    0.000651265   |  0.000774161  0.000754008  0.000765798    0.0006886     |  0.000779261  0.000805847  0.000811194    0.000723278   |  0.000765079  0.000793202  0.000764787    0.000718884
  36 │     3  pooled   1:5        |  0.000764025  0.000779728  0.000795222    0.00074512    |  0.0010056    0.00092625   0.00095872     0.000888093   |  0.000868118  0.000834279  0.000857422    0.000799948   |  0.000875613  0.0008743    0.000854768    0.000812107
  37 │     4  integer  copy       |  0.0009606    2.53692      2.74477      327.169         |  0.000801705  2.59861      2.64846      327.722         |  0.000812529  2.54433      2.21674      115.667         |  0.000780835  2.6188       2.38634       49.0887
  38 │     4  string   copy       |  0.0009968    2.64367      2.79322      505.82          |  0.000775085  2.58778      2.69138      505.33          |  0.00077847   2.5864       2.2484       137.262         |  0.000741447  2.68157      2.36143       49.2196
  39 │     4  pooled   copy       |  0.0010829    4.54012      4.67678       89.1155        |  0.000881453  4.59306      4.57244       86.313         |  0.000893019  4.58468      3.04962       34.8772        |  0.000891813  4.66487      2.48297       31.5785
  40 │     4  integer  :          |  0.0010009    2.59578      2.73846      323.886         |  0.0010807    2.59233      2.68274      326.931         |  0.0010837    2.58282      2.22098      117.776         |  0.0010233    2.60786      2.38909       49.3215
  41 │     4  string   :          |  0.0009899    2.6299       2.78348      506.326         |  0.0010513    2.59005      2.68936      505.215         |  0.0010674    2.57494      2.25078      246.476         |  0.0010094    2.67444      2.34516      129.385
  42 │     4  pooled   :          |  0.0010808    4.53091      4.65456       89.2504        |  0.0011966    4.53459      4.51011       86.1889        |  0.0011763    4.55097      3.05904       34.8609        |  0.0011569    4.66105      2.47399       31.6768
  43 │     4  integer  1:end-5    |  0.000875304  2.6088       2.83297      324.25          |  0.000971176  2.64287      2.75142      328.511         |  0.000984667  2.64139      2.82075      115.837         |  0.000949769  2.71576      2.76518       51.741
  44 │     4  string   1:end-5    |  0.000872286  4.11794      4.47723      521.924         |  0.0009685    4.12215      4.40066      524.939         |  0.000972389  4.23244      4.48262      267.664         |  0.000961294  4.18993      4.50478       82.0119
  45 │     4  pooled   1:end-5    |  0.0009635    4.25499      4.25513       87.2929        |  0.0010739    4.18169      4.14821       85.7372        |  0.001109     4.27362      4.21689       35.1345        |  0.0010745    4.31149      4.16611       32.178
  46 │     4  integer  1:5        |  0.000842098  0.000854595  0.000839681    0.000726559   |  0.000945074  0.000973014  0.000932309    0.000805115   |  0.000946571  0.00100097   0.000942927    0.000807782   |  0.000918314  0.000966369  0.000933725    0.000814506
  47 │     4  string   1:5        |  0.000876455  0.00155402   0.0009366      0.000766748   |  0.000971063  0.000944917  0.000965389    0.000860667   |  0.000983364  0.0010076    0.0009684      0.000849266   |  0.00095644   0.0009879    0.000969833    0.00086492
  48 │     4  pooled   1:5        |  0.000967125  0.0010959    0.000921421    0.000847533   |  0.0012515    0.0012009    0.0012163      0.0010886     |  0.001066     0.0010619    0.0010358      0.000965786   |  0.0010562    0.0010768    0.0010617      0.000964091

bkamins · 2021-03-17T17:50:49Z

This should be done. Could you please confirm in a comment that you have looked at the PR and that all looks OK before I merge (these things are trickier than they seem). Thank you!

nalimilan · 2021-03-18T17:34:02Z

Looks good, but I wonder whether it would be worth introducing a tcollect helper function similar to tforeach at #2661. That would make sense if we expect to use the same pattern elsewhere.

All branches could be replaced with something like this, and the VERSION and nthreads() check could be defined within tcollect:

new_columns = tcollect(AbstractVector, (_columns(df)[i][selected_rows] for i in selected_columns),
                       threaded=length(selected_rows) >= 1_000_000)
return DataFrame(new_columns, idx, copycols=false)

What do you think?
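For concreteness, a helper along these lines might look like the sketch below. This is only one guess at what `tcollect` could be: the signature mirrors the call in the snippet above, but the body, the reliance on the `f` and `iter` fields of `Base.Generator`, and the omission of a `VERSION` check (it assumes Julia 1.3 or later) are all assumptions and not code from DataFrames.jl.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical sketch of a `tcollect` helper (not the DataFrames.jl code).
# Each element of a simple generator is computed in its own task when
# `threaded` is true and more than one thread is available.
function tcollect(::Type{T}, gen::Base.Generator; threaded::Bool=true) where {T}
    if threaded && nthreads() > 1
        # `gen.f` is the generator body and `gen.iter` the values it maps over.
        tasks = map(x -> @spawn(gen.f(x)), collect(gen.iter))
        return T[fetch(t) for t in tasks]
    else
        # Fall back to an ordinary sequential collect.
        return collect(T, gen)
    end
end

# Usage with plain vectors standing in for data frame columns.
columns = [collect(1:10), collect(11:20)]
selected_rows = 2:4
new_columns = tcollect(AbstractVector,
                       (columns[i][selected_rows] for i in 1:2),
                       threaded=true)
sequential = tcollect(AbstractVector,
                      (columns[i][selected_rows] for i in 1:2),
                      threaded=false)
```

Spawning one task per column keeps the helper simple, but the per-task overhead is the kind of cost that would need benchmarking before replacing hand-written branches.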

bkamins · 2021-03-18T17:36:54Z

OK - I will add tcollect (unless I run into issues 😄)

bkamins · 2021-03-18T18:59:40Z

I could not find a better solution than what is currently implemented without a regression in performance. I will merge this PR tomorrow if no more comments are raised.

bkamins · 2021-03-19T09:20:59Z

Thank you! Of course, code organization improvements are welcome - if there is a proposal that does not regress performance, please open a PR.

use multithreading in basic operations

e5ad88e

bkamins added the performance and non-breaking labels Mar 8, 2021

bkamins added this to the 1.0 milestone Mar 8, 2021

switch to @spawn and add condition when threading is used

9e05929

bkamins added 2 commits March 8, 2021 16:56

fix typos

3d4cdbc

add @async

47263d9

add threading tests

dbad550

nalimilan reviewed Mar 8, 2021

bkamins and others added 2 commits March 8, 2021 18:13

Apply suggestions from code review

d1e134b

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

fixes after code review

3dd9c2e

bkamins commented Mar 8, 2021

Update src/dataframe/dataframe.jl

67a45b8

bkamins mentioned this pull request Mar 14, 2021

Add more threading support #2626

Closed

nalimilan approved these changes Mar 14, 2021

Update src/dataframe/dataframe.jl

c9ef020

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

quinnj approved these changes Mar 15, 2021

Merge branch 'main' into threaded_elementary

5ba5d50

bkamins commented Mar 16, 2021

Update NEWS.md

456c40c

bkamins commented Mar 16, 2021

bkamins added 2 commits March 16, 2021 17:54

add performance benchmarks

5d15fb2

improve performance of getindex and construction for small data frames

b11638d

nalimilan reviewed Mar 16, 2021

minor code cleanup

62e0a51

bkamins mentioned this pull request Mar 17, 2021

Performance issue on filter #2663

Closed

final cleanup

5646db9

bkamins merged commit c0c8cd3 into main Mar 19, 2021

bkamins deleted the threaded_elementary branch March 19, 2021 09:19
