use multithreading in basic operations #2647

Merged
bkamins merged 15 commits into main from threaded_elementary on Mar 19, 2021

Conversation

bkamins
Member

@bkamins bkamins commented Mar 8, 2021

With this PR, operations that subset a data frame will use multithreading.
This should be non-breaking.

I am not sure whether it is worth using threading in operations on short columns (or how costly it is when the Julia session has only one thread). I will ask for advice on Slack.

@bkamins bkamins added performance non-breaking The proposed change is not breaking labels Mar 8, 2021
@bkamins bkamins added this to the 1.0 milestone Mar 8, 2021
@quinnj
Member

quinnj commented Mar 8, 2021

It's my understanding that the Threads.@threads pattern is... discouraged in favor of:

@sync for i = 1:Threads.nthreads()
    Threads.@spawn expr
end

Spawning individual tasks lets nested multithreaded code cooperate, whereas Threads.@threads called in a nested context will only ever use one thread.
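As a rough illustration of the task-per-unit pattern described above, here is a minimal sketch that copies each column of a collection in its own task. The helper name `copy_columns_spawn` is hypothetical and is not taken from the PR; it only assumes `Threads.@spawn` and `@sync` from Base (Julia 1.3 or later).

```julia
using Base.Threads: @spawn

# Hypothetical helper (not part of DataFrames.jl): copy every column in its own task.
function copy_columns_spawn(cols::Vector{<:AbstractVector})
    out = Vector{AbstractVector}(undef, length(cols))
    # @sync waits for all tasks spawned inside the loop before continuing.
    @sync for i in eachindex(cols)
        # Each task writes to a distinct slot of `out`, so no locking is needed.
        @spawn out[i] = copy(cols[i])
    end
    return out
end

cols = AbstractVector[collect(1:10), string.(1:10)]
copied = copy_columns_spawn(cols)
```

Because the work is expressed as tasks, the scheduler is free to interleave them with tasks spawned by callers or callees, which is the cooperative behavior the comment refers to.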

@bkamins
Member Author

bkamins commented Mar 8, 2021

@quinnj - thank you for commenting. I just pushed this in parallel, after discussing the PR with @nalimilan.

@bkamins
Member Author

bkamins commented Mar 8, 2021

Some benchmarks:

Machine 1 (faster, less memory)

julia> df = DataFrame(x=1:10^6-1, y=1:10^6-1);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  20
  --------------
  minimum time:     1.236 ms (0.00% GC)
  median time:      1.279 ms (0.00% GC)
  mean time:        1.327 ms (3.13% GC)
  maximum time:     1.774 ms (15.46% GC)
  --------------
  samples:          3766
  evals/sample:     1

julia> df = DataFrame(x=1:10^6, y=1:10^6);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  44
  --------------
  minimum time:     1.044 ms (0.00% GC)
  median time:      1.136 ms (0.00% GC)
  mean time:        1.686 ms (18.08% GC)
  maximum time:     71.967 ms (94.49% GC)
  --------------
  samples:          2977
  evals/sample:     1

Machine 2 (more memory but slower)

julia> df = DataFrame(x=1:10^6-1, y=1:10^6-1);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  20
  --------------
  minimum time:     1.178 ms (0.00% GC)
  median time:      1.392 ms (0.00% GC)
  mean time:        1.527 ms (7.76% GC)
  maximum time:     93.650 ms (98.05% GC)
  --------------
  samples:          3273
  evals/sample:     1

julia> df = DataFrame(x=1:10^6, y=1:10^6);

julia> @benchmark copy($df)
BenchmarkTools.Trial: 
  memory estimate:  15.26 MiB
  allocs estimate:  53
  --------------
  minimum time:     632.105 μs (0.00% GC)
  median time:      1.446 ms (0.00% GC)
  mean time:        3.442 ms (6.28% GC)
  maximum time:     70.976 ms (28.92% GC)
  --------------
  samples:          1453
  evals/sample:     1

So the minimum time is OK but, probably as expected, GC hurts the mean performance of the threaded code.
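One way to look at the gap between minimum and mean time is to record the GC share of each run directly. The sketch below is only an illustration: `gc_profile` is a hypothetical helper, and it assumes a Julia version (1.5 or later) in which `@timed` returns a named tuple with `time` and `gctime` fields. It does not reproduce the BenchmarkTools numbers above.

```julia
using Statistics: mean

# Hypothetical helper: run `f()` `n` times and summarize run time and GC share.
function gc_profile(f, n::Integer)
    f()  # warm-up call so that compilation is not measured
    stats = [@timed(f()) for _ in 1:n]
    times = [s.time for s in stats]      # elapsed seconds per run
    gctimes = [s.gctime for s in stats]  # seconds spent in GC per run
    return (min_time = minimum(times),
            mean_time = mean(times),
            gc_share = sum(gctimes) / sum(times))
end

x = collect(1:10^6)
res = gc_profile(() -> copy(x), 20)
```

A mean well above the minimum together with a large `gc_share` would point to garbage collection, rather than the copy itself, as the source of the slowdown.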

bkamins and others added 2 commits March 8, 2021 18:13
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Member

@quinnj quinnj left a comment

Looks like NEWS needs a conflict resolution

@bkamins
Member Author

bkamins commented Mar 17, 2021

Here are the current benchmarks (code generating them is in /benchmarks/constructor_and_indexing):

                                  |  DataFrames.jl 0.22.5                                   |  PR 1 thread                                            |  PR 2 threads                                           |  PR 4 threads
 Row │ cols   type     op         |  10           999999       1000000      10000000        |  10           999999       1000000      10000000        |  10           999999       1000000      10000000        |  10           999999       1000000      10000000
     │ Int64  String   String     |  Float64?     Float64?     Float64?     Float64?        |  Float64?     Float64?     Float64?     Float64?        |  Float64?     Float64?     Float64?     Float64?        |  Float64?     Float64?     Float64?     Float64?
─────┼──────────────────────────  |  ─────────────────────────────────────────────────────  |  ─────────────────────────────────────────────────────  |  ─────────────────────────────────────────────────────  |  ─────────────────────────────────────────────────────
   1 │     1  integer  copy       |  0.00049914   0.434676     0.395179      36.8135        |  0.000294115  0.399052     0.390589      36.5039        |  0.000345766  0.426377     0.407724      36.6562        |  0.000299443  0.431602     0.430272      36.6989
   2 │     1  string   copy       |  0.000495287  0.443214     0.410268      38.2015        |  0.000335267  0.481872     0.389656      37.4125        |  0.000364807  0.416603     0.397744      37.6125        |  0.000304614  0.434871     0.414354      37.5988
   3 │     1  pooled   copy       |  0.000542687  0.987327     0.977238      23.5596        |  0.000389587  0.996564     0.986212       9.89233       |  0.000386684  0.989482     1.00886        9.75336       |  0.00039128   0.972749     0.985939       9.59844
   4 │     1  integer  :          |  0.000509426  0.432285     0.394343      36.8382        |  0.000552656  0.397713     0.397515      36.6271        |  0.000538832  0.437621     0.394144      36.7785        |  0.000534271  0.421483     0.400633      36.7182
   5 │     1  string   :          |  0.000505482  0.444545     0.411501      38.0902        |  0.00055344   0.484911     0.395183      37.3788        |  0.000550658  0.422138     0.402602      37.6229        |  0.000542283  0.44978      0.422908      37.5575
   6 │     1  pooled   :          |  0.000538759  0.976056     1.0065        23.5789        |  0.000575596  0.998331     0.980433      23.9711        |  0.000582745  0.99482      0.998427       9.75982       |  0.000577386  0.981078     0.973107       9.59767
   7 │     1  integer  1:end-5    |  0.000401286  0.436428     0.367318      36.494         |  0.000425735  0.391743     0.366959      35.6719        |  0.00042145   0.421541     0.362561      35.6557        |  0.000428135  0.423249     0.366974      35.8491
   8 │     1  string   1:end-5    |  0.000390529  1.08254      1.18421       40.3608        |  0.000438739  1.07858      1.06941       39.6897        |  0.0004352    1.06275      0.994047      39.7655        |  0.000438126  1.08354      1.05836       39.7997
   9 │     1  pooled   1:end-5    |  0.00041404   1.01687      1.0972        23.9615        |  0.000454487  0.996458     0.937732      23.7671        |  0.000452657  1.00102      0.941015       9.84778       |  0.000454533  0.921068     0.925264       9.93004
  10 │     1  integer  1:5        |  0.000376495  0.00037991   0.000394724    0.000361663   |  0.000408975  0.000408671  0.000398946    0.000391752   |  0.000401337  0.000410713  0.000408605    0.000384876   |  0.000404695  0.000401329  0.000394566    0.000388118
  11 │     1  string   1:5        |  0.000378382  0.000385529  0.000386164    0.000376354   |  0.000418884  0.00041121   0.000412785    0.000403215   |  0.00041007   0.000423749  0.000422261    0.00040251    |  0.000426365  0.000412881  0.000417495    0.00040021
  12 │     1  pooled   1:5        |  0.000395104  0.00041703   0.000456031    0.00038907    |  0.000494031  0.000493621  0.000498262    0.000473349   |  0.000451525  0.000454116  0.000543836    0.000441071   |  0.000443899  0.000451302  0.00044793     0.000430975
  13 │     2  integer  copy       |  0.000679323  1.19632      1.25001       78.7207        |  0.000494301  1.13333      1.17807       80.0601        |  0.000489908  1.14812      0.99602       45.0066        |  0.000496378  1.19035      0.968695      45.6165
  14 │     2  string   copy       |  0.000674352  1.19876      1.25141      101.261         |  0.000494077  1.154        1.14309       99.0222        |  0.000494658  1.1594       0.996508      44.802         |  0.000488858  1.15069      1.00949       43.6705
  15 │     2  pooled   copy       |  0.000746924  2.23861      2.33095       48.7821        |  0.000545032  2.18363      2.18021       47.8164        |  0.000547377  2.17354      1.56341       13.56          |  0.00054634   2.16493      1.55624       13.4453
  16 │     2  integer  :          |  0.000693057  1.15253      1.26886       73.9197        |  0.000720606  1.12544      1.14131       72.177         |  0.000718427  1.12235      1.00497       37.6585        |  0.000730937  1.1944       1.00433       37.72
  17 │     2  string   :          |  0.000658335  1.20573      1.22101      100.088         |  0.000729825  1.15251      1.15922       99.2691        |  0.000720953  1.18998      1.00154       44.7358        |  0.000711929  1.17151      0.999476      42.3758
  18 │     2  pooled   :          |  0.000759523  2.22298      2.28386       49.0871        |  0.00080247   2.18727      2.19562       48.46          |  0.00079107   2.16733      1.57781       13.5757        |  0.000799646  2.17522      1.58026       15.0985
  19 │     2  integer  1:end-5    |  0.000564213  1.19778      1.17763       70.3685        |  0.000601478  1.13102      1.15087       71.041         |  0.000603538  1.14052      1.1889        73.219         |  0.000617028  1.11685      1.16078       72.2842
  20 │     2  string   1:end-5    |  0.000554858  2.23064      2.24962      108.902         |  0.00060979   2.07214      2.05828      107.68          |  0.000611665  2.00157      2.09059       55.3175        |  0.000611588  2.13818      2.17458       55.5889
  21 │     2  pooled   1:end-5    |  0.000611194  2.12524      2.18084       48.0822        |  0.000659343  2.06895      1.94141       47.9947        |  0.000683734  2.06182      1.95271       13.6609        |  0.000699373  2.05002      2.11956       13.8211
  22 │     2  integer  1:5        |  0.000517598  0.000532129  0.000527154    0.000496299   |  0.000570904  0.000585482  0.000559219    0.000548856   |  0.000573995  0.000580995  0.000579087    0.000561497   |  0.000585151  0.000606278  0.000581359    0.000552296
  23 │     2  string   1:5        |  0.000542005  0.000548079  0.000549989    0.00052801    |  0.000599615  0.000610193  0.000595383    0.000559726   |  0.000600419  0.000618287  0.000619211    0.000572221   |  0.000595847  0.000592152  0.000620763    0.000567183
  24 │     2  pooled   1:5        |  0.000562869  0.000594667  0.000604243    0.000570366   |  0.000733021  0.000744812  0.0007409      0.000725439   |  0.000656928  0.0006644    0.000680158    0.000639155   |  0.000667877  0.000654401  0.000651361    0.000638283
  25 │     3  integer  copy       |  0.000859329  1.97082      1.95491      114.567         |  0.000650855  1.888        1.8943       112.674         |  0.00065903   1.89387      1.82394       81.0544        |  0.000665323  1.89764      1.72183       44.9112
  26 │     3  string   copy       |  0.000825655  1.99185      2.0429       193.773         |  0.000625632  1.90546      1.93181      189.987         |  0.000636364  1.92841      1.81345      131.977         |  0.000630434  1.90891      1.72212       46.0159
  27 │     3  pooled   copy       |  0.000956091  3.55017      3.72087       70.9617        |  0.0007245    3.39532      3.39599       69.1517        |  0.000727739  3.42417      2.79859       26.7378        |  0.000739414  3.40451      2.04515       27.0857
  28 │     3  integer  :          |  0.000820425  1.92865      2.04525      106.17          |  0.00092219   1.88534      1.89839      103.764         |  0.000919854  1.90638      1.80056       71.3591        |  0.000923781  1.89959      1.72614       38.6843
  29 │     3  string   :          |  0.00083474   1.94814      2.04826      194.834         |  0.000895942  1.89318      1.93004      190.013         |  0.000899512  1.92614      1.84079      131.629         |  0.000879516  1.89305      1.72763       47.5389
  30 │     3  pooled   :          |  0.00092332   3.39207      3.4687        70.8639        |  0.000977154  3.42869      3.38522       68.2833        |  0.000967688  3.41824      2.70307       27.0025        |  0.000964636  3.42019      2.06374       28.5991
  31 │     3  integer  1:end-5    |  0.000722382  1.96921      2.0154       103.007         |  0.000791356  1.91271      1.9321       100.212         |  0.000810111  1.90702      2.08992       69.529         |  0.000804184  1.87525      2.0189        38.6179
  32 │     3  string   1:end-5    |  0.000732045  3.11024      3.41474      208.695         |  0.00078456   3.08402      3.16887      204.848         |  0.00078829   3.08848      3.6555       150.572         |  0.000760458  3.35707      3.26024       70.3561
  33 │     3  pooled   1:end-5    |  0.000804563  3.11316      3.1891        70.382         |  0.000875879  3.11359      3.05418       67.0261        |  0.000862971  3.18369      3.13776       27.476         |  0.000863725  3.19809      3.0784        27.4594
  34 │     3  integer  1:5        |  0.000680263  0.000671059  0.000689962    0.00062154    |  0.000744617  0.000767285  0.00074281     0.000692272   |  0.000766342  0.000750638  0.00075643     0.00069304    |  0.000763721  0.000768847  0.00075368     0.000685779
  35 │     3  string   1:5        |  0.000697318  0.000725678  0.000714741    0.000651265   |  0.000774161  0.000754008  0.000765798    0.0006886     |  0.000779261  0.000805847  0.000811194    0.000723278   |  0.000765079  0.000793202  0.000764787    0.000718884
  36 │     3  pooled   1:5        |  0.000764025  0.000779728  0.000795222    0.00074512    |  0.0010056    0.00092625   0.00095872     0.000888093   |  0.000868118  0.000834279  0.000857422    0.000799948   |  0.000875613  0.0008743    0.000854768    0.000812107
  37 │     4  integer  copy       |  0.0009606    2.53692      2.74477      327.169         |  0.000801705  2.59861      2.64846      327.722         |  0.000812529  2.54433      2.21674      115.667         |  0.000780835  2.6188       2.38634       49.0887
  38 │     4  string   copy       |  0.0009968    2.64367      2.79322      505.82          |  0.000775085  2.58778      2.69138      505.33          |  0.00077847   2.5864       2.2484       137.262         |  0.000741447  2.68157      2.36143       49.2196
  39 │     4  pooled   copy       |  0.0010829    4.54012      4.67678       89.1155        |  0.000881453  4.59306      4.57244       86.313         |  0.000893019  4.58468      3.04962       34.8772        |  0.000891813  4.66487      2.48297       31.5785
  40 │     4  integer  :          |  0.0010009    2.59578      2.73846      323.886         |  0.0010807    2.59233      2.68274      326.931         |  0.0010837    2.58282      2.22098      117.776         |  0.0010233    2.60786      2.38909       49.3215
  41 │     4  string   :          |  0.0009899    2.6299       2.78348      506.326         |  0.0010513    2.59005      2.68936      505.215         |  0.0010674    2.57494      2.25078      246.476         |  0.0010094    2.67444      2.34516      129.385
  42 │     4  pooled   :          |  0.0010808    4.53091      4.65456       89.2504        |  0.0011966    4.53459      4.51011       86.1889        |  0.0011763    4.55097      3.05904       34.8609        |  0.0011569    4.66105      2.47399       31.6768
  43 │     4  integer  1:end-5    |  0.000875304  2.6088       2.83297      324.25          |  0.000971176  2.64287      2.75142      328.511         |  0.000984667  2.64139      2.82075      115.837         |  0.000949769  2.71576      2.76518       51.741
  44 │     4  string   1:end-5    |  0.000872286  4.11794      4.47723      521.924         |  0.0009685    4.12215      4.40066      524.939         |  0.000972389  4.23244      4.48262      267.664         |  0.000961294  4.18993      4.50478       82.0119
  45 │     4  pooled   1:end-5    |  0.0009635    4.25499      4.25513       87.2929        |  0.0010739    4.18169      4.14821       85.7372        |  0.001109     4.27362      4.21689       35.1345        |  0.0010745    4.31149      4.16611       32.178
  46 │     4  integer  1:5        |  0.000842098  0.000854595  0.000839681    0.000726559   |  0.000945074  0.000973014  0.000932309    0.000805115   |  0.000946571  0.00100097   0.000942927    0.000807782   |  0.000918314  0.000966369  0.000933725    0.000814506
  47 │     4  string   1:5        |  0.000876455  0.00155402   0.0009366      0.000766748   |  0.000971063  0.000944917  0.000965389    0.000860667   |  0.000983364  0.0010076    0.0009684      0.000849266   |  0.00095644   0.0009879    0.000969833    0.00086492
  48 │     4  pooled   1:5        |  0.000967125  0.0010959    0.000921421    0.000847533   |  0.0012515    0.0012009    0.0012163      0.0010886     |  0.001066     0.0010619    0.0010358      0.000965786   |  0.0010562    0.0010768    0.0010617      0.000964091

@bkamins
Member Author

bkamins commented Mar 17, 2021

This should be done. Could you please confirm that you have looked at the PR and that everything looks OK before I merge (these things are trickier than they seem)? Thank you!

@nalimilan
Member

Looks good, but I wonder whether it would be worth introducing a tcollect helper function similar to tforeach at #2661. That would make sense if we expect to use the same pattern elsewhere.

All branches could be replaced with something like this, and the VERSION and nthreads() check could be defined within tcollect:

new_columns = tcollect(AbstractVector, (_columns(df)[i][selected_rows] for i in selected_columns),
                       threaded=length(selected_rows) >= 1_000_000)
return DataFrame(new_columns, idx, copycols=false)

What do you think?
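To make the suggestion concrete, here is one possible shape for such a helper. Only the name `tcollect` and the `threaded` keyword come from the comment above. The body is an assumption and not what DataFrames.jl implements: it relies on the `f` and `iter` fields of `Base.Generator` to defer evaluation of each element, and it leaves out the `VERSION` check.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical sketch of `tcollect`: evaluate each generator element in its own
# task when `threaded` is true and more than one thread is available.
function tcollect(::Type{T}, gen::Base.Generator; threaded::Bool=true) where {T}
    items = collect(gen.iter)  # materialize the inputs; elements are not computed yet
    out = Vector{T}(undef, length(items))
    if threaded && nthreads() > 1
        @sync for (k, item) in enumerate(items)
            @spawn out[k] = gen.f(item)  # each task fills a distinct slot
        end
    else
        for (k, item) in enumerate(items)
            out[k] = gen.f(item)  # serial fallback for small inputs or one thread
        end
    end
    return out
end

cols = [collect(1:10), collect(11:20)]
rows = 1:5
new_columns = tcollect(AbstractVector, (cols[i][rows] for i in 1:2),
                       threaded=length(rows) >= 3)
```

The threshold of 3 rows is chosen only so the toy example exercises the threaded branch when several threads are available; the comment above proposes `1_000_000`.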

@bkamins
Member Author

bkamins commented Mar 18, 2021

OK - I will add tcollect (unless I run into issues 😄)

@bkamins
Member Author

bkamins commented Mar 18, 2021

I could not find a better solution than what is currently implemented without a regression in performance. I will merge this PR tomorrow if no more comments are raised.

@bkamins bkamins merged commit c0c8cd3 into main Mar 19, 2021
@bkamins bkamins deleted the threaded_elementary branch March 19, 2021 09:19
@bkamins
Member Author

bkamins commented Mar 19, 2021

Thank you! Of course, code organization improvements are welcome. If there is a proposal that does not regress performance, please open a PR.

Labels
non-breaking The proposed change is not breaking performance
3 participants