gshift as gforce optimized shift #5205

ben-schwen · 2021-10-09T14:21:08Z

Internal optimization

Currently shift is only GForce optimized when its fill argument is not a call, to avoid cases where fill=x[1], fill=x[.N], etc. since those would need evaluation within their group.

General Remark

The gforce optimization of shift would be the first function to always return vectors of the same length as by. The first step in this direction was made in #5089 by allowing ghead to return multiple values per group. Before that every gforce function returned exactly 1 value per group.

The whole datatable.optimize optimization code branch has grown pretty fat and might need to be rewritten. This is also related to #1414 #5032 #3815 #735 #523.

MichaelChirico · 2021-10-09T20:26:37Z

src/gsumm.c

@@ -1162,3 +1167,83 @@ SEXP gprod(SEXP x, SEXP narmArg) {
  return(ans);
 }

+SEXP gshift(SEXP x, SEXP nArg, SEXP fillArg, SEXP typeArg) {


shouldn't we prefer to re-use the code in shift.c here (to the extent possible)? otherwise it'll be a pain to always update code in both places going forward.

It might require some refactoring/modularizing of the original shift.c.

I can try to recycle as much as possible but there are some different things to consider about both implementations.

Basic shift gshift

works on data.tables/vectors/lists operates on vectors

main workhouse is memmove used memory may not be blocked

no subset on i concurrent subsetting on irows

Maybe basic shift could be redirected to call gshift then? It would have to call gforce prep function first, perhaps in a dummy way to trick gshift into working. Then it would be common code and all existing tests of shift would cover the new gforce gshift too.
Perhaps not now. Happy to merge this PR as is, subject to integer64. A task could be created to redirect shift to gshift in future.

use match.call to deactivate gforce for calls where fill expr

codecov · 2021-10-10T22:48:55Z

Codecov Report

Merging #5205 (5f3a10a) into master (fa76197) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 5f3a10a differs from pull request most recent head 5eec8f2. Consider uploading reports for the commit 5eec8f2 to get more accurate results

@@           Coverage Diff           @@
##           master    #5205   +/-   ##
=======================================
  Coverage   99.50%   99.50%           
=======================================
  Files          77       77           
  Lines       14536    14590   +54     
=======================================
+ Hits        14464    14518   +54     
  Misses         72       72

Impacted Files	Coverage Δ
src/init.c	`100.00% <ø> (ø)`
R/data.table.R	`99.89% <100.00%> (+<0.01%)`	⬆️
src/gsumm.c	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa76197...5eec8f2. Read the comment docs.

ben-schwen · 2021-10-14T16:15:27Z

Benchmarks

library(data.table)
library(microbenchmark)
basic_shift = shift
bench = function(DT) print(microbenchmark(
  DT[, shift(x, 1, type="lag"), y],
  DT[, basic_shift(x, 1, type="lag"), y],
  DT[, c(NA, head(x,-1)), y],
  times = 10L, unit = "s"), signif=4)

N = 1e6
DT = data.table(x = sample(N), y = sample(1e2,N,TRUE))
bench(DT)
# Unit: seconds
#                                      expr     min     lq    mean  median      uq     max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.01342 0.0146 0.02848 0.02031 0.05030 0.05487    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 0.01129 0.0116 0.01416 0.01383 0.01560 0.01855    10
#               DT[, c(NA, head(x, -1)), y] 0.01205 0.0169 0.01766 0.01830 0.01901 0.02111    10

DT = data.table(x = sample(N), y = sample(1e3,N,TRUE))
# Unit: seconds
#                                      expr     min      lq    mean  median      uq     max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.01092 0.01152 0.01242 0.01207 0.01339 0.01448    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 0.02676 0.02710 0.02873 0.02796 0.03022 0.03277    10
#               DT[, c(NA, head(x, -1)), y] 0.02064 0.02085 0.02186 0.02115 0.02273 0.02466    10

DT = data.table(x = sample(N), y = sample(1e4,N,TRUE))
bench(DT)
# Unit: seconds
#                                      expr     min      lq   mean  median      uq     max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.01328 0.01653 0.0183 0.01722 0.02066 0.02501    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 0.22480 0.23000 0.2617 0.23990 0.28300 0.37830    10
#               DT[, c(NA, head(x, -1)), y] 0.10440 0.11080 0.1234 0.11710 0.13030 0.15360    10

DT = data.table(x = sample(N), y = sample(1e5,N,TRUE))
bench(DT)
# Unit: seconds
#                                      expr     min      lq    mean  median      uq    max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.02344 0.02761 0.03975 0.03406 0.04527 0.0726    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 1.99000 2.13600 2.17000 2.17700 2.18500 2.3810    10
#               DT[, c(NA, head(x, -1)), y] 0.89570 0.92040 0.93580 0.94220 0.95250 0.9613    10

N = 1e7
DT = data.table(x = sample(N), y = sample(1e2,N,TRUE))
bench(DT)
# Unit: seconds
#                                      expr    min    lq   mean median     uq     max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.2044 0.244 0.2705 0.2527 0.2710  0.4261    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 0.1231 0.140 0.1793 0.1447 0.2030  0.3918    10
#               DT[, c(NA, head(x, -1)), y] 0.1618 0.172 0.2093 0.1931 0.2416  0.2932    10

DT = data.table(x = sample(N), y = sample(1e3,N,TRUE))
# Unit: seconds
#                                      expr    min     lq   mean median     uq    max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.1984 0.2206 0.2307 0.2243 0.2464 0.2722    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 0.1278 0.1481 0.1746 0.1684 0.1943 0.2404    10
#               DT[, c(NA, head(x, -1)), y] 0.1460 0.1605 0.1850 0.1727 0.2154 0.2414    10

DT = data.table(x = sample(N), y = sample(1e4,N,TRUE))
bench(DT)
# Unit: seconds
#                                      expr    min     lq   mean median     uq     max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.2686 0.2795 0.2972 0.2984 0.3148  0.3322    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 0.4850 0.4913 0.5132 0.5072 0.5424  0.5528    10
#               DT[, c(NA, head(x, -1)), y] 0.3696 0.3830 0.4146 0.4233 0.4440  0.4521    10

DT = data.table(x = sample(N), y = sample(1e5,N,TRUE))
bench(DT)
# Unit: seconds
#                                      expr    min     lq   mean median     uq     max neval
#        DT[, shift(x, 1, type = "lag"), y] 0.3696 0.3902 0.4027 0.3939 0.4288  0.4543    10
#  DT[, basic_shift(x, 1, type = "lag"), y] 2.5130 2.5400 2.5810 2.5630 2.6230  2.7160    10
#               DT[, c(NA, head(x, -1)), y] 1.2890 1.3280 1.3650 1.3700 1.4100  1.4230    10

… call (unusual) is distracting; even fill (a call or not) wasn't mentioned in the issue. The main thing is that shift is optimized which is awesome

mattdowle · 2021-10-15T02:37:33Z

src/gsumm.c

+    switch(TYPEOF(x)) {
+      case LGLSXP:  { int *ansd=LOGICAL(tmp);             SHIFT(int,     LOGICAL,   ansd[ansi++]=val); } break;
+      case INTSXP:  { int *ansd=INTEGER(tmp);             SHIFT(int,     INTEGER,   ansd[ansi++]=val); } break;
+      case REALSXP: { double *ansd=REAL(tmp);             SHIFT(double,  REAL,      ansd[ansi++]=val); } break;


Was just about to merge (looks awesome!) and then just saw integer64 missed here (and test_bit64 added to tests). Bug fix 45 starts shift(xInt64, fill=0) so now that shift is optimized I wonder if the shift problems of int64 would return. Maybe existing tests of shift int64 don't test it by group then.
Can't think of anything else.

Added integer64 support. Interestingly we don't do anything special about integer64 anymore in shift.c or gshift except using coerceAs for coercing and copyMostAttrib for copying class integer64.

Interesting. The coerceAs is for fill but fill isn't currently tested by the new gforce test 2224. A loop of differently typed fill needs adding to the test then, and I suspect a specific case here for integer64 will then be needed to pass that? Can I leave that to you, and the C99 warning, hence WIP again.

ben-schwen · 2021-10-15T17:21:11Z

A sample(1e6, N, TRUE) (where N is 1e7 as you have) would be good for the news item; i.e. 1 million groups of 10 rows. You only went to group size 100 so far. The example on S.O. has just 5 rows in each group after all. You can just cut to the chase and test group size 5. Or in the news item just use the test verbatim as reported on S.O. and link to it. That way nobody can say we made up an unrealistic test if we used the test reported from the wild.

Added now both, but either one of them should do.

inst/tests/tests.Rraw

mattdowle · 2021-10-15T18:50:10Z

src/gsumm.c

+    SET_VECTOR_ELT(ans, g, tmp=allocVector(TYPEOF(x), nx));
+    #define SHIFT(CTYPE, RTYPE, ASSIGN) {                                                                         \
+      const CTYPE *xd = (const CTYPE *)RTYPE(x);                                                                  \
+      const CTYPE fill = (const CTYPE)RTYPE(thisfill)[0];                                                         \


Locally I'm seeing this warning (from gcc with -std=c99 I think but it might be due to gcc 10.3.0) :

gsumm.c: In function ‘gshift’: gsumm.c:1207:26: warning: ISO C forbids casting nonscalar to the same type [-Wpedantic] 1207 | const CTYPE fill = (const CTYPE)RTYPE(thisfill)[0]; \ | ^ gsumm.c:1241:59: note: in expansion of macro ‘SHIFT’ 1241 | case CPLXSXP: { Rcomplex *ansd=COMPLEX(tmp); SHIFT(Rcomplex, COMPLEX, ansd[ansi++]=val); } break; | ^~~~~

AFAIC it should also work without the cast.

the casting of fill between NA_REAL and NA_INTEGER64 though?

ben-schwen · 2021-10-16T19:01:34Z

I extended the tests that we introduced in #5189 to also check the coercion between types for gshift. There we also test the casting of fill between NA_REAL and NA_INTEGER64 and it works.

In my current understanding it works since under the hood integer64 is just a "shifted" REALSXP with a class attribute. If we directly write these values into the answer then no problem should occur and the output of numeric and integer64 is depending whether the answer has the class label of integer64.

library(bit64)
.Internal(inspect(lim.integer64()))
#> @55a458cea988 14 REALSXP g0c2 [OBJ,ATT] (len=2, tl=0) -4.94066e-324,nan
#> ATTRIB:
#>   @55a457f3c630 02 LISTSXP g0c0 [REF(1)] 
#>     TAG: @55a455431140 01 SYMSXP g1c0 [MARK,REF(34387),LCK,gp=0x6000] "class" (has value)
#>     @55a457c6f040 16 STRSXP g0c1 [REF(65535)] (len=1, tl=0)
#>       @55a458186ff8 09 CHARSXP g1c2 [MARK,REF(1114),gp=0x61] [ASCII] [cached] "integer64"

mattdowle · 2021-10-20T18:57:15Z

Thanks, I see now. Tweaked the comment there, and all good.

jangorecki · 2022-11-29T21:59:06Z

I suspect this is causing hard to spot issue: #5547

ben-schwen added 7 commits August 31, 2021 18:07

init gshift

5813501

remodel R gshift

13fe55c

fix lead error

55f5775

add tests

6364212

working version

ad63180

Merge remote-tracking branch 'origin/master' into gforce_shift

4eada97

lag/lead working

1b08e59

MichaelChirico reviewed Oct 9, 2021

View reviewed changes

ben-schwen added 5 commits October 10, 2021 00:20

update dispatch & tests

3437ea0

undo newlines

91696e5

reorder

9c88df1

fixed R dispatch

5073709

use match.call to deactivate gforce for calls where fill expr

fix gshift

db16894

ben-schwen added 11 commits October 11, 2021 11:37

coverage

60a7509

long index vector support

0280a45

update testnumber

381eaea

merge master

3187a8c

forgot two testnums

c11d9ba

works for n <= grpsize

98848ce

add cyclic

e0dd981

rm unnecessary code

b591b74

add cov

d03d30e

more coverage

373217f

coverage escaped gshift

bdc7fa0

add NEWS

7d63674

ben-schwen marked this pull request as ready for review October 14, 2021 17:53

removed 'when fill is not a call' from news item because fill being a…

9e339b6

… call (unusual) is distracting; even fill (a call or not) wasn't mentioned in the issue. The main thing is that shift is optimized which is awesome

mattdowle requested changes Oct 15, 2021

View reviewed changes

mattdowle reviewed Oct 15, 2021

View reviewed changes

inst/tests/tests.Rraw Outdated Show resolved Hide resolved

include column 'r' by avoiding repetition

9b9f0bb

mattdowle reviewed Oct 15, 2021

View reviewed changes

inst/tests/tests.Rraw Show resolved Hide resolved

mattdowle reviewed Oct 15, 2021

View reviewed changes

inst/tests/tests.Rraw Show resolved Hide resolved

mattdowle reviewed Oct 15, 2021

View reviewed changes

mattdowle added 2 commits October 15, 2021 13:31

used an .Rdata to test against a fixed y, added RAW support

973f749

added comment about integer64 case

f2e3361

mattdowle added the WIP label Oct 15, 2021

mattdowle and others added 3 commits October 15, 2021 14:13

news item tweak

f5aad19

added coerce test for shift by group

1e5c2b3

add unsupported type test

0f766c0

ben-schwen removed the WIP label Oct 16, 2021

ben-schwen requested a review from mattdowle October 16, 2021 19:02

mattdowle added 2 commits October 20, 2021 12:42

merge master

a6abac3

added/tweaked comments

5eec8f2

mattdowle approved these changes Oct 20, 2021

View reviewed changes

mattdowle merged commit e88826e into master Oct 20, 2021

mattdowle deleted the gforce_shift branch October 20, 2021 19:03

ben-schwen mentioned this pull request Nov 29, 2022

Using variables with names "i" and "j" with shift fails in v1.14.7 #5547

Closed

tdhock restored the gforce_shift branch October 10, 2023 17:24

jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023

DorisAmoakohene mentioned this pull request Jan 29, 2024

data.table gforce benchmarks tdhock/atime#24

Closed

MichaelChirico mentioned this pull request Feb 20, 2024

x[, .(shift(b)), keyby = a] returns list type (should be int) #5939

Closed

DorisAmoakohene added a commit that referenced this pull request Jul 17, 2024

adding an atime test case for gshift as gforce optimized shift #5205

0013772

DorisAmoakohene mentioned this pull request Jul 17, 2024

adding an atime test case for gshift as gforce optimized shift #5205 #6295

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gshift as gforce optimized shift #5205

gshift as gforce optimized shift #5205

ben-schwen commented Oct 9, 2021 •

edited

Loading

MichaelChirico Oct 9, 2021 •

edited

Loading

ben-schwen Oct 9, 2021 •

edited

Loading

mattdowle Oct 15, 2021

codecov bot commented Oct 10, 2021 •

edited

Loading

ben-schwen commented Oct 14, 2021 •

edited

Loading

mattdowle Oct 15, 2021

ben-schwen Oct 15, 2021 •

edited

Loading

mattdowle Oct 15, 2021

ben-schwen commented Oct 15, 2021

mattdowle Oct 15, 2021 •

edited

Loading

ben-schwen Oct 15, 2021

mattdowle Oct 15, 2021

ben-schwen commented Oct 16, 2021

mattdowle commented Oct 20, 2021

jangorecki commented Nov 29, 2022 •

edited

Loading

Basic shift	gshift
works on data.tables/vectors/lists	operates on vectors
main workhouse is `memmove`	used memory may not be blocked
no subset on i	concurrent subsetting on irows

gshift as gforce optimized shift #5205

gshift as gforce optimized shift #5205

Conversation

ben-schwen commented Oct 9, 2021 • edited Loading

Internal optimization

General Remark

MichaelChirico Oct 9, 2021 • edited Loading

Choose a reason for hiding this comment

ben-schwen Oct 9, 2021 • edited Loading

Choose a reason for hiding this comment

mattdowle Oct 15, 2021

Choose a reason for hiding this comment

codecov bot commented Oct 10, 2021 • edited Loading

Codecov Report

ben-schwen commented Oct 14, 2021 • edited Loading

Benchmarks

mattdowle Oct 15, 2021

Choose a reason for hiding this comment

ben-schwen Oct 15, 2021 • edited Loading

Choose a reason for hiding this comment

mattdowle Oct 15, 2021

Choose a reason for hiding this comment

ben-schwen commented Oct 15, 2021

mattdowle Oct 15, 2021 • edited Loading

Choose a reason for hiding this comment

ben-schwen Oct 15, 2021

Choose a reason for hiding this comment

mattdowle Oct 15, 2021

Choose a reason for hiding this comment

ben-schwen commented Oct 16, 2021

mattdowle commented Oct 20, 2021

jangorecki commented Nov 29, 2022 • edited Loading

ben-schwen commented Oct 9, 2021 •

edited

Loading

MichaelChirico Oct 9, 2021 •

edited

Loading

ben-schwen Oct 9, 2021 •

edited

Loading

codecov bot commented Oct 10, 2021 •

edited

Loading

ben-schwen commented Oct 14, 2021 •

edited

Loading

ben-schwen Oct 15, 2021 •

edited

Loading

mattdowle Oct 15, 2021 •

edited

Loading

jangorecki commented Nov 29, 2022 •

edited

Loading