Warn user about executing rolling join on differing column types #1913

Closed
ben519 opened this issue Nov 15, 2016 · 3 comments · Fixed by #5004
Labels: bug, non-equi joins (rolling, overlapping, non-equi joins)

Comments

@ben519

ben519 commented Nov 15, 2016

Got bit by this guy recently

dt <- data.table(ID=1:5, A=c(1.3, 1.7, 2.4, 0.9, 0.6))
buckets <- data.table(BucketID=1:4, BinA=1:4)

dt[, A.copy := A]

# Rolling join, int col with double col
buckets[dt, on=c("BinA"="A"), roll=-Inf]
   BucketID BinA ID A.copy
1:        1    1  1    1.3
2:        1    1  2    1.7
3:        2    2  3    2.4
4:        1    0  4    0.9
5:        1    0  5    0.6

# Rolling join, double col with double col (notice result is different than above)
buckets[, BinA := as.numeric(BinA)]
buckets[dt, on=c("BinA"="A"), roll=-Inf]
   BucketID BinA ID A.copy
1:        2  1.3  1    1.3
2:        2  1.7  2    1.7
3:        3  2.4  3    2.4
4:        1  0.9  4    0.9
5:        1  0.6  5    0.6
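
On versions where the join does not handle the mismatch itself, one way to guard against it is to compare the storage types of the two join columns before rolling. The following is only a sketch: `check_join_types()` is a hypothetical helper, not part of data.table, and it compares `typeof()` only, so it will not distinguish classes that share a storage type.

```r
library(data.table)

dt <- data.table(ID = 1:5, A = c(1.3, 1.7, 2.4, 0.9, 0.6))
buckets <- data.table(BucketID = 1:4, BinA = 1:4)

# Hypothetical helper (not a data.table function): stop if the two join
# columns have different storage types.
check_join_types <- function(x, i, x_col, i_col) {
  tx <- typeof(x[[x_col]])
  ti <- typeof(i[[i_col]])
  if (tx != ti) {
    stop(sprintf("join column types differ: x$%s is %s, i$%s is %s",
                 x_col, tx, i_col, ti))
  }
  invisible(TRUE)
}

# integer vs double: the check fails, and we capture the message
mismatch <- tryCatch(check_join_types(buckets, dt, "BinA", "A"),
                     error = function(e) conditionMessage(e))

# Align the types explicitly, then run the rolling join
buckets[, BinA := as.numeric(BinA)]
check_join_types(buckets, dt, "BinA", "A")
res <- buckets[dt, on = c("BinA" = "A"), roll = -Inf, .(ID, BucketID, A = i.A)]
```

Coercing the integer column to double, rather than the other way round, keeps the fractional part of `A`, which is what the bucket lookup depends on.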
@franknarf1
Contributor

I was going to say that this (how BinA is displayed differently) can be handled by referring to cols with the x.* and i.* prefixes, but...

buckets[dt, on=c("BinA"="A"), roll=-Inf, .(x.BinA, i.A)]

   x.BinA i.A
1:      2 1.3
2:      2 1.7
3:      3 2.4
4:      1 0.9
5:      1 0.6

So when I ask for BinA from x, it just gives me the first col of x instead...? Seems like a bug.

@ben519
Author

ben519 commented Jan 2, 2017

Any chance this could get a [bug] label and added to the queue to be fixed? This issue scares me since it bites silently and subtly (and because I use a lot of rolling joins). Much thanks.

jangorecki added the non-equi joins label Apr 5, 2020
jangorecki self-assigned this May 16, 2021
jangorecki added the bug label May 16, 2021
@jangorecki
Member

jangorecki commented May 16, 2021

@franknarf1 it doesn't give you the first column; it gives you BinA after rolling. It only looks like BucketID because both columns have the same content.

buckets[dt, on=c("BinA"="A"), roll=-Inf, .(BinA, x.BinA)]
#    BinA x.BinA
#1:   1.3      2
#2:   1.7      2
#3:   2.4      3
#4:   0.9      1
#5:   0.6      1

@ben-schwen it must have been fixed: both queries return the same results. Ultimately, in non-equi joins the x. and i. prefixes should be preferred until #1615 is resolved.
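
As a sketch of that recommendation, the query below names every output column explicitly with an `x.` or `i.` prefix, so it is clear which table each value comes from. The output names `value` and `bin` are illustrative choices, and `BinA` is created as double here on the assumption that both join columns should share a type.

```r
library(data.table)

dt <- data.table(ID = 1:5, A = c(1.3, 1.7, 2.4, 0.9, 0.6))
# BinA is created as double so both join columns have the same type
buckets <- data.table(BucketID = 1:4, BinA = as.numeric(1:4))

# i.* columns come from dt (the table passed as i),
# x.* columns come from buckets (the table being joined to)
res <- buckets[dt, on = c("BinA" = "A"), roll = -Inf,
               .(ID = i.ID, value = i.A, bin = x.BinA, BucketID = x.BucketID)]
```

Here `value` holds the original `A` from `dt`, while `bin` holds the `BinA` row of `buckets` that each value rolled to.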
