join in [.data.table could be consistent to SQL #1615
Comments
So you want to keep both of the (I'm not so SQL-literate as to immediately know where the difference between data.table joins and SQL joins is. Maybe the title could be made more specific?) What would you want to see for rolling joins? Maybe...
If #1494 is done, we'll also be able to show which rows of I guess the behavior also makes sense for overlap joins...? (I'm not very familiar with them.)
I don't much bother about which columns are being returned by default, and in which order. This can be kept as is, and be easily controlled in
* master:
  Allow x's cols to be referred to using 'x.' prefix, addresses #1615
  adding some helpful info on dev branch checks
  fixes invalid vignette urls in manual
  Remove old vignette.
  More clarifications to secondary indices.
  Clarify that 'on' doesn't *create* secondary indices.
  Remove unwanted text in secondary indices vignette.
  Rename vignette.
  Fix size of header for FAQ vignette.
  New vignette on secondary indices and auto indexing based subsets.
  More minor fix to vignette.
  Minor formatting fixes to vignettes.
  Fix Typos & rm. trailing whitespace in vignettes
@sz-cgt no they are two different issues.
@jangorecki as another struggling user coming from SQL and STATA, SAS, I can attest that data.table's join behavior seems strange at first, but then you need to remember that in R Regardless of the class of With that in mind if you want to preserve information contained in
@mbacou thanks for input.
In practice it does a right outer join, so the confusion may come from the fact that people usually expect a left outer join. The only difference from SQL is matching on NA vs NULL, which is consistent with base R merge.
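To make the point above concrete, here is a minimal sketch, assuming two hypothetical toy tables `X` and `Y` (the table and column names are illustrative and do not come from this thread). It shows that `X[Y, on = ...]` keeps every row of `Y`, and that a key of `NA` in `Y` is matched to a key of `NA` in `X`, whereas in SQL `NULL = NULL` does not evaluate to true, so such rows would not be paired.

```r
library(data.table)

# Hypothetical toy tables; names and values are illustrative only.
X <- data.table(id = c(1L, 2L, NA), x = c("a", "b", "c"))
Y <- data.table(id = c(2L, 3L, NA), y = c("p", "q", "r"))

# X[Y, on = "id"] looks up each row of Y in X, so all rows of Y are kept:
# a right outer join with respect to X.
res <- X[Y, on = "id"]

# id = 2  is present in both tables, so x is filled from X.
# id = 3  has no match in X, so x is NA for that row.
# id = NA in Y is matched to id = NA in X (NA is treated as a joinable value),
#         which is where the behavior differs from SQL NULL semantics.
print(res)
```

The row of `X` with `id = 1` does not appear in the result, which is what surprises users who expect a left outer join on `X`.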
It seems that
Non-equi joins usually return a lot more rows than equi joins. And more At this time there are no plans to overload DT merge syntax or implement
Arun
On 3 July 2016 at 08:20:34, Wenhao Yang (notifications@github.com) wrote:
Currently data.table joins are consistent with base R.
This is somewhat awkward for some queries.
Join consistency with base R could be kept in the merge.data.table method for the base R merge generic, while the joins within [.data.table could be consistent with SQL, which does not impose the limitations that base R does. [.data.frame does not allow joins, so it wouldn't break consistency here.
The change would generally break code that relies on invalid base R join behavior.
For reference, SQL output from Postgres:
Just to link related issues: #1700, #1761, #1469