allow.cartesian's precision should be based on the 'all' argument #5606

Closed
DVDVTAL opened this issue Feb 28, 2023 · 2 comments

DVDVTAL commented Feb 28, 2023

allow.cartesian is a fantastic tool for catching cases where a many-to-many join is performed. Currently it is only required when nrow(output) > nrow(x) + nrow(i). When all = TRUE, this is an appropriate condition. When all.x = TRUE or all = FALSE, the condition becomes less reasonable.

  • When all.x = TRUE, the condition should be nrow(output) > nrow(x) (or whichever table is the first argument)
  • When all = FALSE, the condition should be nrow(output) > min(nrow(x), nrow(i))
library(data.table)
a <- data.table(A = 1:3)
b <- data.table(B = c(1, 1:3))
merge(a, b, by.x = "A", by.y = "B")

The above code should fail due to the duplication created by value '1'.
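
To make the proposal concrete, here is a minimal sketch comparing the current row limit with the proposed one for the example above. The helper `proposed_limit` is hypothetical and not part of data.table; its `all.y` branch is an assumption, since the request only spells out the `all`, `all.x`, and `all = FALSE` cases.

```r
library(data.table)

# Hypothetical helper (NOT a data.table function): the row-count limit
# suggested in this issue, given the row counts of x and i.
proposed_limit <- function(nx, ni, all = FALSE, all.x = all, all.y = all) {
  if (all.x && all.y) return(nx + ni)  # full outer join: keep the current rule
  if (all.x) return(nx)                # left join: bounded by x
  if (all.y) return(ni)                # assumption: right join bounded by i
  min(nx, ni)                          # inner join: bounded by the smaller table
}

a <- data.table(A = 1:3)
b <- data.table(B = c(1, 1:3))
out <- merge(a, b, by.x = "A", by.y = "B")  # inner join, value 1 matches twice

current_limit <- nrow(a) + nrow(b)          # 3 + 4 = 7

# Current rule: 4 rows is not more than 7, so the merge runs silently.
exceeds_current <- nrow(out) > current_limit

# Proposed rule: 4 rows is more than min(3, 4) = 3, so it would be flagged.
exceeds_proposed <- nrow(out) > proposed_limit(nrow(a), nrow(b))
```

Under these assumptions `exceeds_current` is FALSE and `exceeds_proposed` is TRUE, which is the gap the request describes.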

@jangorecki
Member

This sounds related to #4383.
There is already an open PR to resolve it.

@jangorecki
Member

@DVDVTAL please let me know if the linked issue does not supersede your request here, ideally with example code that is not already handled by the solution to the other issue. If so, we will reopen this one.
