-
Notifications
You must be signed in to change notification settings - Fork 986
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix type coercion in bmerge #6603
base: master
Are you sure you want to change the base?
Conversation
Generated via commit 1d7a8c2 Download link for the artifact containing the test results: ↓ atime-results.zip
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6603 +/- ##
=======================================
Coverage 98.60% 98.60%
=======================================
Files 79 79
Lines 14516 14520 +4
=======================================
+ Hits 14314 14318 +4
Misses 202 202 ☔ View full report in Codecov by Sentry. |
@MichaelChirico is it worth to make the (length(unique(icols))!=length(icols) || length(unique(xcols))!=length(xcols)) ensuring that either |
@@ -71,7 +71,13 @@ bmerge = function(i, x, icols, xcols, roll, rollends, nomatch, mult, ops, verbos | |||
stopf("Incompatible join types: %s (%s) and %s (%s). Factor columns must join to factor or character columns.", xname, xclass, iname, iclass) | |||
} | |||
if (xclass == iclass) { | |||
if (verbose) catf("%s has same type (%s) as %s. No coercion needed.\n", iname, xclass, xname) | |||
if (length(icols)>1 && is(x[[xc]], "Date") && is(i[[ic]], "Date")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To modify only when strictly needed, could be added the next condition?
&& !identical(storage.mode(x[[xc]]), storage.mode(i[[ic]]))
(following https://bugs.r-project.org/show_bug.cgi?id=18782#c9)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use inherits()
, is()
is for S4 classes IMO (even though it happens to work for S3 classes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, IINM POSIXct
can have the same concern -- {base} doesn't care about storing POSIXct
as integer
sometimes if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use
inherits()
,is()
is for S4 classes IMO (even though it happens to work for S3 classes).
Previously I checked if class(x[[xc]])=="Date"
are equal and our linter nudged me towards is
, so we might also want to adjust the linter :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class(x)=="Date"
is bad because class()
can be length>1
, c.f. class(Sys.time())
, hence ==
is rarely intended.
The linter mentions is()
as well as inherits()
to cover all bases since there's no is.Date()
and we didn't hand-code anything to cover base S3 classes:
lintr::lint("class(x) == 'Date'\n", lintr::class_equals_linter())
# <text>:1:1: warning: [class_equals_linter] Instead of comparing class(x) with ==, use inherits(x, 'class-name') or is.<class> or is(x, 'class').
# class(x) == 'Date'
# ^~~~~~~~~~~~~~~~~~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, seems that I overlooked the inherits
part.
Sorry, I just saw that this is a CRAN requirement, checking now. |
@@ -71,7 +71,13 @@ bmerge = function(i, x, icols, xcols, roll, rollends, nomatch, mult, ops, verbos | |||
stopf("Incompatible join types: %s (%s) and %s (%s). Factor columns must join to factor or character columns.", xname, xclass, iname, iclass) | |||
} | |||
if (xclass == iclass) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aside: {x,i}class
seem like poor name choices here, given that they typically come from typeof()
(and class()
is never called, except maybe internally by inherits()
).
{x,i}type
would be more appropriate, but {x,i}_merge_type
might be even better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another aside: It looks like we do x[[xc]]
and i[[ic]]
extraction at least one and up to several times. Should we just store the extracted column locally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aside:
{x,i}class
seem like poor name choices here, given that they typically come fromtypeof()
(andclass()
is never called, except maybe internally byinherits()
).
Yes but didn't want to change more than needed for the quick fix
Definitely thanks for triaging a fix here. I think we both agree it's pretty hack-y & ideally not needed. I'm not sure I fully understand the bug yet, but as stated in OP we're doing IMO |
Unfortunately this "contract" does not hold. I have a Windows dev version installed here and get the following > typeof(.Date(0L))
[1] "integer"
> typeof(as.Date(.Date(0L)))
[1] "integer" |
Thanks, also confirmed that on 4.4.1 |
This is strange as Kurt mentioned there wasn't an intention to do this but I figure it would have been fixed by now if that were the case. |
Closes #6602
NB: I'm not happy that a user can get different messages, depending on the number of join conditions, but coercing both Dates to
double
when multiple column conditions exist, seems like the right decision.Base does not encounter this problem since one join column can not be in multiple join conditions.