Naming conflict and unexpected behavior with the functional form of data.table DT()
#5129
Comments
Great find. I can reproduce the first error :
Seems like that line is finding
I'm glad you brought this up. I was thinking about that too. I'm not sure about expectations, but I fear the fear. Do we feel the fear and do it anyway, or do we be more cautious and fit into dplyr and R-style chains which copy-on-write? Consider the following. It is currently possible in data.table-dev to modify
# fresh R session
> head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
> find("mtcars")
[1] "package:datasets"
> require(data.table)
Loading required package: data.table
data.table 1.14.1 IN DEVELOPMENT built 2021-09-02 03:56:34 UTC; mdowle using 6 threads (see ?getDTthreads). Latest news: r-datatable.com
> mtcars |> DT(2,cyl:=NA)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 NA 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
[ snip data.frame print method output ]
> head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 NA 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
> find("mtcars")
[1] "package:datasets"
> head(datasets::mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 NA 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
So it did actually change
My 2 cents:
Agree. I suppose the question is whether to disable My strong preference would be for the latter, possibly with a one-time warning.
Agree with Grant here -- backing up to do what was asked, but safely, seems like the user-friendliest choice. I would hope this is most likely to happen in use cases where a copy is not too expensive.
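A minimal sketch of the copy-first approach suggested above, written against released data.table functions rather than the development-only DT(). The helper name update_on_copy is hypothetical and is not part of data.table; it only illustrates that taking a deep copy before := leaves the caller's data.frame untouched.

```r
library(data.table)

# Hypothetical helper (not part of data.table): apply a by-reference update
# to a deep copy so the caller's data.frame is never modified.
update_on_copy <- function(df) {
  dt <- setDT(copy(df))      # copy() deep-copies; setDT() converts the copy in place
  dt[2L, cyl := NA_real_]    # := now modifies only the copy
  dt[]                       # return the updated data.table
}

res <- update_on_copy(mtcars)
is.na(res$cyl[2L])               # checks that the returned table carries the update
is.na(datasets::mtcars$cyl[2L])  # checks whether the packaged dataset was touched
```

The cost of this design is one full copy per call, which is the trade-off raised in the comment above.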
1- The issue does not happen only with a data.frame but also with a data.table.
2- I think that the issue is caused by the line
Side effect of removing
3-
Thanks a lot for this investigation. Saves me a lot of time investigating that naming conflict. I'll take a look. Just to reply on
@Kamgang-B PR #5176 now merged should solve issue 2 here. Thanks again for your investigation and tests. Still working on
Issue 1: When working with a data.frame, assignment to a new variable does not affect the original dataset, while assignment to an existing variable (modification/update) does.
I think that the expectation when calling a data.table query on a data.frame is that it should behave in a similar way in both cases; that is, assignments and modifications should both affect the original data.frame. This is particularly useful because it avoids having to reassign the data back every time we use DT on a data.frame. It would also be consistent with what happens when using a data.table rather than a data.frame.
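For comparison, the data.table behavior referred to above can be sketched with released data.table functions only (no DT()). The object names dt and alias are made up for illustration. Because `<-` does not copy a data.table, both an update to an existing column and the addition of a new column through := should be visible from the original name.

```r
library(data.table)

dt <- data.table(x = 1:3, y = c(10, 20, 30))
alias <- dt                  # plain assignment: both names refer to the same table

alias[2L, y := NA_real_]     # update an existing column by reference
alias[, z := x * 2L]         # add a new column by reference

names(dt)                    # inspect the original: does it list the new column?
dt$y                         # inspect the original: does it show the update?
```

Using copy(dt) instead of plain assignment would keep dt unchanged in both cases.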
Issue 2: Naming a data.frame or data.table D leads to errors when used with the DT function:
Session info