Possibility to provide group labels in melt variable column when melting a list of columns simultaneously
#2551
I don't understand your comment about "information loss." There's a 1-1 mapping between your desired output and the actual output? I don't see much wrong with:
|
What I mean by "information loss" is that you need to attach the labels afterwards, as you did above and as I did in the last line of the minimal example. I feel that this increases the risk of mixing up the labels. But you are right, it can be done afterwards using the same vector of group names. The proposed feature would therefore be a "nice to have" rather than something that cannot be done right now. |
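For readers following along, the "attach the labels afterwards" workaround discussed above can be sketched roughly as follows. This is only an illustration under assumptions: the toy data and column names are borrowed from the example later in this thread, and the `groups` vector and the `set` column name are hypothetical, not taken from the original minimal example.

```r
library(data.table)

# Toy data; column names borrowed from the example later in this thread.
DT <- data.table(
  id = c("A", "B"),
  val_loss = 1:2, val_acc = 3:4,
  loss = 5:6, acc = 7:8)

# One label per position inside each measure.vars group (hypothetical vector).
groups <- c("validation", "training")

long <- melt(
  DT, id.vars = "id",
  measure.vars = list(c("val_loss", "loss"), c("val_acc", "acc")),
  value.name = c("loss", "acc"))

# `variable` is a factor with levels "1" and "2": the position within each group.
# Map those positions back to names using the same `groups` vector.
long[, set := groups[as.integer(variable)]]
long[, variable := NULL]
print(long)
```

The risk mentioned above is visible here: `groups` has to be kept in the same order as the columns inside each element of `measure.vars`, and nothing enforces that.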
Possibly related to FR: expansion of melt functionality for handling names of output |
@MichaelChirico, glad to hear this is being worked on. I don't know how much of a priority it is for you guys, but it would be great if you also considered how to deal with "unbalanced" data in panel form. Building on #2564, let's imagine we were missing a few columns:
It would still be fantastic to be able to use the
To get the correct output, you would have to do the more traditional
I had previously written
More recently, working on a replacement for
This works, but:
Anyway--as I mentioned, great to hear that this is being worked on. Just thought I would throw this out as a possible test you would want to look into. |
Also, I don't know whether Stata handles unbalanced wide to long transformation, but from what I can tell, pandas handles it with
|
@mrdwab Could you file a separate issue with this? Looks very useful |
Hi, the original issue is solved via:
library(data.table)
DT <- data.table(
  val_loss = 1:2,
  val_acc = 3:4,
  loss = 5:6,
  acc = 7:8,
  id = c("A", "B"))
nc::capture_melt_multiple(
  DT,
  column="val_|", function(x)ifelse(x=="", "training", "validation"),
  variable="loss|acc")
#>    id variable training validation
#> 1:  A      acc        7          3
#> 2:  B      acc        8          4
#> 3:  A     loss        5          1
#> 4:  B     loss        6          2
nc::capture_melt_multiple(
  DT,
  set="val_|", function(x)ifelse(x=="", "training", "validation"),
  column="loss|acc")
#>    id        set acc loss
#> 1:  A   training   7    5
#> 2:  B   training   8    6
#> 3:  A validation   3    1
#> 4:  B validation   4    2 |
The code is not reproducible because of the unknown namespace nc. Anyway, before closing, it would be good to merge a unit test to ensure the requested functionality keeps working in the future. |
install.packages('nc') ;) |
Pure data.table solution using #4731:
remotes::install_github("Rdatatable/data.table@melt-custom-variable")
#> Skipping install of 'data.table' from a github remote, the SHA1 (c02fa9e8) has not changed since last install.
#> Use `force = TRUE` to force installation
library(data.table)
DT <- data.table(
  val_loss = 1:2,
  val_acc = 3:4,
  loss = 5:6,
  acc = 7:8,
  id = c("A", "B"))
melt(DT, measure.vars=measure(
  value.name=function(x)ifelse(x=="", "training", "validation"),
  variable, pattern="(val_|)(loss|acc)"))
#>    id variable validation training
#> 1:  A     loss          1        5
#> 2:  B     loss          2        6
#> 3:  A      acc          3        7
#> 4:  B      acc          4        8
melt(DT, measure.vars=measure(
  set=function(x)ifelse(x=="", "training", "validation"),
  value.name, pattern="(val_|)(loss|acc)"))
#>    id        set loss acc
#> 1:  A validation    1   3
#> 2:  B validation    2   4
#> 3:  A   training    5   7
#> 4:  B   training    6   8 |
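A unit test was requested earlier in the thread. A minimal sketch of such a regression check might look like the following. It assumes the installed data.table exports `measure()`; on versions without it the call fails. The data and the pattern are the ones from the comment above; the choice of assertions is illustrative and is not the test that was actually merged.

```r
library(data.table)

DT <- data.table(
  val_loss = 1:2, val_acc = 3:4,
  loss = 5:6, acc = 7:8,
  id = c("A", "B"))

# Assumes measure() is available in the installed data.table.
long <- melt(DT, measure.vars = measure(
  set = function(x) ifelse(x == "", "training", "validation"),
  value.name,
  pattern = "(val_|)(loss|acc)"))

# Check values by label rather than by row position, so the check
# does not depend on the order in which groups are stacked.
stopifnot(
  setequal(long$set, c("training", "validation")),
  identical(long[set == "training" & id == "A", c(loss, acc)], c(5L, 7L)),
  identical(long[set == "validation" & id == "B", c(loss, acc)], c(2L, 4L)))
```

Looking rows up by the `set` label is the point of the requested feature: the group name travels with the data instead of being re-attached from a separate vector.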
Right now, there is a loss of information when one melts a list of columns simultaneously, since the variable column only gives the number of the group (as a factor by default), not a name.
It would be great if the functionality of melting a list of columns simultaneously would be extended such that it is possible to provide labels for the factor in the variable column. I would suggest an additional argument, e.g. variable.labels, which makes it possible to provide custom labels.
# Minimal reproducible example
# Output of sessionInfo()
I know that the R version is not up to date, but at my company it takes a while until we get the most recent version.
Sorry for not labeling the issue. Somehow I could not figure out how to set the labels.