Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for melt with na.rm=TRUE and list column value #4737

Merged
merged 5 commits into from
May 9, 2021

Conversation

tdhock
Copy link
Member

@tdhock tdhock commented Oct 3, 2020

hi again, I have been trying to get 100% coverage in fmelt.c and some of the uncovered lines revealed a bug. This PR adds a test for this case, and will eventually add a fix. The problem is shown below:

library(data.table)
melt(data.table(a1=1, b1=1, b2=2), na.rm=TRUE, measure.vars=list(a="a1", b=c("b1","b2")))
#>    variable a b
#> 1:        1 1 1
# only one row as expected (since a2 is missing, b2 is removed as well).
melt(data.table(a1=1, b1=list(1:2), b2=list(c('foo','bar'))), na.rm=TRUE, measure.vars=list(a="a1", b=c("b1","b2")))
#>    variable  a       b
#> 1:        1  1     1,2
#> 2:        2 NA foo,bar
# two rows, second a value missing, incorrect since na.rm=TRUE was specified.

@tdhock
Copy link
Member Author

tdhock commented Oct 3, 2020

There was some code which treated lists as a special case, and always used data->narm=FALSE in the C code, even when the user specified na.rm=TRUE in the R code. Seems to me that this is undesirable, due to the inconsistency with the non-list case. If that behavior is desirable to keep, then I would have at least expected some kind of message telling the user that their request of na.rm=TRUE was ignored (seems like there was a message but only if verbose=TRUE).

@codecov
Copy link

codecov bot commented Oct 3, 2020

Codecov Report

Merging #4737 (cb510cf) into master (d8e0fb3) will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4737      +/-   ##
==========================================
+ Coverage   99.45%   99.46%   +0.01%     
==========================================
  Files          73       73              
  Lines       14612    14609       -3     
==========================================
- Hits        14532    14531       -1     
+ Misses         80       78       -2     
Impacted Files Coverage Δ
src/fmelt.c 100.00% <ø> (+0.35%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8e0fb3...cb510cf. Read the comment docs.

@tdhock tdhock marked this pull request as ready for review October 3, 2020 22:25
@tdhock tdhock changed the title WIP: fix for melt with na.rm=TRUE and list column value fix for melt with na.rm=TRUE and list column value Oct 3, 2020
@tdhock
Copy link
Member Author

tdhock commented Jan 22, 2021

This behavior is mentioned on the ?melt.data.table man page:

# return 'NA' for missing columns, 'na.rm=TRUE' ignored due to list column
melt(DT, id=1:2, measure=patterns("l_", "c_"), na.rm=TRUE)

@mattdowle
Copy link
Member

I guess the news item in #4720 covers this one too.

@mattdowle mattdowle added this to the 1.14.1 milestone May 9, 2021
@mattdowle mattdowle merged commit 0a7761c into master May 9, 2021
@mattdowle mattdowle deleted the fix-melt-narm-list-column branch May 9, 2021 11:34
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants