tabyl not counting NAs when variable is a factor #111
Comments
Hi and thanks for filing the bug report with an example! I cannot reproduce this behavior, either with janitor 0.2.1 on CRAN or janitor 0.2.1.9000 from GitHub (the current development version). I get:
Can you run |
It looks like I am running 0.2.1.
Here are all the results from sessionInfo(); I'm not sure if something else may be causing a conflict:
sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] bindrcpp_0.1 wesanderson_0.3.2 Hmisc_4.0-2 Formula_1.2-1 survival_2.41-3 lattice_0.20-35 mice_2.30 VIM_4.7.0 data.table_1.10.4
[10] colorspace_1.3-2 DT_0.2 scales_0.4.1.9000 ggthemes_3.4.0 knitr_1.15.1 tableone_0.7.3 stringr_1.2.0 janitor_0.2.1 readxl_0.1.1.9000
[19] dplyr_0.5.0.9002 purrr_0.2.2 readr_1.1.0 tidyr_0.6.1.9000 tibble_1.3.0 ggplot2_2.2.1 tidyverse_1.1.1 pacman_0.4.1
loaded via a namespace (and not attached):
[1] nlme_3.1-131 pbkrtest_0.4-7 lubridate_1.6.0 RColorBrewer_1.1-2 httr_1.2.1 rprojroot_1.2 tools_3.3.3 backports_1.0.5
[9] R6_2.2.0 rpart_4.1-10 lazyeval_0.2.0 mgcv_1.8-17 nnet_7.3-12 sp_1.2-4 gridExtra_2.2.1 mnormt_1.5-5
[17] rvest_0.3.2 quantreg_5.29 htmlTable_1.9 SparseM_1.76 xml2_1.1.1 checkmate_1.8.2 lmtest_0.9-35 DEoptimR_1.0-8
[25] psych_1.7.3.21 robustbase_0.92-7 digest_0.6.12 foreign_0.8-67 minqa_1.2.4 rmarkdown_1.4 base64enc_0.1-3 pkgconfig_2.0.1
[33] htmltools_0.3.5 lme4_1.1-12 highr_0.6 htmlwidgets_0.8 rlang_0.0.0.9016 bindr_0.1 zoo_1.8-0 jsonlite_1.4
[41] acepack_1.4.1 car_2.1-4 magrittr_1.5 Matrix_1.2-8 Rcpp_0.12.10.1 munsell_0.4.3 yaml_2.1.14 stringi_1.1.5
[49] MASS_7.3-45 plyr_1.8.4 parallel_3.3.3 forcats_0.2.0 haven_1.0.0 splines_3.3.3 hms_0.3 boot_1.3-18
[57] reshape2_1.4.2 glue_0.0.0.9000 evaluate_0.10 latticeExtra_0.6-28 laeken_0.4.6 modelr_0.1.0 vcd_1.4-3 nloptr_1.0.4
[65] MatrixModels_0.4-1 gtable_0.2.0 assertthat_0.2.0 broom_0.4.2 survey_3.31-5 e1071_1.6-8 class_7.3-14 cluster_2.0.6
|
Hm, I wonder if it's the new dplyr (which will require some updates on my end anyway when it hits CRAN). I will try installing your versions of dplyr + tidyr and see what's going on. |
My coworker tried running the code and got a result similar to yours, different from mine.
I just tried to install “tidyverse” over my existing version and did the same with “dplyr”. No change. I’m attempting to just reinstall R currently and see if that changes anything. All other things seem to be working okay today.
|
Okay. Weird! Now it works correctly.
I reinstalled R version 3.3 over my old version. I then tried library(janitor) and got an error message about there being no package named “dplyr”. So I reinstalled “tidyverse” and now everything is working as it should.
|
Odd. What's your sessionInfo() now that it's working? |
sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] devtools_1.12.0 Hmisc_4.0-2 Formula_1.2-1 lattice_0.20-35 survminer_0.3.1.999 ggpubr_0.1.2.999 survival_2.41-3 scales_0.4.1.9000
[9] ggthemes_3.4.0 knitr_1.15.1 tableone_0.7.3 stringr_1.2.0 janitor_0.2.1 readxl_0.1.1.9000 dplyr_0.5.0 purrr_0.2.2
[17] readr_1.1.0 tidyr_0.6.1.9000 tibble_1.3.0 ggplot2_2.2.1 tidyverse_1.1.1 pacman_0.4.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10.1 lubridate_1.6.0 zoo_1.8-0 assertthat_0.2.0 rprojroot_1.2 digest_0.6.12 psych_1.7.3.21 R6_2.2.0
[9] plyr_1.8.4 backports_1.0.5 acepack_1.4.1 survey_3.31-5 evaluate_0.10 httr_1.2.1 curl_2.5 lazyeval_0.2.0
[17] data.table_1.10.4 rpart_4.1-10 Matrix_1.2-8 checkmate_1.8.2 rmarkdown_1.4 splines_3.3.3 foreign_0.8-67 htmlwidgets_0.8
[25] munsell_0.4.3 broom_0.4.2 modelr_0.1.0 base64enc_0.1-3 mnormt_1.5-5 htmltools_0.3.5 nnet_7.3-12 htmlTable_1.9
[33] gridExtra_2.2.1 km.ci_0.5-2 withr_1.0.2 grid_3.3.3 nlme_3.1-131 jsonlite_1.4 xtable_1.8-2 gtable_0.2.0
[41] DBI_0.6-1 git2r_0.18.0 magrittr_1.5 KMsurv_0.1-5 stringi_1.1.5 reshape2_1.4.2 latticeExtra_0.6-28 xml2_1.1.1
[49] survMisc_0.5.4 RColorBrewer_1.1-2 tools_3.3.3 cmprsk_2.2-7 forcats_0.2.0 hms_0.3 parallel_3.3.3 yaml_2.1.14
[57] colorspace_1.3-2 cluster_2.0.5 rvest_0.3.2 memoise_1.0.0 haven_1.0.0
|
I upgraded to the soon-to-be-released dplyr 0.5.0.9002 and now get the error you reported above. Will look into it. |
The error is that dplyr joins now don't match on NA. Note to self: the following works under dplyr >0.5.0.9002, but requires
|
NA no longer joined to NA. Closes #111
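For readers who want to see the join behavior described above, here is a minimal sketch. It assumes a released dplyr version in which the join functions accept an `na_matches` argument; whether the truncated note above referred to that argument cannot be recovered from this thread. The data frames `levels_df` and `counts_df` are hypothetical and are not taken from janitor's source.

```r
library(dplyr)

# Hypothetical lookup of factor levels, including a missing value.
levels_df <- data.frame(cyl = c("4", "6", "8", NA), stringsAsFactors = FALSE)

# Hypothetical counts keyed by the same column; the NA group has 3 rows.
counts_df <- data.frame(cyl = c("4", "6", NA), n = c(11L, 7L, 3L),
                        stringsAsFactors = FALSE)

# NA keys are treated as equal, so the NA row picks up its count.
matched <- left_join(levels_df, counts_df, by = "cyl", na_matches = "na")

# NA keys never match each other, so the NA row comes back with n = NA.
unmatched <- left_join(levels_df, counts_df, by = "cyl", na_matches = "never")

print(matched)
print(unmatched)
```

If a count table is built by joining counts onto a list of levels, the second form is the one that would silently lose the NA count, which is consistent with the symptom reported in this issue.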
Thanks for reporting this! Now I feel good about the integrity of janitor under the impending launch of dplyr 0.6.0. I had test coverage for this, so would have seen the failing tests, but only after dplyr 0.6.0 launched - it feels good to be proactive. |
Thanks for creating and maintaining such a useful package!! I use it all the time and am glad to know you’re on top of it. Thanks for taking care of this issue!!
Emile
|
I thought that tabyl used to count the number of NAs when the variable was a factor, but for some reason it doesn't seem to be doing it this morning. Here is an example to illustrate the issue:
Create a data set as an example
Here my_cars$cyl is of class numeric; it could be a character and it would still work correctly.
Now if I change my_cars$cyl to a factor, tabyl does not count NAs.
This seems to be a new phenomenon, as I've been using the janitor package with this data set for a few months now and never had this come up.
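The reporter's original code did not survive in this copy of the thread. As a rough sketch of the kind of check being described, assuming `my_cars` is a copy of the built-in `mtcars` data with a few `cyl` values set to NA (both the source data set and the row indices are assumptions), one could compare the numeric and factor cases like this:

```r
library(janitor)

# Assumption: my_cars is a copy of mtcars with a few cyl values set to NA.
my_cars <- mtcars
my_cars$cyl[c(1, 5, 9)] <- NA

# Numeric column: the NA rows are expected to appear in the table.
tab_numeric <- tabyl(my_cars$cyl)

# Same column as a factor: the case the report says dropped the NAs.
my_cars$cyl <- factor(my_cars$cyl)
tab_factor <- tabyl(my_cars$cyl)

# If NAs are counted, the n column sums to the number of rows in the data.
print(sum(tab_numeric$n) == nrow(my_cars))
print(sum(tab_factor$n) == nrow(my_cars))
```

Comparing the sum of `n` against `nrow()` is a quick way to tell whether any rows, such as the NA ones, were left out of the table.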