When using user_na = TRUE in read_sav, the function preserves user-defined missing values as expected. However, combining the resulting data frames in R fails with Error in if (!any(lossy)) { : missing value where TRUE/FALSE needed. I do not know the cause, but it happens with the public use files of the OECD's PIAAC datasets. My guess is that the labels are too long. I see a similar issue in #427, but no resolution there. See the code below.
library(haven)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
# Define file paths for the downloaded SPSS files
file1 <- "prgautp1.sav" # Replace with the path if you downloaded locally
file2 <- "prgbelp1.sav" # Replace with the path if you downloaded locally
# Download the SPSS files (if not already done)
download.file("https://webfs.oecd.org/piaac/puf-data/SPSS/prgautp1.sav", file1, mode="wb")
download.file("https://webfs.oecd.org/piaac/puf-data/SPSS/prgbelp1.sav", file2, mode="wb")
# Read the SPSS files with user_na = TRUE
df1 <- read_sav(file1, user_na = TRUE)
df2 <- read_sav(file2, user_na = TRUE)
# Check the structure of the data frames to understand the data types and NAs
# str(df1)
# str(df2)
# Attempt to combine using dplyr::bind_rows()
bind_rows(df1, df2)
#> Warning: `..1$D_Q18a_T` and `..2$D_Q18a_T` have conflicting value labels.
#> ℹ Labels for these values will be taken from `..1$D_Q18a_T`.
#> ✖ Values: 6
#> Error in if (!any(lossy)) {: missing value where TRUE/FALSE needed
# Attempt to combine using base rbind()
rbind(df1, df2)
#> Error in if (!any(lossy)) {: missing value where TRUE/FALSE needed
Created on 2024-09-19 with reprex v2.1.1
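In case it helps narrow this down, here is a minimal sketch of a check that lists the shared columns whose SPSS metadata (value labels, na_values, na_range) differs between two data frames. The cause of the error is not confirmed, so this only points at candidate columns such as D_Q18a_T from the warning above. compare_spss_meta() is a hypothetical helper, not part of haven, and the two small tibbles are made-up stand-ins for df1 and df2, not PIAAC data.

```r
library(haven)

# Toy stand-ins for df1 and df2 (made-up values, not PIAAC data)
a <- tibble::tibble(
  id = 1:3,
  q = labelled_spss(c(1, 2, 9),
                    labels = c(Yes = 1, No = 2, Refused = 9),
                    na_values = 9)
)
b <- tibble::tibble(
  id = 4:6,
  q = labelled_spss(c(1, 6, 9),
                    labels = c(Yes = 1, Other = 6, Refused = 9),
                    na_values = c(6, 9))
)

# Hypothetical helper: names of shared columns whose value labels or
# user-defined missing-value definitions are not identical in x and y
compare_spss_meta <- function(x, y) {
  shared <- intersect(names(x), names(y))
  differs <- vapply(shared, function(nm) {
    !identical(attr(x[[nm]], "labels"), attr(y[[nm]], "labels")) ||
      !identical(attr(x[[nm]], "na_values"), attr(y[[nm]], "na_values")) ||
      !identical(attr(x[[nm]], "na_range"), attr(y[[nm]], "na_range"))
  }, logical(1))
  shared[differs]
}

compare_spss_meta(a, b)
```

With the real files, compare_spss_meta(df1, df2) would give the columns to inspect first; it does not by itself show that mismatched metadata is what triggers the error.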