This issue was moved to a discussion.

Conditional imputation: Two solutions #258

Closed
angelrodriguez2020 opened this issue Aug 21, 2020 · 10 comments

@angelrodriguez2020

angelrodriguez2020 commented Aug 21, 2020

Dear fellows,

I'm using your excellent package MICE.

I have two variables that I need to impute together: caidas is the number of falls (numeric), and cq18b is a variable of the care received by the person who fell; it is a factor with levels

  1. No fall
  2. Medical care received
  3. No medical care needed

When imputing caidas, I want cq18b to be 1 if caidas==0, and to receive an imputed 2 or 3 if caidas>0.

This fragment of code always sets the second imputation to 3, but I'd like it to impute 2 or 3, depending on other predictors.

post["cq18b"] <- "
                  imp[[j]][data$caidas[!r[,j]]==0,i] <- levels(tshauprf2$cq18b)[1]
                  imp[[j]][data$caidas[!r[,j]]>0,i] <- levels(tshauprf2$cq18b)[3]
                        "

Any idea?

Thank you for your time and attention,
Ángel

@stefvanbuuren
Member

stefvanbuuren commented Aug 22, 2020

Thanks.

Conditional imputation is a recurring issue (#43, #125, #152, #153, #193) that has no elegant solution in mice. Part of the difficulty is that there are so many use cases, so we often need some form of tailoring. In the example below I demo conditional imputation with mice using the post-processing option.

library(mice, warn.conflicts = FALSE)

data <- data.frame(
  age = c(60, 54, 82, NA, 34, 58, 60, 58, 76, 40),
  fall = c(0, NA, 1, 0, 1, NA, 1, 1, 0, 1),
  aid = factor(c("-", NA, "Yes", "-", "No", NA, "Yes", NA, NA, "No"))
)

pred <- make.predictorMatrix(data)
pred["fall", "aid"] <- 0
pred["age", "aid"] <- 0

post <- make.post(data)
post["aid"] <- paste(sep = "; ",
                     "fall <- data$fall[!r[,j]] == 1",
                     "idx <- data$fall == 1",
                     "imp[[j]][!fall, i] <- levels(data$aid)[1]",
                     "imp[[j]][fall, i] <- mice.impute.pmm(y = data$aid[idx], ry = r[idx, j], x = model.matrix(aid ~ 0 + age, data[idx, , drop = FALSE]))")
                     
imp <- mice(data, m = 10, pred = pred, post = post, print = FALSE, seed = 1)
> data
   age fall  aid
1   60    0    -
2   54   NA <NA>
3   82    1  Yes
4   NA    0    -
5   34    1   No
6   58   NA <NA>
7   60    1  Yes
8   58    1 <NA>
9   76    0 <NA>
10  40    1   No

There are several features that need commenting:

  1. The data contain 10 rows. If fall is missing, then aid is also missing. The procedure does not deal with the case where fall is missing and aid is observed. If fall is zero, we will want to impute category -. If fall is 1, then we want to build the imputation model for aid on the subset of cases with fall == 1. If fall is missing, then we want to impute it. To maintain consistency among the imputes, I have placed the aid column adjacent to the fall column (alternatively, we may also tamper with the visitSequence).
  2. We need to remove aid from the imputation models for fall and age. Otherwise the problem is multi-collinear. Although mice will detect and handle this, it is better to explicitly control the model.
  3. The post statement will fit two imputation models for aid. If fall = 0 then it imputes simply -, if fall = 1 it imputes Yes or No.
> complete(imp)
   age fall aid
1   60    0   -
2   54    0   -
3   82    1 Yes
4   34    0   -
5   34    1  No
6   58    1 Yes
7   60    1 Yes
8   58    1 Yes
9   76    0   -
10  40    1  No
> imp$imp
$age
   1  2  3  4  5  6  7  8  9 10
4 34 76 54 76 60 58 60 60 82 82

$fall
  1 2 3 4 5 6 7 8 9 10
2 0 1 1 1 1 1 1 1 0  1
6 1 1 0 1 1 1 0 1 1  1

$aid
    1   2   3   4   5   6   7   8   9  10
2   -  No  No  No Yes Yes Yes  No   -  No
6 Yes  No   - Yes  No Yes   - Yes Yes  No
8 Yes Yes Yes Yes  No Yes  No Yes Yes Yes
9   -   -   -   -   -   -   -   -   -   -
  1. The first imputed dataset shows that the Yes/No imputes are only done for those rows with fall equal to 1, as intended. The remaining entries are imputed -.
  2. Rows 2 and 6 have missing fall, which is then imputed as either 0 or 1. Consequently, the aid column in the 10 imputed datasets can contain all three values: Yes, No or -, depending on how fall was imputed.
  3. Row 8 had an observed fall, so imputes for aid are always Yes or No.
  4. Row 9 has an observed no fall, so imputes for aid are always -.
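
The consistency rule described in these points (aid is - exactly when fall is 0) can also be checked mechanically. The following is a minimal sketch in base R, not part of mice: `completed` is the first completed dataset typed in by hand from the `complete(imp)` printout above, and `is_consistent()` is a hypothetical helper name.

```r
# First completed dataset, copied by hand from the complete(imp) printout above.
completed <- data.frame(
  age  = c(60, 54, 82, 34, 34, 58, 60, 58, 76, 40),
  fall = c(0, 0, 1, 0, 1, 1, 1, 1, 0, 1),
  aid  = factor(c("-", "-", "Yes", "-", "No", "Yes", "Yes", "Yes", "-", "No"))
)

# Hypothetical helper: TRUE when aid is "-" exactly for the rows with fall == 0.
is_consistent <- function(d) {
  all((d$fall == 0) == (d$aid == "-"))
}

is_consistent(completed)
```

With mice loaded, the same helper could presumably be applied to every completed dataset, for example via `sapply(complete(imp, "all"), is_consistent)`.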

I admit the procedure given above is somewhat complicated, but it does the job, and can be generalised to more difficult problems, for example, to three or more groups and for blocks of variables. Watch out for the case where one of the groups can become empty.
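
One way to spot a group that has become empty is to count the observed aid values available in each fall group before fitting. This is only a sketch in base R under the toy data above; `donor_counts()` is a hypothetical helper, not a mice function, and it assumes fall is coded 0/1.

```r
# Toy data from the example above (only the two columns needed here).
fall <- c(0, NA, 1, 0, 1, NA, 1, 1, 0, 1)
aid  <- factor(c("-", NA, "Yes", "-", "No", NA, "Yes", NA, NA, "No"))

# Hypothetical helper: number of observed aid values per fall group.
# Fixing the levels keeps an empty group visible as a zero count.
donor_counts <- function(fall, aid) {
  table(factor(fall[!is.na(aid)], levels = c(0, 1)))
}

counts <- donor_counts(fall, aid)
counts

# A zero count means the imputation model for that group has no observed cases.
any(counts == 0)
```

Inside a post-processing statement, fall has already been imputed at that point, so a similar count on the current data would have to be repeated at each iteration.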

An alternative is to write your own mice.impute.myimpute() function. For example, copy mice::mice.impute.norm(), adapt it as you see fit, store it in your active work space under the name mice.impute.myimpute(), and call mice with the proper elements in meth set to "myimpute".

Hope this is useful.

stefvanbuuren changed the title from "Conditional imputation allowing more than one imputed value" to "Conditional imputation with the post option" on Aug 22, 2020
@stefvanbuuren
Member

And here's a solution based on a custom imputation function.

library(mice, warn.conflicts = FALSE)

data <- data.frame(
  age = c(60, 54, 82, NA, 34, 58, 60, 58, 76, 40),
  fall = c(0, NA, 1, 0, 1, NA, 1, 1, 0, 1),
  aid = factor(c("-", NA, "Yes", "-", "No", NA, "Yes", NA, NA, "No"))
)

pred <- make.predictorMatrix(data)
pred["fall", "aid"] <- 0
pred["age", "aid"] <- 0

meth <- make.method(data)
meth["aid"] <- "aid"  # meth is a named character vector, so index with [ ] rather than $

mice.impute.aid <- function(y, ry, x, wy = NULL, ...) {
  if (is.null(wy)) wy <- !ry
  
  fall <- x[, "fall"][!ry] == 1
  idx <- x[, "fall"] == 1
  vec <- rep(NA, sum(wy))
  
  vec[!fall] <- 1
  vec[fall] <- as.integer(mice.impute.pmm(y[idx], ry = ry[idx], x = x[idx, "age", drop = FALSE], ...))
  
  levels(y)[vec]
}

imp <- mice(data, m = 10, maxit = 5, pred = pred, meth = meth, print = FALSE, seed = 1)
> imp$imp
$age
   1  2  3  4  5  6  7  8  9 10
4 34 60 58 40 82 34 76 58 34 58

$fall
  1 2 3 4 5 6 7 8 9 10
2 0 0 0 1 1 1 0 1 0  1
6 1 1 1 1 1 1 1 1 1  0

$aid
    1   2   3   4   5   6   7   8  9  10
2   -   -   - Yes  No  No   -  No  - Yes
6 Yes Yes Yes Yes Yes Yes Yes Yes No   -
8 Yes  No  No Yes  No  No  No Yes No Yes
9   -   -   -   -   -   -   -   -  -   -

stefvanbuuren changed the title from "Conditional imputation with the post option" to "Conditional imputation using post-processing or custom imputation" on Aug 22, 2020
stefvanbuuren changed the title from "Conditional imputation using post-processing or custom imputation" to "Conditional imputation: Two solutions" on Aug 22, 2020
@angelrodriguez2020
Author

angelrodriguez2020 commented Aug 24, 2020

Dear Stef,

Thank you for your help.

Your 'post' solution works for numeric variables to predict aid, but not for factors. Please see what happens when I introduce a factor as a predictor in your code:

data <- data.frame(
  age = c(60, 54, 82, NA, 34, 58, 60, 58, 76, 40),
  fall = c(0, NA, 1, 0, 1, NA, 1, 1, 0, 1),
  aid = factor(c("-", NA, "Yes", "-", "No", NA, "Yes", NA, NA, "No")),
  sex = factor(c("M", NA, "F", "F", "M", NA, "M", "F", "F", "M"))
)

pred <- make.predictorMatrix(data)
pred["fall", "aid"] <- 0
pred["age", "aid"] <- 0
pred["sex", "aid"] <- 0

post <- make.post(data)
post["aid"] <- paste(sep = "; ",
                     "fall <- data$fall[!r[,j]] == 1",
                     "idx <- data$fall == 1",
                     "imp[[j]][!fall, i] <- levels(data$aid)[1]",
                     "imp[[j]][fall, i] <- mice.impute.pmm(y = data$aid[idx], ry = r[idx, j], 
                     x = model.matrix(aid ~ 0 + age + sex , data[idx, , drop = FALSE]))")
                     
imp <- mice(data, m = 10, pred = pred, post = post, seed = 1)

>imp <- mice(data, m = 10, pred = pred, post = post, seed = 1)

 iter imp variable
  1   1  age  fall  aid
 Error in get("printFlag", parent.frame(search.parents("printFlag"))) : 
  object 'printFlag' not found 

@stefvanbuuren
Member

The devil is in the details. Nothing is wrong with your code.

Apparently, the introduction of mice.impute.pmm() at this - somewhat unusual - point in the algorithm surprised the function that finds the printing flag. We'll correct that.

@stefvanbuuren
Member

Solved in mice 3.11.2

@angelrodriguez2020
Author

Dear Stef,
Thank you.
I can update to mice 3.11.0, not to mice 3.11.2.
When would that be possible?
Ángel

@stefvanbuuren
Member

@angelrodriguez2020
Author

Thank you Stef, but I couldn't. This is what I got (R 4.0.2, Windows 10):

devtools::install_github(repo = "amices/mice")
Downloading GitHub repo amices/mice@HEAD
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: vctrs (0.3.3 -> 0.3.4) [CRAN]
5: backports (1.1.7 -> 1.1.9) [CRAN]

Enter one or more numbers, or an empty line to skip updates:1
vctrs (0.3.3 -> 0.3.4) [CRAN]
backports (1.1.7 -> 1.1.9) [CRAN]
Installing 2 packages: vctrs, backports
Installing packages into ‘C:/Users/arodr/Documents/R/win-library/4.0’
(as ‘lib’ is unspecified)

There are binary versions available but the source versions are later:
          binary source needs_compilation
vctrs      0.3.3  0.3.4              TRUE
backports  1.1.7  1.1.9              TRUE

installing the source packages ‘vctrs’, ‘backports’

trying URL 'https://ftp.cixug.es/CRAN/src/contrib/vctrs_0.3.4.tar.gz'
Content type 'application/x-gzip' length 995078 bytes (971 KB)
downloaded 971 KB

trying URL 'https://ftp.cixug.es/CRAN/src/contrib/backports_1.1.9.tar.gz'
Content type 'application/x-gzip' length 18903 bytes (18 KB)
downloaded 18 KB

  • installing source package 'vctrs' ...
    ** package 'vctrs' successfully unpacked and MD5 sums checked
    ERROR: cannot remove earlier installation, is it in use?
  • removing 'C:/Users/arodr/Documents/R/win-library/4.0/vctrs'
  • restoring previous 'C:/Users/arodr/Documents/R/win-library/4.0/vctrs'
    Error in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
    (converted from warning) problem copying C:\Users\arodr\Documents\R\win-library\4.0\00LOCK-vctrs\vctrs\libs\x64\vctrs.dll to C:\Users\arodr\Documents\R\win-library\4.0\vctrs\libs\x64\vctrs.dll: Permission denied
  • removing 'C:/Users/arodr/Documents/R/win-library/4.0/vctrs'
  • restoring previous 'C:/Users/arodr/Documents/R/win-library/4.0/vctrs'
    Error in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
    (converted from warning) problem copying C:\Users\arodr\Documents\R\win-library\4.0\00LOCK-vctrs\vctrs\libs\x64\vctrs.dll to C:\Users\arodr\Documents\R\win-library\4.0\vctrs\libs\x64\vctrs.dll: Permission denied
    Execution halted
    Error: Failed to install 'mice' from GitHub:
    (converted from warning) installation of package ‘vctrs’ had non-zero exit status

@stefvanbuuren
Member

Not related to mice, so please look elsewhere to solve this.

@LaurenceGeebelen

LaurenceGeebelen commented Sep 29, 2020

Dear all,

I'm trying to solve a similar problem. In my case, aid has an additional level, leaving three options when fall == 1, and I would like to use "polyreg" as the imputation method (as it is used for other multicategorical variables in my imputation model). I keep getting errors when adapting the following code to polyreg. Can someone help?

post["aid"] <- paste(sep = "; ",
"fall <- data$fall[!r[,j]] == 1",
"idx <- data$fall == 1",
"imp[[j]][!fall, i] <- levels(data$aid)[1]",
"imp[[j]][fall, i] <- mice.impute.pmm(y = data$aid[idx], ry = r[idx, j], x = model.matrix(aid ~ 0 + age, data[idx, , drop = FALSE]))")

Thank you!
Laurence

@amices amices locked and limited conversation to collaborators Apr 1, 2021
