Multiple Imputation (mice) and "Conditional Missings" in R

Madcap Source

I am using mice in R to impute random missing data. I ran into a problem when attempting to account for conditional or structured NAs in a dataset.

A simple dataset to illustrate the problem:

TestData <- data.frame(Condition     = c(1, 1, 1, 1, 2, NA, 2, 2),
                       Dependent1    = c(1, NA, 2, 3, NA, NA, NA, NA),
                       Dependent2    = c(1, 12, 44, 1, NA, NA, NA, NA),
                       Dependent3    = c(NA, 2, 3, 5, NA, NA, NA, NA),
                       UnaffiliatedQ = c(1, NA, 3, 2, 27, NA, 32, 35))

TestData$Condition <- factor(TestData$Condition,
                         levels = c(1,2),
                         labels = c("Yes","No"))

In this example, the variable Condition is a gatekeeper question that determines whether a respondent needs to fill in the next three questions (Dependent1, Dependent2, and Dependent3). If a respondent answers "No", they never see those three questions, so the values are recorded as NA even though they are not technically missing.
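
To make the distinction concrete, here is a minimal base R sketch that labels each row by the kind of NA it has in the Dependent columns. The names dependents, skipped, has_na, and na_type are only illustrative; they are not part of mice.

```r
# Same example data as above, rebuilt so this snippet runs on its own.
TestData <- data.frame(
  Condition     = factor(c(1, 1, 1, 1, 2, NA, 2, 2),
                         levels = c(1, 2), labels = c("Yes", "No")),
  Dependent1    = c(1, NA, 2, 3, NA, NA, NA, NA),
  Dependent2    = c(1, 12, 44, 1, NA, NA, NA, NA),
  Dependent3    = c(NA, 2, 3, 5, NA, NA, NA, NA),
  UnaffiliatedQ = c(1, NA, 3, 2, 27, NA, 32, 35)
)

dependents <- c("Dependent1", "Dependent2", "Dependent3")

# Rows where the gatekeeper was answered "No": the follow-ups were never shown.
skipped <- !is.na(TestData$Condition) & TestData$Condition == "No"

# Rows with at least one NA among the Dependent columns.
has_na <- rowSums(is.na(TestData[dependents])) > 0

# "structural": NA because the question was skipped by design
# "missing":    gatekeeper was "Yes" but an answer is absent
# "unknown":    the gatekeeper itself is NA, so the kind of NA cannot be told
na_type <- ifelse(!has_na, "complete",
           ifelse(skipped, "structural",
           ifelse(is.na(TestData$Condition), "unknown", "missing")))

table(na_type)
```

Only the "missing" rows are ones where I would want imputed Dependent values without any doubt; the "unknown" row is the one that makes this hard.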

What can I do in this type of situation? If I impute the NA value in the Condition variable, along with those in Dependent1, Dependent2, and Dependent3, how can I ensure that I don't end up with values in the Dependent variables that don't make sense (for example, a respondent with Condition "No" who nevertheless has imputed Dependent values)?

I've thought of possible solutions, e.g., coding the structured missing values with a placeholder like -999, or subsetting the data frame based on the conditional answers, but none that I think would be valid or a good idea.
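
For reference, a rough base R sketch of those two workarounds and where each seems to fall short. This is only an illustration on the toy data, not a recommended approach; the names coded and yes_only are made up for the example.

```r
# Same example data as above, rebuilt so this snippet runs on its own.
TestData <- data.frame(
  Condition     = factor(c(1, 1, 1, 1, 2, NA, 2, 2),
                         levels = c(1, 2), labels = c("Yes", "No")),
  Dependent1    = c(1, NA, 2, 3, NA, NA, NA, NA),
  Dependent2    = c(1, 12, 44, 1, NA, NA, NA, NA),
  Dependent3    = c(NA, 2, 3, 5, NA, NA, NA, NA),
  UnaffiliatedQ = c(1, NA, 3, 2, 27, NA, 32, 35)
)

dependents <- c("Dependent1", "Dependent2", "Dependent3")
skipped <- !is.na(TestData$Condition) & TestData$Condition == "No"

# Workaround 1: recode structural NAs with a placeholder value.
coded <- TestData
coded[skipped, dependents] <- -999

# The placeholder is now treated as a real number by anything numeric,
# including any model that uses Dependent2 as a predictor.
mean(TestData$Dependent2, na.rm = TRUE)  # observed answers only
mean(coded$Dependent2, na.rm = TRUE)     # dragged far below the observed range

# Workaround 2: keep only respondents who were actually shown the follow-ups.
yes_only <- TestData[!is.na(TestData$Condition) & TestData$Condition == "Yes", ]
nrow(yes_only)

# The respondent whose Condition is NA belongs to neither subset.
sum(is.na(TestData$Condition))
```

With the placeholder, the -999 values would feed into any imputation model as if they were real answers; with the subset, the row where Condition itself is NA has nowhere to go.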

In reading through the documentation and the paper by the mice authors, I don't see any arguments in mice for this type of situation. The other possibility is that I've simply gone down the rabbit hole of multiple imputation and this is not a correct use of it.

r  imputation  r-mice
