I am a lonely peon currently researching and playing around with multiple imputation. I am using mice in R to impute randomly missing data; however, I've run into a problem when attempting to account for conditional or structured NAs in a dataset.
I'll provide a simplistic dataset in an attempt to illustrate my meaning:
    TestData <- data.frame(
      Condition     = c(1, 1, 1, 1, 2, NA, 2, 2),
      Dependent1    = c(1, NA, 2, 3, NA, NA, NA, NA),
      Dependent2    = c(1, 12, 44, 1, NA, NA, NA, NA),
      Dependent3    = c(NA, 2, 3, 5, NA, NA, NA, NA),
      UnaffiliatedQ = c(1, NA, 3, 2, 27, NA, 32, 35)
    )
    TestData$Condition <- factor(TestData$Condition, levels = c(1, 2), labels = c("Yes", "No"))
In my example, the variable "Condition" is a gatekeeper question which determines whether a respondent needs to fill in the next three questions (Dependent#). If a respondent answers "No", they never see the next three questions, so those answers are recorded as NAs, though they are not technically missing.
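To make the distinction concrete, here is a minimal sketch in base R that separates the NAs caused by the skip logic from the NAs that are candidates for imputation. The names `dep_cols`, `skipped`, `structural_na`, and `genuine_na` are my own, introduced only for illustration.

```r
# Same toy data as above, repeated so the snippet runs on its own.
TestData <- data.frame(
  Condition     = c(1, 1, 1, 1, 2, NA, 2, 2),
  Dependent1    = c(1, NA, 2, 3, NA, NA, NA, NA),
  Dependent2    = c(1, 12, 44, 1, NA, NA, NA, NA),
  Dependent3    = c(NA, 2, 3, 5, NA, NA, NA, NA),
  UnaffiliatedQ = c(1, NA, 3, 2, 27, NA, 32, 35)
)
TestData$Condition <- factor(TestData$Condition, levels = c(1, 2), labels = c("Yes", "No"))

dep_cols <- c("Dependent1", "Dependent2", "Dependent3")

# TRUE for respondents who answered "No" and therefore skipped the block.
skipped <- !is.na(TestData$Condition) & TestData$Condition == "No"

# Logical matrix (rows x dependent columns) marking every NA.
is_na_dep <- is.na(TestData[dep_cols])

# NAs that exist only because the question was never shown.
structural_na <- is_na_dep & skipped

# Everything else, including the row where Condition itself is NA,
# since we cannot tell whether that respondent saw the questions.
genuine_na <- is_na_dep & !skipped

colSums(structural_na)
colSums(genuine_na)
```

Note that the respondent with a missing Condition ends up in the second group, which is exactly the case I am unsure how to handle.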
I've come to ask what Stack Overflow users would do in this type of situation. If I impute the NA value in the Condition variable, along with those in Dependent1, Dependent2, and Dependent3, how would I ensure that I don't end up with values in Dependent# that don't make sense, i.e. that violate the skip constraint?
I've thought of possible solutions, but none that I think would be valid or a good idea (e.g., creating a structured missing value like -999, or subsetting the data frame based on the conditional answers).
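For reference, this is roughly what those two ideas look like in base R, plus a small helper that re-applies the skip rule to an already completed dataset. It is only a sketch of the workarounds, not something I am confident is statistically sound. `apply_skip_rule` and `fake_completed` are hypothetical names; the latter is a stand-in for one completed dataset such as `mice::complete()` would return, filled here with placeholder values rather than real imputations.

```r
TestData <- data.frame(
  Condition     = c(1, 1, 1, 1, 2, NA, 2, 2),
  Dependent1    = c(1, NA, 2, 3, NA, NA, NA, NA),
  Dependent2    = c(1, 12, 44, 1, NA, NA, NA, NA),
  Dependent3    = c(NA, 2, 3, 5, NA, NA, NA, NA),
  UnaffiliatedQ = c(1, NA, 3, 2, 27, NA, 32, 35)
)
TestData$Condition <- factor(TestData$Condition, levels = c(1, 2), labels = c("Yes", "No"))
dep_cols <- c("Dependent1", "Dependent2", "Dependent3")

# Idea 1: recode structural NAs to a sentinel value.
# My worry: a numeric -999 would presumably be treated as a real
# measurement by whatever model imputes the other variables.
skipped  <- !is.na(TestData$Condition) & TestData$Condition == "No"
sentinel <- TestData
sentinel[skipped, dep_cols] <- -999

# Idea 2: impute only among respondents who answered "Yes".
# This drops the row where Condition itself is missing.
answered_yes <- TestData[!is.na(TestData$Condition) & TestData$Condition == "Yes", ]

# Hypothetical helper: blank out the dependent answers again for
# anyone whose (observed or imputed) Condition is "No".
apply_skip_rule <- function(completed, dep_cols) {
  is_no <- !is.na(completed$Condition) & completed$Condition == "No"
  completed[is_no, dep_cols] <- NA
  completed
}

# Placeholder "completed" data: NOT real imputations, just fill-ins.
fake_completed <- TestData
fake_completed$Condition[is.na(fake_completed$Condition)] <- "No"
for (col in dep_cols) {
  fake_completed[[col]][is.na(fake_completed[[col]])] <- 0
}

cleaned <- apply_skip_rule(fake_completed, dep_cols)
cleaned
```

The helper only enforces the constraint after the fact; it does nothing to stop the structural NAs from influencing the imputation model in the first place, which is the part I am stuck on.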
Reading through the mice documentation and the authors' paper, I don't see any arguments in mice for this type of situation. The other possibility is that I've simply been running down the rabbit hole of multiple imputation and this is not a correct use of it.
I appreciate your thoughts and help, all!