averaging imputation of missing values

runningbirds

I have a few questions that I couldn't find answered in the documentation, unless I'm missing something or don't understand the imputation process/logic.

The most important one: since the imputed values sometimes differ between the imputed datasets, I'd like to take the average if the variable is numeric, or the mode if it is categorical.

All the examples I see use "complete(miced_model, 1)". If I'm running the mice model with 5 or 10 imputed datasets, I don't see the point in picking just one of them. I'd like the average of all of them.

Can anyone show me how to do this?

set.seed(2016)
library(mice)
nhanes  # this is the dataset (ships with mice)
nhanes[5, 1] = NA  # set some ages to missing to get categorical examples
nhanes[1, 1] = NA
nhanes$age = as.factor(nhanes$age)
imputed_values = mice(nhanes, m = 5, method = 'rf', maxit = 3)
new_nhanes = complete(imputed_values, 'long')  # or 'repeated'? or what?

# new_nhanes_fixed = ?  # wanted: a new data frame with averaged imputed values rather than just the arbitrary 1st imputed dataset

THANKS!!

r missing-data imputation r-mice

Answers

answered 1 year ago stats0007 #1

You should look at SimonG's comment.

You are on the wrong track. The whole point of multiple imputation is that you get several different imputed datasets. You run your analysis on each of them and then combine the results, so the differences between the imputed values carry the uncertainty about the missing data into the final estimates.

If you don't need multiple imputation, you can use a single imputation method directly (for example the kNN or irmi functions from the VIM package).
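To see what "different imputed datasets" means in practice, here is a minimal sketch on the nhanes data. It assumes the default imputation methods instead of the 'rf' method from the question (to avoid the extra dependency); the object name imp is just illustrative.

```r
library(mice)

set.seed(2016)
# m = 5 imputed datasets; default methods are an assumption, not the question's 'rf'
imp <- mice(nhanes, m = 5, maxit = 3, printFlag = FALSE)

# One row per originally missing bmi value, one column per imputation
imp$imp$bmi

# Spread of the 5 imputed values for each missing cell;
# averaging them into one value would throw this information away
bmi_spread <- apply(imp$imp$bmi, 1, sd)
bmi_spread
```

A nonzero spread is exactly the imputation uncertainty that the later pooling step is meant to account for.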
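If a single completed data frame is really what you want, a rough sketch with VIM's kNN could look like the following. It reuses the setup from the question; k = 5 and imp_var = FALSE are my choices, not something from the thread. As far as I know, kNN by default fills numeric variables with the median of the k nearest neighbours and categorical ones with the most frequent category, which is close to the "average or mode" idea.

```r
library(VIM)

# Same setup as in the question: nhanes from mice, two ages set to missing
nhanes <- mice::nhanes
nhanes[c(1, 5), "age"] <- NA
nhanes$age <- as.factor(nhanes$age)

# Single imputation by k nearest neighbours (k = 5 is an arbitrary choice);
# imp_var = FALSE suppresses the extra *_imp indicator columns
nhanes_knn <- kNN(nhanes, k = 5, imp_var = FALSE)

head(nhanes_knn)
```

Keep in mind that an analysis on this single data frame treats the imputed values as if they had been observed, so standard errors will tend to be too small.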

answered 1 year ago wissem #2

It sounds like you want to pool the results of your analysis: you run the analysis on every imputed dataset and then combine the estimates, rather than averaging the imputed values themselves. Read more about pooling here: https://www.r-bloggers.com/imputing-missing-data-with-r-mice-package/
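A minimal sketch of that workflow with mice's with() and pool() is below. The linear model bmi ~ age + chl is only a placeholder for whatever analysis you actually need, and the default imputation methods are assumed instead of 'rf'.

```r
library(mice)

set.seed(2016)
# Create m = 5 imputed datasets (default methods, an assumption for this sketch)
imp <- mice(nhanes, m = 5, maxit = 3, printFlag = FALSE)

# Fit the same (placeholder) model on each of the 5 completed datasets
fit <- with(imp, lm(bmi ~ age + chl))

# Combine the 5 sets of estimates with Rubin's rules
pooled <- pool(fit)
summary(pooled)
```

The pooled standard errors include the between-imputation variance, which is why this is preferred over averaging the imputed cells and analysing one data frame.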
