Finding the best LCA model with the poLCA R package

fritzz Source

I am running latent class analysis (LCA) with the poLCA R package, but after three days the analysis still has not finished (it has not found the best model yet), and it occasionally reports the following: "ALERT: iterations finished, MAXIMUM LIKELIHOOD NOT FOUND". So I cancelled the process at 35 latent classes. I am analyzing 16 variables (all of them categorical) and 36036 rows of data. When I tested variable importance for the 16 variables with the Boruta package, all 16 came out as important, so I used all of them in the LCA with poLCA. Which path should I follow? Should I use another clustering method, such as k-modes, to cluster the categorical variables in this dataset? I run poLCA with maxiter=500 and nrep=10 (the number of times each model is estimated). The R script I use to find the best model and one of its outputs are as follows:

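    library(poLCA)

    # f (the model formula) and data are assumed to be defined beforehand
    min_bic <- Inf            # start above any attainable BIC
    LCA_best_model <- NULL
    for (i in 2:50) {
        lc <- poLCA(f, data, nclass = i, maxiter = 500,
                    tol = 1e-5, na.rm = FALSE,
                    nrep = 10, verbose = TRUE, calc.se = TRUE)
        # keep the fit with the lowest BIC seen so far
        if (lc$bic < min_bic) {
            min_bic <- lc$bic
            LCA_best_model <- lc
        }
    }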

    =========================================================
    Fit for 35 latent classes:
    =========================================================
    number of observations: 36036
    number of estimated parameters: 2029
    residual degrees of freedom: 34007
    maximum log-likelihood: -482547.1

    AIC(35): 969152.2
    BIC(35): 986383
    G^2(35): 233626.8 (Likelihood ratio/deviance statistic)
    X^2(35): 906572555 (Chi-square goodness of fit)

    ALERT: iterations finished, MAXIMUM LIKELIHOOD NOT FOUND
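
One way to keep a search like this from running for days is to stop once BIC has stopped improving, rather than fitting every model up to 50 classes. The following is only a minimal sketch of that idea, not something taken from poLCA itself: `select_by_bic`, `patience`, and `toy_fit` are hypothetical names introduced here, and the stand-in fitting function exists only so the example runs without the dataset. It assumes each fit is a list with a numeric `bic` element, as poLCA fits are.

```r
# Hypothetical helper: fit models over a range of class counts and stop
# early once BIC has failed to improve `patience` times in a row.
# `fit_fun` is any function(k) returning a list with a numeric `bic` element.
select_by_bic <- function(fit_fun, class_range = 2:10, patience = 2) {
  best_bic <- Inf
  best_fit <- NULL
  best_k <- NA_integer_
  bics <- numeric(0)
  worse_in_a_row <- 0
  for (k in class_range) {
    fit <- fit_fun(k)
    bics <- c(bics, fit$bic)
    if (fit$bic < best_bic) {
      # new best model: remember it and reset the counter
      best_bic <- fit$bic
      best_fit <- fit
      best_k <- k
      worse_in_a_row <- 0
    } else {
      worse_in_a_row <- worse_in_a_row + 1
      if (worse_in_a_row >= patience) break
    }
  }
  # label each recorded BIC with the class count it belongs to
  names(bics) <- class_range[seq_along(bics)]
  list(best_k = best_k, best_fit = best_fit, bic = bics)
}

# Stand-in fitting function (not real LCA): BIC is lowest at k = 4 by construction.
toy_fit <- function(k) list(bic = (k - 4)^2 + 100)
res <- select_by_bic(toy_fit, class_range = 2:50, patience = 2)
print(res$best_k)
print(res$bic)

# With poLCA, the stand-in would be replaced by something like the line below
# (f and data as in the script above; maxiter = 5000 is an illustrative value):
# res <- select_by_bic(function(k) poLCA(f, data, nclass = k, maxiter = 5000,
#                                        nrep = 10, verbose = FALSE, calc.se = FALSE))
```

Whether a larger `maxiter` removes the alert for this dataset is not known; the point of the sketch is only that the loop need not continue to 50 classes once BIC is clearly rising.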

Tags: r, for-loop, cluster-analysis, data-science, categorical-data

