Abnormally large estimates in latent class analysis


We conducted a survey with a discrete choice experiment (4 attributes, 1 with 6 levels and 3 with 3 levels; 12 choice sets; 10 blocks) and used Analysis Manager within Lighthouse for the statistical analysis.
We ran a latent class model and decided on a solution with 4 classes (segment sizes: N=50/90/44/44). One of the smallest classes, with N=44, has abnormally large coefficients (e.g. -6.2/1.4/4.8). We tried different settings (different starting seed, number of iterations, convergence limit). The best replication always came up with large coefficients. In the 3- and 5-class solutions there is also one class with abnormally large estimates, and it is not always the class with the fewest members.

We are not sure how to interpret these large estimates. Shouldn't the coefficients be close to 1 or -1 at most, since the data are effects-coded? Are the large coefficients a sign of overestimation?

Appreciate any help.

Kind regards,
asked Oct 25, 2017 by Andrew Bronze (780 points)

1 Answer

0 votes
Large coefficients mean that people within that class are very consistent in choosing concepts featuring some levels versus others. A 4.8 versus a -6.2 in the logit equation will make the likelihood of picking a concept with the 4.8 level MUCH higher than that of picking a concept with the -6.2 level. It could be due to overfitting. Or, it could be due to a true, reproducible effect, where this segment is very consistent in preferring one level to another.
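To make the logit arithmetic concrete, here is a minimal sketch in Python. It assumes a simplified choice task with two concepts that differ only on this one attribute, so each concept's total utility is just the level's coefficient; the function name `choice_probabilities` is made up for illustration and is not part of Lighthouse Studio. Note also that effects coding only forces the levels of an attribute to sum to zero (as -6.2, 1.4 and 4.8 do, if they belong to one three-level attribute); it does not limit how large the coefficients can be.

```python
import math

def choice_probabilities(utilities):
    """Multinomial logit shares for one choice task (hypothetical helper)."""
    # Subtract the maximum utility before exponentiating for numerical stability.
    top = max(utilities)
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Two concepts that differ only in the level of one attribute (an assumption).
large = choice_probabilities([4.8, -6.2])    # coefficients reported in the question
modest = choice_probabilities([1.0, -1.0])   # the "close to 1 or -1" expectation

print(f"4.8 vs -6.2: P(first concept) = {large[0]:.5f}")
print(f"1.0 vs -1.0: P(first concept) = {modest[0]:.5f}")
```

Under these assumptions the class with the large coefficients picks the preferred concept almost every time, while coefficients of 1 and -1 would still leave roughly a 12% chance of choosing the other concept. That near-deterministic behaviour is what "very consistent" means here.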

A sample size of n=44 for a segment of respondents in choice models tends to be on the thin side. Many researchers in choice modeling prefer sample sizes of at least n=200 per segment. Other researchers are willing to dip as low as n=100. It is somewhat subjective. But n=44 seems relatively small for a separately analyzed segment in choice modeling, so perhaps the latent class algorithm has capitalized on a chance pattern of choices, as can happen in segments with small sample sizes.
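A rough back-of-the-envelope count shows how thin the data are for the small classes. The sketch below uses the design from the question and makes several simplifying assumptions: no None or constant parameter, every respondent answered all 12 choice sets, and each respondent is counted fully in one class (latent class actually assigns probabilistic memberships). The variable names are purely illustrative.

```python
# Design taken from the question: one 6-level attribute and three 3-level attributes.
levels_per_attribute = [6, 3, 3, 3]
tasks_per_respondent = 12
segment_sizes = [50, 90, 44, 44]

# With effects coding, each attribute contributes (levels - 1) free parameters.
params_per_class = sum(k - 1 for k in levels_per_attribute)

# Choice observations available to each class under hard assignment (an assumption).
choices_per_class = [n * tasks_per_respondent for n in segment_sizes]
choices_per_param = [c / params_per_class for c in choices_per_class]

for n, c, r in zip(segment_sizes, choices_per_class, choices_per_param):
    print(f"n={n}: {c} choices, {r:.0f} choices per parameter")
```

With 11 parameters per class, a class of 44 respondents contributes only about 528 choices, or roughly 48 per parameter, so a handful of very consistent respondents can push the estimates for that class to extreme values.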
answered Oct 25, 2017 by Bryan Orme Platinum Sawtooth Software, Inc. (132,290 points)

Thank you very much for your comment. It was very helpful.
And you are absolutely right. We discussed the issue with the small sample size and tried to find a balance between reasonable estimates in terms of standard errors and plausibility of results.