Likelihood Test for Interaction Effects - Degrees of Freedom

Dear all,

I have computed a likelihood-ratio test to test the external validity of the interaction-effects model vs. the "generic" main-effects model.
I thought that the degrees of freedom used to find the critical value for the chi-squared distribution would be equal to the number of parameters added (https://sawtoothsoftware.com/forum/13313/the-2-log-likelihood-test-for-interaction-effects?show=13313).
In my study, there are five attributes with 3 levels each, leading to 10 degrees of freedom according to the information above (see link).
I added 2 interaction effects, i.e. A x B and B x C, leading to 18 additional parameters (!) in the model (without interaction effects, the model includes 16 parameters incl. the none-option). To find the critical value from the chi-squared distribution, which value do I have to use? DF = 18 (since 18 parameters were added)?

Also, my analyses showed that the interaction-effects model fits the observed data significantly better (according to the likelihood-ratio test, see the questions above). However, the goodness-of-fit criteria such as percent certainty (28.2 main, 28.83 interaction), RMSE and MAE increased only slightly, and the hit rate even decreased. Also, the interaction effects are not very interpretable across the four segments (I am doing an LC analysis). I could interpret them, but the model would become much more complex, and the sample size (N = 265) is not that large.
Based on this information, is it fair to go with the main-effects model? Could there also be an issue with "overfitting"?

Best regards!
asked Jul 29, 2017 by briniminii (490 points)
retagged Jul 29, 2017 by Walter Williams

1 Answer

0 votes
Best answer
Several things to note here:

First, percent certainty is on an altogether different scale than the log likelihood, so the fact that it rises by what seems like only a little really doesn't matter.

The same goes for your fits of predictions to holdouts (hit rate and, I suspect, MAE and RMSE): they measure different things, and without the sensitivity of the log-likelihood test.

Finally, if you have 4 segments then I think your interactions add 18 x 4 = 72 parameters, since each segment gets its own set of interaction terms, so that may be part of the problem. Are the interactions still significant after using 72 d.f. for the chi-squared test?
answered Jul 29, 2017 by Keith Chrzan Platinum Sawtooth Software, Inc. (85,325 points)
selected Jul 29, 2017 by briniminii
Thank you!
Working with 72 d.f. gives me a critical value for chi-square of about 92. The likelihood-ratio test (-2 x (LL interaction - LL main)) gives me -46, so it is not significant - is that correct?

Regarding hit rate, MAE and RMSE: is it correct to say that the hit rate is a measure of the model's predictive accuracy and is used to measure internal validity whereas MAE and RMSE are measures for external validity?

Thanks for your help! :-)
Yes, you are correct that the LL test is not significant.
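
For anyone who wants to reproduce the check, here is a minimal sketch in Python using only the standard library. It assumes the usual 0.05 significance level (the thread does not state one) and uses the Wilson-Hilferty approximation for the chi-squared critical value instead of an exact quantile. The function names are made up for illustration. If SciPy is available, `scipy.stats.chi2.ppf(0.95, df)` gives the exact value.

```python
from math import sqrt
from statistics import NormalDist


def chi2_critical(df, alpha=0.05):
    """Approximate upper-tail chi-squared critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - alpha)  # standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


def lr_statistic(ll_main, ll_interaction):
    """Likelihood-ratio statistic; non-negative when the larger model fits at least as well."""
    return 2.0 * (ll_interaction - ll_main)


# Values taken from the thread: 18 extra parameters per segment, 4 segments.
extra_params_per_segment = 18
n_segments = 4
df = extra_params_per_segment * n_segments  # 72

lr_stat = 46.0  # magnitude of the statistic reported in the thread
critical = chi2_critical(df)
significant = lr_stat > critical

print(f"d.f. = {df}, critical value ~ {critical:.1f}, LR = {lr_stat}, significant: {significant}")
```

Note the sign convention: the statistic is normally written as 2 x (LL interaction - LL main), or equivalently -2 x (LL main - LL interaction), so it comes out positive. The -46 reported above corresponds to a statistic of 46, which is well below the critical value of roughly 92 to 93.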

Hit rates, MAE and RMSE are all, in a sense, measures of predictive validity. If your holdouts are holdout questions and not holdout respondents (as they necessarily are for hit rates), then they are not impressive as measures of external validity, for which you need to be able to predict to a new sample. So if you have holdout respondents, you can make a more convincing case that you have external validity.
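
To make the three measures concrete, here is a small sketch of how they are commonly computed. It assumes the hit rate compares predicted and actual choices in individual holdout tasks, and that MAE and RMSE compare predicted and observed choice shares. All numbers are hypothetical and only illustrate the arithmetic; they are not from the study discussed here.

```python
from math import sqrt


def hit_rate(predicted_choices, actual_choices):
    """Share of holdout tasks where the predicted choice matches the actual one."""
    hits = sum(p == a for p, a in zip(predicted_choices, actual_choices))
    return hits / len(actual_choices)


def mae(predicted_shares, observed_shares):
    """Mean absolute error between predicted and observed choice shares."""
    errors = [abs(p - o) for p, o in zip(predicted_shares, observed_shares)]
    return sum(errors) / len(errors)


def rmse(predicted_shares, observed_shares):
    """Root mean squared error between predicted and observed choice shares."""
    squared = [(p - o) ** 2 for p, o in zip(predicted_shares, observed_shares)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical holdout data: chosen alternative per task, and shares per alternative.
predicted_choices = [1, 3, 2, 2]
actual_choices = [1, 3, 1, 2]
predicted_shares = [0.40, 0.35, 0.25]
observed_shares = [0.45, 0.30, 0.25]

print("hit rate:", hit_rate(predicted_choices, actual_choices))
print("MAE:", mae(predicted_shares, observed_shares))
print("RMSE:", rmse(predicted_shares, observed_shares))
```

A higher hit rate is better, while lower MAE and RMSE are better. Whichever measure is used, it speaks to external validity only to the extent that the holdout data come from respondents who were not used in estimation.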