How should I interpret a low value of percent certainty in Logit analysis?

Dear all,

I set up a survey where respondents had to answer 8 best-worst questions. In each question, 4 attributes were shown.

The initial sample size was around 1800.
But 800 were disqualified (speeders, etc.), resulting in a final sample size of 1001 respondents.

I ran logit analysis and got the following results:

Log-likelihood for this model        -25518,11507
Log-likelihood for null model        -28998,85231
Difference                             3480,73724

Percent Certainty                        12,00302
Akaike Info Criterion                 38633,44920
Consistent Akaike Info Criterion      38694,21861
Bayesian Information Criterion        38687,21861
Adjusted Bayesian Info Criterion      38664,97311
Chi-Square                             5786,33177
Relative Chi-Square                     826,61882

The p-value is 0.003448, significant at p < 0.01.

The percent certainty seems very low.
I read in the forum that percent certainty is the equivalent of McFadden's rho-squared (pseudo R²), and I know that a low R² in the social and behavioral sciences is not a problem.
But in this case, I have no idea and I didn't find any percent certainty benchmark.
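
For readers who want to check the equivalence mentioned above, here is a minimal sketch that recomputes percent certainty from the two log-likelihoods in the output. It assumes percent certainty is McFadden's rho-squared, 1 - LL(model) / LL(null), expressed as a percentage. The variable names are illustrative only and are not part of any Sawtooth Software tool.

```python
# Log-likelihoods copied from the logit output above
# (decimal commas converted to decimal points).
ll_model = -25518.11507
ll_null = -28998.85231

# Improvement of the fitted model over the null (chance) model.
difference = ll_model - ll_null

# McFadden's rho-squared: share of the null log-likelihood
# that the fitted model "explains".
rho_squared = 1.0 - ll_model / ll_null

# Assumption: percent certainty is rho-squared expressed as a percentage.
percent_certainty = 100.0 * rho_squared

print(f"Difference:        {difference:.5f}")
print(f"Percent certainty: {percent_certainty:.5f}")
```

Under that assumption, the result agrees with the reported difference and percent certainty to the displayed precision.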

So, I would like to ask your opinion about it.
Is the percent certainty good enough?
What should I do to increase it?

Thank you very much in advance.

Best regards.

Mikael
asked Feb 7 by Mikael Linder

1 Answer

+1 vote
Mikael,

You're right that a low percent certainty isn't necessarily an odd finding.  That's true in the social sciences literature and it's true in choice experiments.  A lot of things can cause it (e.g. errors in coding, errors in data collection) but the most common cause is respondent heterogeneity in terms of preferences.  We know that respondents often have different tastes from one another, and that very often is the cause of low rho-squared.

To take an extremely simple example, imagine a population of respondents doing a choice experiment about color, a single attribute with two levels, green and orange. If half the respondents always choose green and half always choose orange, then each respondent has extremely strong preferences. And if you accounted for that heterogeneity you might find that you were predicting each respondent's preferences with close to 100% certainty. But if you run an aggregate logit model on your very heterogeneous sample, the model will estimate a utility of 0.0 for green and 0.0 for orange, because across the whole sample the two are chosen equally often. A rho-squared (or percent certainty) of 0.0 will result.
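
The green/orange example can be reproduced with a few lines of Python. This is only an illustrative sketch, not the estimator Sawtooth Software uses. It assumes 500 hypothetical respondents per segment with one choice each, and it relies on the fact that in a constants-only binary logit the fitted choice probability equals the observed choice share, so no optimizer is needed. Probabilities are clipped slightly away from 0 and 1 to avoid taking log(0).

```python
import math

def log_likelihood(choices, p_green):
    """Log-likelihood of a list of 'green'/'orange' choices
    when every choice has probability p_green of being green."""
    eps = 1e-12  # clip to avoid log(0) for segments that never waver
    p = min(max(p_green, eps), 1.0 - eps)
    return sum(math.log(p) if c == "green" else math.log(1.0 - p)
               for c in choices)

def rho_squared(ll_model, ll_null):
    """McFadden's rho-squared (percent certainty as a proportion)."""
    return 1.0 - ll_model / ll_null

# Hypothetical sample: half always pick green, half always pick orange.
green_lovers = ["green"] * 500
orange_lovers = ["orange"] * 500
everyone = green_lovers + orange_lovers

# Null model: both colors equally likely.
ll_null = log_likelihood(everyone, 0.5)

# Aggregate model: one fitted probability for the whole sample,
# equal to the overall share of green choices (0.5 here).
share_green = everyone.count("green") / len(everyone)
ll_aggregate = log_likelihood(everyone, share_green)
rho_aggregate = rho_squared(ll_aggregate, ll_null)

# Segment-level model: each segment gets its own fitted probability.
ll_segments = (log_likelihood(green_lovers, 1.0)
               + log_likelihood(orange_lovers, 0.0))
rho_segments = rho_squared(ll_segments, ll_null)

print(f"Aggregate rho-squared:     {rho_aggregate:.4f}")
print(f"Segment-level rho-squared: {rho_segments:.4f}")
```

The aggregate model does no better than chance, while the model that knows each respondent's segment fits almost perfectly, which is the pattern described above.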

Of course in more complex models with lots of heterogeneity a similar thing happens and that prevents your percent certainty from being very high.  This is evidence that you may want to account for your sample's heterogeneity, say by running a latent class MNL or a mixed MNL model like hierarchical Bayesian (HB) MNL.
answered Feb 7 by Keith Chrzan Platinum Sawtooth Software, Inc. (90,475 points)
Dear Keith,

Thank you very much.

Have a nice weekend.

Kind regards,

Mikael
You, too, Mikael!
Hello Mikael and Keith,

Very useful question and answer.

So is percent certainty reported only in the Logit analysis, or also in Latent Class and HB? And does it have the same meaning there as well (McFadden's rho-squared, pseudo R²)?

Thanks a lot
Yes, it gets reported for HB and for LC-MNL as well.
I reviewed the results for the LC-MNL and found it in the final reported results. But for HB, I found that it is reported for each iteration individually. How can I extract the final overall percent certainty of the applied model? Is it by taking the highest number overall, or the average of the numbers after convergence (the last 10K iterations)?
And how important is it for comparing and choosing between models?
Thanks again
Rho-squared is commonly reported in academic papers, but not so much in applications. If you want to report it for HB, here's a pretty good description of that process: https://sawtoothsoftware.com/forum/12252/rlh-and-percent-certainty
...