Optimal Fit statistic for CBC/HB


In a previous post I asked about generating McFadden's rho-squared for an individual-level HB model, which, as Keith pointed out, is not a common request (I understand that this applies a frequentist fit statistic to a Bayesian analysis, which is probably not proper). My question is simple, though: what are the optimal fit statistics for CBC/HB output? The pooled aggregate logit analysis in Sawtooth supplies a number of fit statistics: statistical significance for attribute levels, overall model fit (LL model vs. LL null), AIC, BIC, etc. Yet the HB output provides no obvious fit or significance measures. This may be my ignorance of Bayesian analysis, but there must be some fit statistics I can quote to indicate how well the model fits the data, and I don't know what they are. Even the most uninformed client wants to know how well a model fits, so I need some help/advice.

Any advice would be most appreciated.   

asked Nov 8 by Jasha Bowe Bronze (1,680 points)

2 Answers

0 votes

I don't think it's wrong to use rho-squared if that's a statistic you and your clients are comfortable using. For the most part, I think folks use the RLH (root likelihood) fit measure when reporting how well HB model utilities fit respondents' choices. I believe folks find it a bit more intuitive than rho-squared, and clients have told me that it feels to them more like an R-squared value you'd get from a regression analysis.
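
For readers who want to see what RLH measures, here is a minimal sketch. It assumes the usual definition (the geometric mean of the logit probabilities the model assigns to the concepts a respondent actually chose); the function names and the example utilities are made up for illustration and are not part of the CBC/HB software.

```python
import math

def mnl_probs(utilities):
    """Logit choice probabilities for one task, given total utility per concept."""
    top = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def rlh(chosen_probs):
    """Root likelihood: geometric mean of the probabilities of the chosen concepts."""
    if not chosen_probs:
        raise ValueError("need at least one task")
    mean_log = sum(math.log(p) for p in chosen_probs) / len(chosen_probs)
    return math.exp(mean_log)

# Hypothetical respondent: three tasks of four concepts, chosen concept index per task.
tasks = [[1.2, 0.1, -0.4, 0.0], [0.3, 0.9, -1.0, 0.2], [0.0, 0.0, 2.0, -0.5]]
choices = [0, 1, 2]
probs = [mnl_probs(u)[c] for u, c in zip(tasks, choices)]
print(round(rlh(probs), 3))
```

With four concepts per task, a model that predicts no better than chance gives an RLH of 0.25, and a perfect model gives 1.0, which is part of why it reads a little like an R-squared.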
answered Nov 8 by Keith Chrzan Gold Sawtooth Software, Inc. (48,525 points)
0 votes
You certainly realize this, but it's worth saying for the benefit of others reading this post: the purpose of HB is not to obtain maximum fit to individual data. If we were trying to do that, we'd ignore the upper level of the hierarchy and use purely individual-level logit to obtain the maximum likelihood fit to each respondent's data. This would obviously lead to overfitting and poor predictions, given the sparse nature of CBC data at the individual level. So, we don't do that. We leverage the upper-level model (the population means and covariances).

So, HB sacrifices some fit at the individual level when it considers the upper-level model.  It takes a two-prong approach to fit: we want to find individual-level part-worths that do a good job fitting what individuals have chosen.  But, we also want to come up with individual-level part-worths that we believe have a reasonably high likelihood of belonging to the population (where we assume the population is distributed multivariate normal).
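
In symbols, the two prongs can be sketched as the two factors of each respondent's conditional posterior. The notation is assumed shorthand rather than anything printed by the software: \beta_i is respondent i's part-worths, y_i their observed choices, L_i the logit likelihood of those choices, and \alpha and D the population means and covariance matrix.

```latex
p(\beta_i \mid y_i, \alpha, D) \;\propto\;
  \underbrace{L_i(y_i \mid \beta_i)}_{\text{fit to respondent } i\text{'s choices}}
  \;\times\;
  \underbrace{\mathcal{N}(\beta_i \mid \alpha, D)}_{\text{plausibility within the population}}
```

Maximizing only the first factor would be purely individual-level logit; the second factor is what pulls sparse individual estimates toward the population.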

Now, for fit statistics from the HB output...  I really like looking at Pct. Cert. in the runtime statistics shown on the screen of the software.  Pct. Cert. is the Percent Certainty, which is McFadden's Rho-Squared.  This is a measure of individual-level fit.  I like to keep a record of past CBC studies I've done and the Pct. Cert. achieved for each study.  As long as I keep my HB settings the same (prior variance and prior D.F.), I can then compare the individual-level fit for the most current CBC study to previous CBC studies I've done.
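
As a rough illustration of what Pct. Cert. expresses, the sketch below uses the standard McFadden rho-squared formula, 1 - LL(model) / LL(null), with a null model that gives every concept in a task an equal chance. The function names and the numbers are hypothetical, and it assumes every task shows the same number of concepts.

```python
import math

def null_log_likelihood(n_tasks, n_concepts):
    """Log-likelihood of a chance model: each concept has probability 1/n_concepts."""
    return n_tasks * math.log(1.0 / n_concepts)

def percent_certainty(ll_model, n_tasks, n_concepts):
    """McFadden's rho-squared: 0 means chance-level fit, 1 means perfect fit."""
    return 1.0 - ll_model / null_log_likelihood(n_tasks, n_concepts)

# Hypothetical example: 12 tasks, 4 concepts each, and a model that assigns
# probability 0.5 to the chosen concept in every task (so RLH = 0.5).
ll_model = 12 * math.log(0.5)
print(percent_certainty(ll_model, n_tasks=12, n_concepts=4))
```

Because the log-likelihood is the number of tasks times ln(RLH), the two measures carry the same information once the number of concepts per task is fixed.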

Another tool I like to work with is our CBC/HB Model Explorer (free to download from our website, at least if you already have a CBC license). It randomly holds out a minority of the choice tasks for validation, repeating this across multiple replications, and then summarizes the hit rate for your data set across the replications.  This can be useful for assessing how good the HB utilities are at predicting held-out tasks.  And the nice thing is that you don't have to have created fixed holdout tasks in your survey for this tool to report the holdout hit rate; it just jack-knifes across your "random" tasks to use as holdouts.
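
The general idea of repeatedly holding out a random minority of tasks can be sketched as follows. This is only an illustration of the splitting step, not the Model Explorer's actual implementation; the function name, parameters, and the choice of two holdout tasks out of twelve are assumptions. Each replication would still require re-estimating HB on the estimation tasks and scoring predictions on the holdouts.

```python
import random

def holdout_splits(n_tasks, n_holdout, n_replications, seed=0):
    """Return (estimation_tasks, holdout_tasks) index lists, one pair per replication."""
    if not 0 < n_holdout < n_tasks / 2:
        raise ValueError("hold out a minority of the tasks")
    rng = random.Random(seed)  # seeded so the splits are reproducible
    tasks = list(range(n_tasks))
    splits = []
    for _ in range(n_replications):
        holdout = sorted(rng.sample(tasks, n_holdout))
        estimation = [t for t in tasks if t not in holdout]
        splits.append((estimation, holdout))
    return splits

# Hypothetical study: 12 random tasks per respondent, 2 held out, 5 replications.
for estimation, holdout in holdout_splits(12, 2, 5):
    print("estimate on", estimation, "validate on", holdout)
```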

When looking at hit rate, if I have 4 concepts in the task, then the null hit rate (by chance) is 1/4 = 25%.  If the holdout hit rate reported by the CBC/HB Model Explorer is 50%, then I know my holdout validation is 2x better than the chance level.  I do this over multiple CBC data sets to gain some experience with what this ratio can be for different studies I've conducted.
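
The arithmetic above can be written as a short sketch. The function names and the predicted and actual choices are made up for illustration; in practice the hit rate would come from the tool's report.

```python
def hit_rate(predicted, actual):
    """Share of holdout tasks where the predicted concept matches the chosen one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need equal-length, non-empty lists")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def lift_over_chance(rate, n_concepts):
    """Ratio of the observed hit rate to the chance rate of 1/n_concepts."""
    return rate / (1.0 / n_concepts)

# Hypothetical holdout results: four concepts per task, 4 of 8 predictions correct.
predicted = [0, 1, 2, 3, 0, 1, 2, 3]
actual = [0, 1, 2, 3, 1, 2, 3, 0]
rate = hit_rate(predicted, actual)
print(rate, lift_over_chance(rate, n_concepts=4))  # 0.5 2.0
```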

Depending on your client, you might find it more intuitive to report Pct. Cert. to them, or the hit rate from the CBC/HB Model Explorer.  If you do multiple studies for the same client, they can see how these two measures compare across the CBC studies they've conducted with you.
answered Nov 8 by Bryan Orme Platinum Sawtooth Software, Inc. (128,365 points)