Acceptable individual level hit rate

Dear Forum,

I have conducted an ACBC study with 5 attributes (excluding price): 4 attributes with 3 levels each and 1 attribute with 2 levels.

Because hit rates cannot be calculated automatically for ACBC (see: https://sawtoothsoftware.com/forum/12907/holdout-cards-within-acbc), I have calculated them myself, i.e. get the winning concepts on the individual level and compare these to my holdouts.
I calculated an average hit rate of 59% (1 = correct level chosen for an attribute, 0 = incorrect level chosen).

My question is whether this is within the acceptable range for predictive validity. Thanks and best regards,
asked Dec 2, 2017 by Benedikt

1 Answer

0 votes
Could you please explain further how you calculated the hit rates?

Typically, you would show respondents (who had also completed an ACBC exercise) some holdout tasks that look like CBC tasks and include exactly the same attributes and levels as the ACBC exercise, where a None concept is not presented to respondents.  Then, you compare the concept chosen in each holdout task to the concept that you would predict the respondent would be most likely to pick (by summing the utilities estimated from the ACBC experiment).  Raw hit rate is computed by taking 1=hit and 0=miss, and averaging across respondents and holdout tasks.
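The raw hit-rate calculation described above can be sketched in a few lines of Python. This is only an illustration with made-up part-worths, task definitions, and choices — none of the variable names or data structures here correspond to actual Sawtooth Software output formats:

```python
# Hypothetical sketch of the raw hit-rate calculation: sum each respondent's
# HB part-worth utilities for every concept in a holdout task, predict the
# concept with the highest total utility, and score 1 for a hit (prediction
# matches the actual choice) or 0 for a miss. All data below are invented.

# utilities[respondent_id] -> dict mapping (attribute, level) -> part-worth
utilities = {
    "r1": {("brand", "A"): 0.8, ("brand", "B"): -0.8,
           ("size", "small"): 0.3, ("size", "large"): -0.3},
    "r2": {("brand", "A"): -0.5, ("brand", "B"): 0.5,
           ("size", "small"): -0.2, ("size", "large"): 0.2},
}

# Each holdout task is a list of concepts; a concept is a list of
# (attribute, level) pairs. Two tasks with three concepts each.
holdout_tasks = [
    [[("brand", "A"), ("size", "small")],
     [("brand", "A"), ("size", "large")],
     [("brand", "B"), ("size", "small")]],
    [[("brand", "B"), ("size", "large")],
     [("brand", "A"), ("size", "small")],
     [("brand", "B"), ("size", "small")]],
]

# actual_choices[respondent_id] -> index of the chosen concept in each task
actual_choices = {"r1": [0, 0], "r2": [2, 0]}

def hit_rate(utilities, holdout_tasks, actual_choices):
    hits, total = 0, 0
    for resp, parts in utilities.items():
        for task_idx, concepts in enumerate(holdout_tasks):
            # Total utility of each concept = sum of its level part-worths
            totals = [sum(parts[attr_level] for attr_level in concept)
                      for concept in concepts]
            predicted = totals.index(max(totals))
            hits += int(predicted == actual_choices[resp][task_idx])
            total += 1
    return hits / total

print(hit_rate(utilities, holdout_tasks, actual_choices))  # 3 of 4 -> 0.75
```

The key point is that the whole concept with the highest summed utility is the prediction; hits and misses are then averaged across all respondents and all holdout tasks.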
answered Dec 3, 2017 by Bryan Orme Platinum Sawtooth Software, Inc. (131,990 points)
Dear Bryan,

I think that is exactly what I have done. I created 2 hold-out tasks that looked just like the ACBC exercise. After gathering the data, I took the following steps: I generated the winning concept for each individual and the choices in the CBC hold-out tasks for each individual. Next, I compared the attribute levels of the winning concept against the two hold-out tasks at the individual level. If the level of an attribute matched between the winning concept and the hold-out task, the attribute was coded as "1" for that individual, i.e. marked as a hit; if it did not match, the attribute was coded as "0". Thus, with five attributes (excluding price), the hit rate ranged from 0 to 5 for each individual. The resulting average hit rate across all respondents over the two hold-out tasks was 2.83, or 56.66%.
You are doing a few things non-conventionally.  First, it is strange to take the "winning concept" from each respondent (I'm assuming you're using the concept from ACBC that survives to the end of the choice tournament) and use the composition of that winning concept to score the holdouts.  Second, it is strange to count partial hits based on attribute levels.

Let's say you have 2 holdout tasks for each respondent that look like the choice tournament phase of the ACBC survey (3 concepts shown to respondents, with no None concept).  What you do is add the utility (from the estimated utilities from HB) for each concept.  You predict the respondent picks the whole concept from the triple that has the highest total utility score.  If it does not match what the respondent actually picked, then it is a miss.

So, because each respondent has just 2 holdout tasks, the possible hit scores are 2/2 = 100%, 1/2 = 50%, or 0/2 = 0%.  

I have some knowledge and experience about reasonable hit rates (for the sample) when following this procedure.  But, a lot depends on how many concepts you showed in the holdout tasks (2 vs. 3 vs. 4, etc.).
Dear Bryan,

thanks for the fast reply. This process is actually easier than the method I applied. Now my hit rate is 59.26%.

The hit rate is low because in the first holdout (which had a hit rate of 50.52%) the average total utilities were Concept 1 = 115.74, Concept 2 = 94.61, and Concept 3 = 129.08. Thus many respondents chose Concept 1 instead of Concept 3.

Best wishes,
Glad you found it easier.  Just to double-check your procedure: you are computing the total utility of the concepts for your two holdouts individually for each respondent, using each respondent's unique HB utilities, right?

Your comment about the first holdout having total utility of concept1=115.74, etc. is just a comment that on average you happened to create a somewhat challenging holdout task where all three concepts on average were very similar in utility, correct?
Hi Bryan,

thanks for correcting me (again). I have now re-done the calculations using each respondent's unique HB utilities (not the summary utilities) and matched each respondent's predicted hold-out pick with his/her actual pick.
The overall hit rate was 72.49%: for Hold-Out 1 it was only 66%, while for Hold-Out 2 it was 79%. As you suspected, the low hit rate for Hold-Out 1 is due to the fact that all three concepts were very similar in utility at the individual level as well.
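Splitting the hit rate out by holdout task, as done above, is just a matter of averaging the per-respondent hit/miss scores along different dimensions. A minimal sketch with invented hit/miss data (not the study's actual scores):

```python
# Illustrative aggregation of hit/miss scores (1 = hit, 0 = miss) into
# per-holdout and overall hit rates. The respondents and scores here are
# made up for demonstration only.
hits = {            # respondent -> [score on holdout 1, score on holdout 2]
    "r1": [1, 1],
    "r2": [0, 1],
    "r3": [1, 1],
    "r4": [0, 0],
}

n = len(hits)
# Hit rate for each holdout task: average that task's scores over respondents
per_task = [sum(h[t] for h in hits.values()) / n for t in range(2)]
# Overall hit rate: average over all respondents and both tasks
overall = sum(sum(h) for h in hits.values()) / (2 * n)

print(per_task)  # [0.5, 0.75]
print(overall)   # 0.625
```

A large gap between the per-task rates, as in this toy example, is a sign that one holdout was considerably harder to predict than the other.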

Would you say a hit-rate of 72.49% is sufficient for my experiment?

Thanks a lot and best regards,
Your hit rates for predicting holdouts (CBC-looking triples) look good to me.  Even very consistent respondents will answer the same way only about 70% to 80% of the time (test-retest reliability) if you show them the same three-concept CBC task twice, separated by a few other tasks so it isn't obvious that you are asking the same question twice within the same survey.

I have a couple data points to refer to that involved ACBC predicting CBC-looking holdouts, though these holdouts were designed to be challenging.  First, we asked respondents three holdout tasks.  Then, we moved the three winning concepts for each respondent into a final holdout task, where the respondent was now choosing among the three winning concepts picked earlier.

Here are the hit rates we got when using ACBC/HB utilities to predict the answers to this customized and challenging holdout triple (CBC-looking choice task with three concepts and no None):

> First Experiment: laptop computers, 10 attributes.  Hit rate 60%.

> Second experiment: recreational equipment product.  Hit rate 62%.

These two examples used holdouts that were probably tougher to answer consistently than your holdouts were.  Your holdout hit rates of 66% and 79% are a bit better.  If I were you, I'd be pleased with your results.
Hi Bryan,

your help was essential for my master's thesis, so thank you very much. I could not find any explanation in the Sawtooth Software documentation of how to calculate hit rates manually. If there really is no such explanation, I recommend integrating yours into the next software update. I think this would be very helpful for future ACBC experiments.

Best wishes,