
Do individual level MBC simulator results disregard conditional relationships?


Up front: my question is rather academic, I suppose. My results from the simulation are pretty good when looking at predicted aggregate shares and individual choices for holdout tasks.

I found that the MBC simulator assigns utilities to options where, based on the specified conditions (i.e. choice dependent on another option being chosen), none should exist. I noticed this when looking at individual-level hit rates. Now, the aggregate market share for an option is the average of the choice probabilities for that option (shares tab), so these utilities apparently serve a purpose for aggregate results.

I just felt that the aggregate market share of an option should be based only on the average of choice probabilities from those respondents for whom that item was, in fact, an option and whose choices were considered in estimating the utility function. Instead, it seems that the utility function is applied to every respondent, regardless of whether that respondent had fulfilled the condition for making that choice (e.g. availability, or the first decision in a two-stage choice scenario).
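To make the difference concrete, here is a minimal sketch (hypothetical probabilities and eligibility flags, not from any real study) of averaging choice probabilities over all respondents versus only over the respondents for whom the option was actually available:

```python
import numpy as np

# Hypothetical individual-level choice probabilities for one menu option.
probs = np.array([0.80, 0.65, 0.10, 0.05, 0.02])
# Hypothetical eligibility flags: did the respondent fulfill the condition
# (e.g. chose the prerequisite option) so this option was actually available?
eligible = np.array([True, True, True, False, False])

# Aggregate share as the simulator computes it: mean over ALL respondents.
share_all = probs.mean()
# Alternative raised here: mean over eligible respondents only.
share_eligible = probs[eligible].mean()

print(round(share_all, 3))       # 0.324
print(round(share_eligible, 3))  # 0.517
```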

Are there particular reasons for the use of the "share of preference" rule across the entire sample for calculating aggregate shares while disregarding imposed conditions? Or is that just a matter of personal preference?

It seems feasible, as the simulation results are close to observed holdout choices.
I also found that using individual-level predictions that accounted for the imposed conditions (first choice rule, counting choices only for eligible respondents) surprisingly yielded aggregate shares very close to the values derived from the share of preference rule across the entire sample.

Best Regards,
asked Nov 14, 2012 by alex.wendland Bronze (2,080 points)
retagged Nov 14, 2012 by Walter Williams

1 Answer

This is a good question, and there were at least a couple of routes we could have taken for this.  The situation described is a two-stage choice, where the choice in the second stage is dependent on the first-stage choice.  To set the stage, imagine there are two items on the menu, making up dependent variables DV1 and DV2, and imagine that DV2 is only available to be selected if DV1 was selected.

We handle this by estimating a model for DV2 using only the tasks where DV1 was chosen.  But this can mean that no information is available for building a DV2 model for a respondent (under HB) if that respondent never picked DV1.
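As a sketch of that estimation setup (hypothetical task records; the field names are illustrative, not Sawtooth's): the DV2 training data keeps only the tasks where DV1 was chosen, so a respondent who never picks DV1 contributes nothing to the DV2 model.

```python
# Hypothetical task-level records (field names are illustrative).
tasks = [
    {"resp": 1, "dv1_chosen": True,  "dv2_chosen": True},
    {"resp": 1, "dv1_chosen": False, "dv2_chosen": False},
    {"resp": 2, "dv1_chosen": False, "dv2_chosen": False},  # never picks DV1
]

# DV2 is modeled only on tasks where DV1 was chosen.
dv2_training = [t for t in tasks if t["dv1_chosen"]]

print(len(dv2_training))                  # 1
print({t["resp"] for t in dv2_training})  # {1}
```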

We decided to make individual-level MBC simulations (resulting from HB) use a set of utilities for each respondent, even if that respondent didn't see any tasks for the model due to a conditional dependent variable relationship (2-stage choice) like the above.  Let me explain why it works out.

We multiply the predicted probabilities for the DV1 model by the predicted probabilities for the DV2 model to obtain the likelihood of the respondent selecting DV2.  For a respondent who never selects DV1, the predicted likelihood that he would select DV1 is of course extremely close to zero.  For DV2 predictions for such a respondent (who never had an opportunity to select DV2, since he never selected DV1), we use the population means for the DV2 model parameters to predict the likelihood that he'll select DV2 GIVEN that he has selected DV1.  However, for this respondent, the likelihood of selecting DV1 is extremely close to zero, so it matters very little what predictions we make for this respondent for the DV2 model.  Making predictions for DV2 based on the population means therefore seems a reasonable thing to do for such a respondent.
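A minimal sketch of that logic (binary-logit probabilities with made-up utilities; the actual MBC models are richer than this):

```python
import math

def logit_prob(utility):
    """Binary logit probability of selection for a given net utility."""
    return 1.0 / (1.0 + math.exp(-utility))

# Hypothetical HB utilities. Respondent "A" often chose DV1, so an
# individual-level DV2 model exists; respondent "B" never chose DV1,
# so no individual DV2 utilities could be estimated.
dv1_utility = {"A": 2.0, "B": -6.0}
dv2_utility = {"A": 1.0, "B": None}   # None: no individual DV2 model
dv2_population_mean = 0.5             # assumed population-mean utility

likelihood = {}
for resp in ("A", "B"):
    p_dv1 = logit_prob(dv1_utility[resp])
    # Fall back to the population mean when no individual model exists.
    u_dv2 = dv2_utility[resp] if dv2_utility[resp] is not None else dv2_population_mean
    # P(select DV2) = P(select DV1) * P(select DV2 | DV1 selected)
    likelihood[resp] = p_dv1 * logit_prob(u_dv2)

print(likelihood)
```

Because P(DV1) for respondent B is nearly zero, whatever the population means predict for the second stage barely moves B's joint likelihood, which is the point made above.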

This approach kept all respondents having complete information for all dependent variable models, which made the software development task a bit easier.  It also reflects the fact that there is a non-zero likelihood that such a respondent would be predicted to pick DV2 given a new opportunity, which makes sense.
answered Nov 14, 2012 by Bryan Orme Platinum Sawtooth Software, Inc. (154,105 points)
The approach of conditioning probability for DV2 on DV1 probability makes sense.
In my last MBC project I found it necessary to adjust predicted choices at the individual level, so the simulator's front page with the share of preference rule did not reflect that, and I had to calculate shares from the predicted (and modified) choices myself. That wasn't hard, but it got me wondering what benefits the share of preference rule has over the first choice approach, not so much in theory but in practical terms (i.e. how the approaches handle different scenarios of what comes out of the estimation).
Again, this is a rather academic question, I suppose.
E.g. "Is the share of preference rule more stable in the face of individual parameters with a lot of variance?" or the like. When should one rely more on one approach or the other? I would think there are differences depending on the data and the resulting estimates.
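For what it's worth, the mechanical difference between the two rules can be sketched like this (hypothetical utilities for two options; not Sawtooth code):

```python
import numpy as np

# Hypothetical individual utilities for two menu options (rows = respondents).
utilities = np.array([
    [1.0, 0.8],    # a near-tie: first choice still gives all weight to option 1
    [2.0, -1.0],
    [-0.5, 0.5],
])

# Share of preference: logit probabilities, averaged across respondents.
exp_u = np.exp(utilities)
probs = exp_u / exp_u.sum(axis=1, keepdims=True)
sop_shares = probs.mean(axis=0)

# First choice: each respondent's full weight goes to the highest-utility option.
winners = utilities.argmax(axis=1)
fc_shares = np.bincount(winners, minlength=2) / len(utilities)

print(np.round(sop_shares, 3))
print(fc_shares)
```

The first choice rule is insensitive to how close the utilities are, while share of preference spreads near-ties across options, which is one practical reason it tends to produce smoother shares when individual parameters are noisy.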