Comparing utilities across groups of participants


I am trying to compare the utilities of several levels from different attributes (from a CBC study, estimated with HB) across groups of respondents (e.g. by experience or by age). The goal is to reach conclusions like: level 1 of attribute 1 is more important for group A than for group B.

Two questions arose:
1) Can I use MANOVA to do that (estimated utilities of the different levels of the different attributes as DVs; respondent characteristics as IVs)? If not, how can I make this comparison?

2) I also estimated 2-way interactions. Do I need to incorporate this in the comparison or should I make a new estimation without interactions in order to draw valid conclusions from the comparison?


asked May 9, 2014 by Peter (120 points)
retagged May 9, 2014 by Walter Williams

1 Answer

0 votes
This is a very tricky question on a number of levels.  Let me try to help you work through it.

First, you described the goal of your statistical test to be: seeing if one level of an attribute is more "important" for Group A vs. Group B.  I think you meant to use the words "more preferred" rather than "more important".  Importance in conjoint analysis refers to the impact an attribute has on choice, defined as the difference between the utilities of the best and worst levels of that attribute.  A single attribute level has a preference.  An attribute (with its multiple levels) carries an importance score, representative of the impact the varying levels of that attribute can have on choice likelihood for different concepts.
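To make the distinction concrete, here is a minimal sketch of how an importance score can be computed from one respondent's level utilities, assuming the common convention of expressing each attribute's best-minus-worst range as a percentage of the sum of ranges. The attribute names and utility values are made up for illustration.

```python
def attribute_importances(utilities):
    """utilities: dict mapping attribute name -> list of level utilities
    for a single respondent. Returns importances that sum to 100."""
    # Range = utility of the best level minus utility of the worst level.
    ranges = {attr: max(levels) - min(levels) for attr, levels in utilities.items()}
    total = sum(ranges.values())
    return {attr: 100.0 * r / total for attr, r in ranges.items()}

# Hypothetical respondent with three attributes.
example = {
    "brand": [0.8, -0.2, -0.6],
    "price": [1.5, 0.0, -1.5],
    "color": [0.3, -0.3],
}
importances = attribute_importances(example)
```

A level such as the first brand has a preference (0.8 here), while the brand attribute as a whole has an importance (its range relative to the other attributes' ranges).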

The next tricky thing is that the utilities are zero-centered within each attribute, so the preference for an attribute level is quantified in terms of its preference RELATIVE TO the other levels within the same attribute.  Thus, the utility score that results for Brand A for a respondent depends on what other brands are included within the same attribute.  So, we're not really isolating the utility for Brand A in an absolute sense when we make comparisons between groups.  We are looking at utility for Brand A relative to the other brands included in the same study.

Now, on to even trickier matters.  The absolute scaling (the magnitude of the utilities) within CBC/HB depends on the amount of response error.  If a group of respondents has low response error, all of their utilities are uniformly stretched by a larger positive multiplier.  If a group of respondents has high response error, their utilities are shrunk by some smaller multiplier (closer to 0).  Thus, when you try to compare one group of respondents' utilities to another group of respondents' utilities from CBC/HB, you don't necessarily know if the differences you are observing are due to substantive differences in preference or just differences in response error to the CBC questions.
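The scale effect can be illustrated with a small sketch, assuming a standard multinomial logit choice rule. The utility values and the multipliers 2.0 and 0.5 are arbitrary illustrations, not estimates from any study.

```python
import math

def logit_shares(utils):
    """Multinomial logit choice probabilities for a set of total utilities."""
    m = max(utils)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]

# Same underlying preference order, different amounts of response error.
base = [1.0, 0.0, -1.0]
low_error = [2.0 * u for u in base]    # utilities stretched
high_error = [0.5 * u for u in base]   # utilities shrunk toward 0

low_error_shares = logit_shares(low_error)
high_error_shares = logit_shares(high_error)
```

Both groups rank the three options identically, yet the raw utilities differ by a factor of four, and the high-error group's choice probabilities are much flatter. A naive comparison of raw utilities would read this as a difference in preference.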

A way to try to cope with this last issue is to use the normalized "zero-centered diffs" that our SMRT market simulator can export for you (Analysis + Run Manager + Export... and then choose the zero-centered diffs normalization option).
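For readers who want to see what this kind of normalization does, here is a minimal sketch. It assumes the rescaling makes each respondent's best-minus-worst ranges sum to 100 times the number of attributes; that is my reading of the zero-centered diffs method and is not taken from the exporter's code. The function name and data are hypothetical.

```python
def zero_centered_diffs(utilities):
    """utilities: dict mapping attribute -> list of level utilities for one
    respondent. Returns utilities zero-centered within each attribute and
    rescaled so the ranges sum to 100 * number of attributes (assumed rule)."""
    centered = {}
    for attr, levels in utilities.items():
        avg = sum(levels) / len(levels)
        centered[attr] = [u - avg for u in levels]
    total_range = sum(max(l) - min(l) for l in centered.values())
    multiplier = 100.0 * len(centered) / total_range
    return {attr: [u * multiplier for u in l] for attr, l in centered.items()}

# Two hypothetical respondents with the same preferences but different scale.
low_error = {"brand": [0.8, -0.2, -0.6], "price": [1.5, 0.0, -1.5]}
high_error = {attr: [0.5 * u for u in levels] for attr, levels in low_error.items()}

low_error_norm = zero_centered_diffs(low_error)
high_error_norm = zero_centered_diffs(high_error)
```

After normalization the two respondents have identical utilities, so a pure scale difference no longer shows up as a difference between groups.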

Once you've done that, then you could use one of many Frequentist statistical tests to compare an attribute level's relative preference between two groups of respondents.
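As one example of such a test, here is a minimal sketch of Welch's two-sample t-test on the normalized utility of a single level, written with the Python standard library only. The choice of Welch's test and the utility values are illustrative assumptions; other tests would also be reasonable.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two
    independent samples with possibly unequal variances."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical normalized utilities of one level, one value per respondent.
group_a = [45.0, 52.0, 38.0, 60.0, 49.0]
group_b = [30.0, 41.0, 28.0, 35.0, 33.0]

t_stat, dof = welch_t(group_a, group_b)
```

The t statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value.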

However, the Bayesians would not like this.  The Bayesian test involves using the group variable (e.g. experience, age) as a covariate and then looking at the history of draws of alpha (the population means) between the two groups as reported by CBC/HB in the studyname_alpha.csv file.  The percent of times (across draws) that one group of respondents finds the relative utility of a level higher than the other group does is an expression of your certainty that one group's relative preference is higher than the other's.  While this Bayesian test is formally more true to the Bayesian spirit, since you are using CBC/HB to estimate your utilities, it doesn't absolve you from the response error/scale issue I raised earlier.
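The tally across draws can be sketched as follows. This assumes you have already extracted, for each saved draw, the population mean of the level for each group into two equal-length lists; how those values are laid out in the alpha file is not shown here, and the function name, the `burn_in` parameter, and the numbers are hypothetical.

```python
def share_of_draws_higher(draws_a, draws_b, burn_in=0):
    """Fraction of draws (after discarding burn_in draws) in which group A's
    population mean for a level exceeds group B's."""
    pairs = list(zip(draws_a, draws_b))[burn_in:]
    wins = sum(1 for a, b in pairs if a > b)
    return wins / len(pairs)

# Hypothetical alpha draws for one level, one value per draw and group.
draws_a = [0.9, 1.1, 0.7, 1.3, 1.0]
draws_b = [0.8, 1.2, 0.5, 0.9, 0.6]

certainty = share_of_draws_higher(draws_a, draws_b)
```

In practice there would be thousands of draws, and only draws taken after convergence should be counted.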

So, while it seems such a question would be straightforward, there are a number of tricky issues to navigate!

Regarding your question about interaction effects, you would only need to incorporate them if your question involved whether the preference for a level of one attribute depends on which level of another attribute it appears with in a concept.  If your statistical test needs to account for that context, then you do need to incorporate the interaction effects in the testing.  Under the Bayesian test, you add the main effects and the interaction effects within each draw, and then tally across the draws how often one group finds that level, in that context, relatively more preferred than the other group does.
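A minimal sketch of that per-draw addition, assuming the main-effect and interaction-effect draws for the level and context of interest have already been pulled out for each group. All names and values are hypothetical.

```python
def combined_draws(main_draws, interaction_draws):
    """Add the main effect and the relevant interaction effect within each draw."""
    return [m + i for m, i in zip(main_draws, interaction_draws)]

# Hypothetical draws: main effect of a level, plus its interaction term
# with a specific level of a second attribute, for each group.
group_a = combined_draws([0.5, 0.6, 0.4, 0.7], [0.2, 0.1, 0.3, 0.0])
group_b = combined_draws([0.6, 0.5, 0.5, 0.6], [0.0, 0.1, 0.1, 0.2])

# Tally across draws how often group A's combined utility is higher.
certainty = sum(a > b for a, b in zip(group_a, group_b)) / len(group_a)
```

The sum has to be formed inside each draw before comparing; averaging the main effects and interactions separately and adding them afterwards would lose the draw-level uncertainty.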
answered May 9, 2014 by Bryan Orme Platinum Sawtooth Software, Inc. (132,290 points)
Thank you Bryan for this fast and extensive answer!

Let me conclude:
- Using normalized zero-centered-diffs (the same as "zero-centered-diffs" from SSI Web estimation) for the comparison is statistically correct. That is good.
- The Bayesian test (with or without interaction effects) would be preferable, but is incorrect because of the scaling issue (which also holds for the alphas), and there is no workaround.
- Even though I have estimated interaction effects, I can use the main effects from this estimation for the comparison described before. This means that the estimated main effects are independent from my choice of interaction effects.

Did I get it right?

Do I have similar issues when comparing the importances of attributes (instead of the utilities of levels) between groups?

Just interjecting with my own suggestion (if that works), but wouldn't it be better to just run a simulation where you have products where levels for all other attributes are held constant, and each product has a different level for the attribute that you want to measure, and then observe the impact in purchase likelihood within each subgroup?

You'll end up with percentage results, which you can just throw into a t-test to determine if differences are statistically significant.
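A rough sketch of that idea, assuming a logit share-of-preference rule applied per respondent. Because all other attributes are held constant across the simulated products, their utilities cancel and only the utilities of the varied attribute's levels matter. The respondent data and group labels are made up.

```python
import math
from statistics import mean

def share_of_first(level_utils):
    """Logit share of the product carrying the first level, when competing
    products differ only in the level of this one attribute."""
    m = max(level_utils)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in level_utils]
    return exps[0] / sum(exps)

# Hypothetical respondents: (subgroup, utilities of the attribute's levels).
respondents = [
    ("A", [1.0, 0.0, -1.0]),
    ("A", [0.5, 0.0, -0.5]),
    ("B", [-1.0, 0.0, 1.0]),
    ("B", [0.0, 0.0, 0.0]),
]

shares_by_group = {}
for group, utils in respondents:
    shares_by_group.setdefault(group, []).append(share_of_first(utils))

group_means = {g: mean(s) for g, s in shares_by_group.items()}
```

The per-respondent shares in `shares_by_group` are what would go into the t-test.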
Thank you for this interesting suggestion.

I see two possible issues:
- There are interactions between the levels, and therefore the "base product" used for the comparison would influence the purchase likelihoods when changing one level.
- I cannot test the impact of several properties of my participants at the same time (compared to e.g. (M)ANOVA). That makes the test less powerful.

Bryan, do you think this is a feasible workaround?