High standard deviations in CBC HB


I wanted to use HB to get the average utilities for my attribute levels.
I got the following results:

Average utilities            Standard deviation

-89.55234                         40.24097
11.93716                         13.56217
77.61518                         45.09184

28.59257                         57.79848
32.43272                         36.19840
-61.02529                         52.07415

25.19890                         22.41492
10.33953                         11.87836
-35.53844                         28.93222

-17.65147                         19.10451
6.22906                         10.61123
11.42241                         16.58672

5.50456                         16.85396
8.55311                         18.88350
-3.69946                         14.89224
33.75657                         32.87012
-44.11478                         40.21998

I am now a little concerned about the high standard deviations (SDs). I know that high SDs are not a bad thing in general; they just mean that the respondents have different preferences. However, is there such a thing as SDs that are "too high"?

I am planning to separate the respondents into two groups after the general analysis and then use HB to see whether the groups have different preferences. If I expect them to have different preferences, the SDs within each group should be lower than in the analysis of all respondents, right?

Thank you very much!
asked Feb 16 by Carla Merz (230 points)

2 Answers

0 votes

In an aggregate model a large standard error would keep you from concluding that your coefficient was significantly different from zero.  But with HB you're right, it's just telling you that there's a lot of heterogeneity - and far from being a bad thing, it's telling you that it was a good thing you used HB to account for that heterogeneity.

If you separate the respondents into sub-groups, the standard deviations of those sub-groups will be smaller only if the division results in groups that are less heterogeneous. For example, if you divide them by an irrelevant variable like odd or even ID number, there's no reason for the standard deviations to shrink.
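The point about relevant versus irrelevant splits can be illustrated with a small simulation. This is only a sketch on made-up data: the two segments, their centers (+40 and -40), and the within-segment SD of 15 are assumptions chosen for illustration, not values from any real study.

```python
import random
import statistics

random.seed(42)

# Hypothetical individual-level utilities for ONE attribute level.
# Two latent segments with opposite preferences (assumed values, not real data).
respondents = []
for rid in range(1, 401):
    segment = "A" if random.random() < 0.5 else "B"
    center = 40.0 if segment == "A" else -40.0
    respondents.append((rid, segment, random.gauss(center, 15.0)))

# SD across everyone: inflated by the gap between the two segments
overall_sd = statistics.stdev([u for _, _, u in respondents])

# Split by the variable that actually drives preferences
sd_by_segment = {
    s: statistics.stdev([u for _, seg, u in respondents if seg == s])
    for s in ("A", "B")
}

# Split by an irrelevant variable (odd vs. even respondent ID)
sd_by_parity = {
    p: statistics.stdev([u for rid, _, u in respondents if rid % 2 == p])
    for p in (0, 1)
}

print(f"overall SD:     {overall_sd:.1f}")
print(f"SD by segment:  {sd_by_segment}")
print(f"SD by ID parity: {sd_by_parity}")
```

In this setup the SDs within segments A and B should come out far below the overall SD, while the odd/even split should leave them roughly unchanged.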
answered Feb 16 by Keith Chrzan Platinum Sawtooth Software, Inc. (53,875 points)
Thank you, Keith!
0 votes

For comparison, I just ran HB on a real CBC data set of mine to see what the utilities (under zero-centered diffs scaling, like you are using) and standard deviations look like. Here are my results:

Label                      Utilities   StdDev

Att1, Level 1    -33.41    47.93
Att1, Level 2    -1.88    58.01
Att1, Level 3    35.28    58.55

Att2, Level 1    -29.79    27.30
Att2, Level 2    0.25    16.42
Att2, Level 3    29.54    33.75

Att3, Level 1    -69.64    54.87
Att3, Level 2    24.97    27.43
Att3, Level 3    44.66    46.90

Att4, Level 1    -26.23    37.41
Att4, Level 2    26.23    37.41

Att5, Level 1    -32.71    33.76
Att5, Level 2    32.71    33.76

Att6, Level 1    53.14    52.11
Att6, Level 2    30.46    25.14
Att6, Level 3    -20.27    27.22
Att6, Level 4    -63.33    47.74

I think you will agree that my standard deviations (relative to the size of the utilities) are just as large or larger than yours.
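One rough way to check this comparison is to divide the mean standard deviation by the mean absolute utility for each table. The summary measure below is an assumption made for illustration (it is not a statistic reported by the software); the numbers themselves are copied from the two tables in this thread.

```python
# (average utility, standard deviation) pairs copied from the two tables above
carla = [
    (-89.55234, 40.24097), (11.93716, 13.56217), (77.61518, 45.09184),
    (28.59257, 57.79848), (32.43272, 36.19840), (-61.02529, 52.07415),
    (25.19890, 22.41492), (10.33953, 11.87836), (-35.53844, 28.93222),
    (-17.65147, 19.10451), (6.22906, 10.61123), (11.42241, 16.58672),
    (5.50456, 16.85396), (8.55311, 18.88350), (-3.69946, 14.89224),
    (33.75657, 32.87012), (-44.11478, 40.21998),
]
bryan = [
    (-33.41, 47.93), (-1.88, 58.01), (35.28, 58.55),
    (-29.79, 27.30), (0.25, 16.42), (29.54, 33.75),
    (-69.64, 54.87), (24.97, 27.43), (44.66, 46.90),
    (-26.23, 37.41), (26.23, 37.41),
    (-32.71, 33.76), (32.71, 33.76),
    (53.14, 52.11), (30.46, 25.14), (-20.27, 27.22), (-63.33, 47.74),
]

def sd_to_utility_ratio(rows):
    """Mean SD divided by mean absolute utility across all levels."""
    mean_sd = sum(sd for _, sd in rows) / len(rows)
    mean_abs_utility = sum(abs(u) for u, _ in rows) / len(rows)
    return mean_sd / mean_abs_utility

ratio_carla = sd_to_utility_ratio(carla)
ratio_bryan = sd_to_utility_ratio(bryan)
print(f"Carla: {ratio_carla:.2f}  Bryan: {ratio_bryan:.2f}")
```

By this measure the second data set has the larger ratio, which is consistent with the statement above.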

Large standard deviations generally mean large differences in opinion across people, but if your experimental design is poor they can also signal a lack of precision in the utility estimates. To rule out poor precision, you can run aggregate logit (equivalently, 1-group latent class) and examine the reported standard errors. You are hoping for standard errors of 0.05 or less on the aggregate logit main-effect utilities.
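To make the idea of aggregate logit standard errors concrete, here is a minimal sketch that fits a pooled multinomial logit by Newton-Raphson and takes the standard errors from the inverse information matrix. It is not Sawtooth Software's implementation: the design (two 2-level attributes coded -1/+1, three alternatives per task, 3000 pooled tasks) and the true utilities are all assumptions made for illustration.

```python
import math
import random

random.seed(1)

TRUE_BETA = (0.8, -0.4)  # hypothetical utilities for two 2-level attributes
N_TASKS = 3000           # respondents x tasks, pooled into one aggregate model
N_ALTS = 3

def probs(task, beta):
    """Logit choice probabilities for the alternatives in one task."""
    v = [beta[0] * x[0] + beta[1] * x[1] for x in task]
    m = max(v)
    e = [math.exp(vi - m) for vi in v]
    s = sum(e)
    return [ei / s for ei in e]

# Simulate a random design and draw choices from the logit probabilities
tasks, choices = [], []
for _ in range(N_TASKS):
    task = [(random.choice((-1.0, 1.0)), random.choice((-1.0, 1.0)))
            for _ in range(N_ALTS)]
    choices.append(random.choices(range(N_ALTS), weights=probs(task, TRUE_BETA))[0])
    tasks.append(task)

def score_and_info(beta):
    """Gradient of the log-likelihood and the 2x2 information matrix."""
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for task, c in zip(tasks, choices):
        p = probs(task, beta)
        xbar = [sum(pj * x[k] for pj, x in zip(p, task)) for k in range(2)]
        for k in range(2):
            g[k] += task[c][k] - xbar[k]
        for pj, x in zip(p, task):
            d = (x[0] - xbar[0], x[1] - xbar[1])
            for k in range(2):
                for l in range(2):
                    h[k][l] += pj * d[k] * d[l]
    return g, h

def invert_2x2(h):
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [[h[1][1] / det, -h[0][1] / det], [-h[1][0] / det, h[0][0] / det]]

beta_hat = [0.0, 0.0]
for _ in range(10):  # Newton-Raphson steps
    g, h = score_and_info(beta_hat)
    cov = invert_2x2(h)
    beta_hat = [beta_hat[k] + cov[k][0] * g[0] + cov[k][1] * g[1] for k in range(2)]

# Standard errors from the inverse information matrix at the final estimate
_, h = score_and_info(beta_hat)
cov = invert_2x2(h)
se = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
print("estimates:", beta_hat, "standard errors:", se)
```

With this many pooled tasks the standard errors should fall below the 0.05 rule of thumb mentioned above; a smaller sample or a weaker design would push them up.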

Now, if your end goal is to break respondents into groups and conduct F-tests or t-tests between the groups to test for significant differences, you are hoping for big differences between the groups in average utilities (preferences) and, at the same time, small standard deviations within the groups. In other words, to maximize your between-group F statistic, you want big differences between groups and small differences within groups.
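The between/within logic can be sketched with a hand-computed one-way ANOVA F statistic. The data are simulated, and the group means (+30 and -30) and SDs (15 versus 45) are assumptions chosen only to show the effect of within-group spread.

```python
import random
import statistics

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

random.seed(7)

def simulate_groups(within_sd, n=200):
    # Hypothetical utilities for one level in two groups with different means
    group_1 = [random.gauss(30.0, within_sd) for _ in range(n)]
    group_2 = [random.gauss(-30.0, within_sd) for _ in range(n)]
    return [group_1, group_2]

f_tight = f_statistic(simulate_groups(within_sd=15.0))  # homogeneous groups
f_loose = f_statistic(simulate_groups(within_sd=45.0))  # heterogeneous groups
print(f"F with small within-group SD: {f_tight:.1f}")
print(f"F with large within-group SD: {f_loose:.1f}")
```

The same difference in group means yields a much larger F when the within-group standard deviations are small.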
answered Feb 16 by Bryan Orme Platinum Sawtooth Software, Inc. (135,865 points)
Thank you, Bryan! The standard errors within the aggregate logit are all < 0.05. The high standard deviations really seem to be caused by the heterogeneity of preferences.