counts vs utility part worths

I'm currently facing a discrepancy between counts and utility estimates for one of my categorical attributes that I'm struggling to explain.

Per the counts analysis, level 2 of this attribute is chosen less often than the other four levels, by a small but significant margin:
Level 1    37.62
Level 2    32.13
Level 3    36.95
Level 4    41.87
Level 5    41.21

In estimation, to my understanding, the first level is "dropped" to allow for identification of the model and anchor the utilities of the other levels (as the reference point with a utility of zero).
From that understanding, I would think that, judging by the counts, the utility of level 2 should be lower than that of reference level 1, no? (This assumes there is no strong correlation in the design that couples level 2 with some other high-utility attribute level. I checked for that, and the standard errors are also well-behaved, so I don't think correlation is at play here.)

However, estimation (within the MBC tool, default effect coding w/ level 1 = 0 0 0 0) yields the following utility structure:
Level 2   0,525
Level 3   0,723
Level 4   0,753
Level 5   0,871

Am I missing something? Thanks in advance for any pointers!
Alex
asked Jan 5 by alex.wendland Bronze (2,005 points)

1 Answer

It sounds like you've got it right in terms of your understanding.  Logit tends to be more accurate than counts.

Check on a couple of things:

1.  Make sure you did not constrain the utility estimation worst-to-best.
2.  Did you use any prohibitions in the design involving this attribute?
answered Jan 5 by Bryan Orme Platinum Sawtooth Software, Inc. (144,140 points)
Hi Bryan, thanks for the quick response!
re 1. no constraints
re 2. no prohibitions, design is almost perfectly balanced
Not sure how to reconcile utils with what looks like a pretty clear signal in the counts. Got lots of observations too so should be fairly stable technically.
The only theory I have can't really be resolved by looking at the data:
The levels correspond with various degrees of transparency w/ regard to the applicable costs, i.e., the costs are more or less explicitly visible or packaged/hidden away (into fine print, expressed in %ages, broken down or vaguely labeled). Level 2 is when the costs are most clearly comprehensible w/o brain gymnastics, in level 1 costs are most "opaque".

Could it be that the low counts for level 2 reflect that costs are generally perceived as higher there as they are more visible?
And once I estimate including the (linear) cost parameter this is resolved?

If this is the case, this would suggest using a price interaction effect parameter, right?
I'm worried that something is going wrong with your logit estimation.  If you have large sample sizes, your aggregate logit utilities really should be very close to LN(counts), up to the constant shift that sets the reference level to zero.

So, if your counts are:

Level 1    37.62
Level 2     32.13
Level 3    36.95
Level 4     41.87
Level 5     41.21

...and if level 1 is the reference level, with zero utility, I'd expect aggregate logit utilities around:

 0.00
-0.16
-0.02
 0.11
 0.09
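
As a rough check, the expected values above can be reproduced by taking the natural log of each count relative to the reference level. This is only a minimal sketch of the LN(counts) rule of thumb, assuming a balanced design and level 1 as the zero point; the variable names are illustrative and it is not what MBC runs internally.

```python
import math

# Counts as reported in the question.
counts = {
    "Level 1": 37.62,
    "Level 2": 32.13,
    "Level 3": 36.95,
    "Level 4": 41.87,
    "Level 5": 41.21,
}

reference = "Level 1"  # assumed reference level, utility fixed at zero

# ln(count / reference count) is the same as ln(count) shifted so that
# the reference level lands on zero.
expected = {
    level: math.log(c / counts[reference]) for level, c in counts.items()
}

for level, utility in expected.items():
    print(f"{level}: {utility:+.2f}")
```

The point of the comparison is the sign pattern: under this approximation levels 2 and 3 should fall below zero, whereas the estimated utilities are all positive.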
That's exactly what I expected but can't wrap my head around how the same data set yields different pictures. I'll try narrowing down the issue with different logit specifications and circle back if I find more implications.
I now specified a model where the transparency attribute in question was the only IV (no constraints, no grouping), and again the logit estimates had about the same structure, with all levels larger than level 1, i.e., > 0. Now I'm worried too. :(
Alex, Walt just let me know that you guys determined that your problems here are due to a bug in the way MBC reads non-English locale .CSV files when used with weighting and aggregate logit.  Sorry!!!  Thanks for your patience, and I feel bad for pushing the detective work back on you.  Our programmer, Walt, is working on a fix.
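
For readers who hit similar symptoms, here is a generic illustration of how decimal commas in a non-English locale .CSV can be misread. It is not MBC's code and does not claim to show the actual bug; the file contents and column names (`resp_id`, `weight`) are made up for the example.

```python
import csv
import io

# Hypothetical export: semicolon-delimited fields with decimal commas,
# as some non-English locales write .CSV files.
raw = "resp_id;weight\n1;0,75\n2;1,25\n"


def parse_decimal(text, decimal_sep=","):
    """Convert a numeric string that uses decimal_sep into a float."""
    return float(text.replace(decimal_sep, "."))


# Reading with the matching delimiter and decimal separator.
rows = list(csv.DictReader(io.StringIO(raw), delimiter=";"))
weights = [parse_decimal(row["weight"]) for row in rows]
print(weights)

# A reader that assumes comma-delimited fields splits each weight in two.
naive = list(csv.reader(io.StringIO(raw)))
print(naive[1])
```

If weights are silently mangled like this, a weighted aggregate logit can disagree with unweighted counts even when the design is balanced.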
...