Counting Analysis for CBC

Overview

CBC data can be analyzed in a number of ways. Counting analysis is probably the simplest and most intuitive method. It calculates a proportion for each level: the number of times a concept including that level was chosen, divided by the number of times a concept including that level occurred. This is done automatically for each main effect and for each joint effect (between either two or three attributes). This simple analysis method may be adequate for some studies, but we generally recommend more sophisticated approaches, especially if the questions that need to be answered can only be approached through the use of a market simulator.
 


Importing CBC Data

Prior to conducting this analysis, you must import the data from your CBC study. We'll assume that you have already downloaded the data from the Web server, and prepared the .CHO and .ATT files using the SSI Web system (by clicking Analysis | Prepare .CHO and .ATT Files for Analysis). To import the CBC data for use within the CBC Analysis Module and Market Simulator (SMRT):

1. Open the SMRT software by clicking Start | Programs | Sawtooth Software | Sawtooth Software SMRT.
2. Create a new study by clicking File | New and providing a study name in the desired directory.
3. Import your CBC data by clicking File | Import, specifying Choice Data (*.cho) as the Import Type, and clicking the Import button.
 


Counting Choices

Counts provides quick and automatic estimation of the main effects and joint effects for collected CBC data. It calculates a proportion for each level, based on how many times a concept including that level is chosen, divided by the number of times a concept including that level occurred.
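The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used by Counts. The data layout (a list of tasks, each holding its concepts and the index of the chosen concept) and the function name count_proportions are assumptions made for the example, and the two tasks shown are made up.

```python
from collections import Counter

def count_proportions(tasks, attribute):
    """Return {level: times chosen / times shown} for one attribute.

    `tasks` is a hypothetical layout: a list of (concepts, chosen_index)
    pairs, where each concept is a dict mapping attribute name to level
    and chosen_index is None when the respondent picked "None".
    """
    shown = Counter()
    chosen = Counter()
    for concepts, chosen_index in tasks:
        for i, concept in enumerate(concepts):
            level = concept[attribute]
            shown[level] += 1          # the level occurred in a concept
            if i == chosen_index:
                chosen[level] += 1     # that concept was selected
    return {level: chosen[level] / shown[level] for level in shown}

# Made-up data: two tasks with two concepts each.
tasks = [
    ([{"Brand": "A", "Price": 1}, {"Brand": "B", "Price": 2}], 0),
    ([{"Brand": "A", "Price": 2}, {"Brand": "B", "Price": 1}], None),
]
brand_props = count_proportions(tasks, "Brand")
print(brand_props)
```

A joint effect could be counted the same way by using a pair of levels, such as (Brand, Price), as the key instead of a single level.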

When you run Counts, processing begins immediately and the results are displayed in the report window. Every time you click Compute!, a new report is appended below the previous, and the report is automatically scrolled to display the most recent run. You can clear the report window by clicking Clear. You can weight the data, select subsets of respondents or tasks to process, or include banner points.

Following are data for a sample of 100 respondents, each of whom answered 8 randomized choice tasks, each including 4 concepts and a "None" option. The data are real, but the attribute levels have been disguised. Main effects and two-way interactions are included in the report.

Brand         
                        Total  
Total Respondents         100  
   Brand A              0.387      
   Brand B              0.207      
   Brand C              0.173      
   Brand D              0.165      
 
Within Att. Chi-Square 148.02  
D.F.                        3   
Significance:         p < .01    
 
Shape         
   Shape 1              0.259      
   Shape 2              0.245      
   Shape 3              0.195      
 
Within Att. Chi-Square  13.45      
D.F.                        2   
Significance:         p < .01   
 
Size  
   Large Size           0.263      
   Medium Size          0.240      
   Small Size           0.196      
 
Within Att. Chi-Square  13.64  
D.F.                        2   
Significance:         p < .01    
 
Price  
   Price 1              0.132      
   Price 2              0.175      
   Price 3              0.254      
   Price 4              0.372      
 
Within Att. Chi-Square 151.32  
D.F.                        3   
Significance:         p < .01  
 
None  
None chosen:            0.068  
   
Brand x Shape  
Brand A   Shape 1      0.415  
Brand A   Shape 2      0.412  
Brand A   Shape 3      0.337   
Brand B   Shape 1      0.232   
Brand B   Shape 2      0.223   
Brand B   Shape 3      0.168   
Brand C   Shape 1      0.201  
Brand C   Shape 2      0.181  
Brand C   Shape 3      0.138   
Brand D   Shape 1      0.188  
Brand D   Shape 2      0.171  
Brand D   Shape 3      0.135   

Within Att. Chi-Square     0.82   
D.F.                          6   
Significance:           not sig  
 
(Due to space considerations, we have not shown the remainder of the report. Five additional tables would follow specifying the remaining two-way joint effects: Brand x Size; Brand x Price; Shape x Size; Shape x Price; Size x Price.)  
 
Each main effect is the proportion of times a concept containing that attribute level was selected by respondents, out of all the times such a concept occurred.

Brand A was the most popular, having been selected 38.7 percent of the times it occurred. Brand D was the least popular, having been selected 16.5 percent of the times it occurred. Since there are four brands, and also four concepts per task, each brand appeared once in every task. The sum of proportions for the four brands (not shown) is 0.932. The balance, 0.068, is the proportion of tasks in which respondents selected "None."

Price level 1 was the most expensive, and concepts at that price level were selected only about a third as often as concepts at the lowest price, level 4. Since Price also has four levels, each level appeared exactly once in each task, and the sum of their proportions matches the sum for Brand (apart from rounding).

The Size and Shape attributes only had three levels, so their levels sometimes appeared twice in a task. That produces proportions with smaller sums. If a level appears twice in the same task, and if one of the concepts including it is selected, then the other concept is rejected. When an attribute has fewer levels than the number of concepts in a task, the sum of its proportions will be lowered. When making comparisons across attributes it is useful first to adjust each set of proportions to remove this artifact. One way to do so is to divide all the proportions for each attribute by their sum, giving them all unit sums.
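The adjustment just described amounts to dividing each proportion by the sum for its attribute. A minimal sketch follows, using the Shape and Brand proportions from the report above. The function name normalize is our own choice for the example.

```python
def normalize(proportions):
    """Rescale one attribute's count proportions so they sum to 1."""
    total = sum(proportions.values())
    return {level: p / total for level, p in proportions.items()}

# Main-effect proportions taken from the sample report.
shape = {"Shape 1": 0.259, "Shape 2": 0.245, "Shape 3": 0.195}
brand = {"Brand A": 0.387, "Brand B": 0.207, "Brand C": 0.173, "Brand D": 0.165}

shape_adj = normalize(shape)   # raw sum is 0.699
brand_adj = normalize(brand)   # raw sum is 0.932

for name, adjusted in (("Shape", shape_adj), ("Brand", brand_adj)):
    print(name, {level: round(p, 3) for level, p in adjusted.items()})
```

After the adjustment, both sets of proportions have unit sums, so the artifact caused by the differing numbers of levels is removed.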

Counts reports a Chi Square statistic for each main effect and joint effect indicating whether the proportions in that table differ significantly from one another. We refer to that Chi Square test as the "Within Att. Chi-Square." In the case of a main effect count, the Chi Square indicates whether levels of that attribute differ significantly in their frequency of choice. Beware of interpreting the Chi Square from aggregate counts as a measure of "Importance" for an attribute, or of assuming that a main-effect Chi Square test that is not significant indicates that the attribute had little impact on choice. Disagreement between individuals on what level is preferred can mask the impact of an attribute when respondent choices are aggregated. For example, if only two brands are tested and half of the respondents strongly prefer Brand A over Brand B, whereas the other half feel exactly the opposite, the aggregate count proportions will be equal, and the Chi Square will be zero. In that case, we would be in error to infer that brand had no impact on choice for individuals.
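The two-brand example can be worked through numerically. The sketch below assumes 100 respondents, 8 tasks each, two concepts per task (one per brand) and no "None" option. It uses a generic Pearson goodness-of-fit statistic with equal expected counts. That formula is an assumption made for illustration and is not necessarily the computation Counts performs.

```python
def pearson_chi_square(observed):
    """Pearson goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

n_tasks = 8
# Hypothetical sample: 50 respondents always choose Brand A,
# the other 50 always choose Brand B.
choices_a = 50 * n_tasks
choices_b = 50 * n_tasks
shown = 100 * n_tasks          # each brand appears once in every task

prop_a = choices_a / shown
prop_b = choices_b / shown
chi2 = pearson_chi_square([choices_a, choices_b])

print(prop_a, prop_b, chi2)
```

Both aggregate proportions are 0.5 and the statistic is 0, even though every individual respondent in this made-up sample chose entirely on brand.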

The tests for joint effects measure differences among the proportions in the table beyond those due to the main effects. For example, suppose the proportions in the first row of a joint-effect table were all just half the size of corresponding proportions in the second row. Such differences would be due to the main effect for the row attribute, and would not show up as a large Chi Square for the joint effect. A large Chi Square value suggests a significant interaction effect between the two attributes. (No Chi Square is reported for 3-way joint effects.)

Each effect is classified as "not significant," "significant with p < .05," or "significant with p < .01." Actual values of Chi Square and the appropriate number of degrees of freedom are also reported, so you can consult a table of Chi Square to determine the precise level of significance for each effect.

The Chi Square tests reported by Counts are computed differently than those reported by Logit, and will not agree precisely, though effects with highly significant Chi Square values for either module should also have highly significant values for the other.

As with other conjoint methods, it is often useful to summarize choice data with numbers representing the relative importance of each attribute. With utility values we base importance measures on differences between maximum and minimum utilities within each attribute. However, with proportions, corresponding measures are based on ratios within attributes. To summarize relative importance of each attribute we might first compute the ratio of the maximum to the minimum proportion for each attribute, and then rescale the logs of those ratios to percentages that sum to 100. We generally do not recommend computing attribute importances using aggregate (summary) data. Attributes on which respondents disagree will appear to have less importance in the aggregate, even though respondents feel very strongly about their differences in opinions. We recommend computing attribute importances using Latent Class, HB or ACBC utilities.
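For readers who want to see the log-ratio arithmetic, here is a minimal sketch using the main-effect proportions from the sample report. It is shown only to make the calculation concrete; the caution above about aggregate importances still applies. The variable names are our own.

```python
import math

# Main-effect count proportions from the sample report.
main_effects = {
    "Brand": [0.387, 0.207, 0.173, 0.165],
    "Shape": [0.259, 0.245, 0.195],
    "Size":  [0.263, 0.240, 0.196],
    "Price": [0.132, 0.175, 0.254, 0.372],
}

# Log of the ratio of the largest to the smallest proportion per attribute.
log_ratios = {
    attribute: math.log(max(props) / min(props))
    for attribute, props in main_effects.items()
}

# Rescale the log ratios to percentages that sum to 100.
total = sum(log_ratios.values())
importances = {a: 100 * v / total for a, v in log_ratios.items()}

for attribute, value in importances.items():
    print(f"{attribute}: {value:.1f}")
```

With these data, Price and Brand receive the largest shares, and Shape and Size the smallest.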

The joint effects tables provide the same kind of information as main effects, but for pairs of attributes rather than attributes considered one-at-a-time. For example, consider the following table for Brand and Price:

            Price1   Price2   Price3   Price4      Avg  
Brand A      0.262    0.320    0.398    0.570    0.387  
Brand B      0.083    0.146    0.254    0.347    0.207  
Brand C      0.104    0.100    0.163    0.321    0.172  
Brand D      0.078    0.129    0.206    0.249    0.165  
 
Average      0.132    0.174    0.255    0.372  

We have labeled the rows and columns for clarity, and also added a row and column containing averages for each brand and price. Comparing those averages to the main effects for Brand and Price, we see that they are identical to within .001. The similarity between main effects and averages of joint effects depends on having a balanced design with equal numbers of observations in all cells. That will only be true with large sample sizes and when there are no prohibitions.
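The row and column averages can be recomputed from the table as a check. The sketch below uses the Brand x Price proportions shown above and compares the averages with the main effects from the earlier report. Because the inputs are already rounded to three decimals, the recomputed averages can differ from the main effects by slightly more than .001 before rounding.

```python
# Brand x Price joint-effect proportions (columns: Price 1 to Price 4).
joint = {
    "Brand A": [0.262, 0.320, 0.398, 0.570],
    "Brand B": [0.083, 0.146, 0.254, 0.347],
    "Brand C": [0.104, 0.100, 0.163, 0.321],
    "Brand D": [0.078, 0.129, 0.206, 0.249],
}
# Main effects from the earlier report.
brand_main = {"Brand A": 0.387, "Brand B": 0.207, "Brand C": 0.173, "Brand D": 0.165}
price_main = [0.132, 0.175, 0.254, 0.372]

# Average across prices for each brand, and across brands for each price.
row_avg = {brand: sum(row) / len(row) for brand, row in joint.items()}
col_avg = [sum(col) / len(col) for col in zip(*joint.values())]

max_row_diff = max(abs(row_avg[b] - brand_main[b]) for b in joint)
max_col_diff = max(abs(c - m) for c, m in zip(col_avg, price_main))

print(row_avg)
print(col_avg)
print(max_row_diff, max_col_diff)
```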

Finally, counts data are ratio-scaled. A count proportion of 0.30 versus 0.15 means that respondents on average chose (preferred) the first level twice as much as the second. Since preference for an attribute level depends upon the desirability of the other alternatives within that same attribute, it is not appropriate to directly compare a count proportion from one attribute level to a level from a different attribute.



Some Notes on Counting Analysis


As mentioned at the beginning of this section, Counting analysis is a quick way to summarize the results of choice data. However, for getting the most from your CBC data, we recommend more sophisticated means of analysis, such as Logit, Latent Class, and HB. Counts analysis reflects some known biases that can prove problematic in some situations.

Given a large enough sample size, the number of times each level was displayed should be nearly balanced, in a 1-way, 2-way and even 3-way sense (though CBC's design methods only pay attention to 1- and 2-way representation). But with smaller sample sizes, random imbalances in the design can distort counts proportions. For example, if a particular brand level happened to be shown at a low price more often than other brands, the count proportion for that brand could be distorted upward. Other methods of analysis (Logit, Latent Class, and HB) are not subject to this difficulty.

The counting analysis produces different results from the logit approach. CBC's counting results are just the proportions of times that levels (or combinations of levels) are chosen when they are offered. Since all attributes are constantly varying, the counting result for a specific attribute level does not depend on that level alone. Even an extremely undesirable level may be chosen because it is paired with desirable levels. The result is that the main-effects counts proportions are biased in the direction of being flatter than the results of logit simulations, in which all omitted attributes are assumed to be constant.

Other effects may work in an opposite direction. Under the Complete Enumeration, Shortcut, and to a lesser degree the Balanced Overlap method, CBC tries not to show the same attribute level twice in the same task. This means that a strong level has weaker levels as competition, which tends to accentuate differences between levels, partly overcoming the effect described above. For example, consider the case of the brand x price joint effect counts table, often used to reflect demand curves.

If attributes have different numbers of levels, the maximum possible choice proportion differs between them, which can make it even more difficult to compare results between attributes.