How CVA Calculates Utilities


Ordinary Least Squares (OLS)


This section assumes the reader has basic knowledge of linear regression.  However, understanding of the technical details in this section is not essential for using CVA.


The OLS calculator does ordinary least squares linear regression using a "dummy variable" approach.  A vector of independent variables for each observation (conjoint question) is built using information from the experimental design.  The vector has elements of 1, 0, or -1, depending on whether respective attribute levels appear in that question, and whether they appear on the left-hand (-1) or right-hand side (+1) of the pairwise questions.
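As an illustration of the coding just described, the following Python sketch builds the vector for one pairwise question.  The function name, its arguments, and the 0-based level indices are assumptions made for this example and are not part of CVA.  The sketch also assumes that a level shown on both sides of a pair nets to 0.

```python
def code_pairwise_question(left, right, levels_per_attribute):
    """Return one row of 1/0/-1 codes for a pairwise conjoint question.

    left, right: 0-based level index shown for each attribute in the
    left-hand and right-hand concepts (hypothetical input layout).
    levels_per_attribute: number of levels in each attribute.
    """
    row = []
    for attr, n_levels in enumerate(levels_per_attribute):
        codes = [0] * n_levels
        codes[left[attr]] -= 1   # level appears on the left-hand side
        codes[right[attr]] += 1  # level appears on the right-hand side
        row.extend(codes)
    return row


# Hypothetical design: attribute one has 3 levels, attribute two has 2.
row = code_pairwise_question(left=[0, 1], right=[2, 1],
                             levels_per_attribute=[3, 2])
print(row)  # [-1, 0, 1, 0, 0]
```

This row still contains a column for every level; the omission of the first level of each attribute is described further on.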


The dependent variable for each observation is obtained by applying the indicated recoding transformation to the corresponding data value.  If the data value has the code reserved for missing data, then that observation (conjoint question) is not included in the calculation.


When using regression to estimate conjoint utilities, it is customary to delete one level of each attribute from the computation.  Otherwise, there is a linear dependence among the variables describing levels of each attribute, which leads to indeterminacy in the computation.  Omitting one level of each attribute from the computation is equivalent to setting its utility at zero, with the other levels measured as contrasts with respect to zero.  The OLS calculator omits the first level of each attribute from the regression computation.  Thus, if there are k attributes with a total of n levels, the regression is done with only n-k independent variables.  The indeterminacy could also be handled by adding side conditions, such as requiring that the utilities for each attribute sum to some constant.  However, our approach has the advantage of greater computational speed.


An intercept term is also computed by CVA, but it is not reported separately.  Since the utilities will most often be used by summing one value from each attribute, the intercept is divided by the number of attributes and that fraction is added to every utility value.  Thus the first level of each attribute, which would otherwise be zero, will be equal to the intercept divided by the number of attributes.
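The arithmetic can be sketched in a few lines of Python.  The function name and the numbers are hypothetical and serve only to show that summing one utility per attribute recovers the intercept automatically.

```python
def spread_intercept(coefficients, intercept):
    """Fold the intercept into the utilities, one equal share per attribute.

    coefficients: one list per attribute holding the regression
    coefficients of levels 2, 3, ... (level 1 was omitted, so it is 0).
    """
    share = intercept / len(coefficients)
    return [[share] + [c + share for c in coefs] for coefs in coefficients]


# Hypothetical results: intercept 5.0, attribute one with 3 levels,
# attribute two with 2 levels.
utilities = spread_intercept([[2.0, 3.0], [1.0]], intercept=5.0)
print(utilities)  # [[2.5, 4.5, 5.5], [2.5, 3.5]]

# Product made of level 2 of each attribute: 4.5 + 3.5 = 5.0 + 2.0 + 1.0
print(utilities[0][1] + utilities[1][1])  # 8.0
```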


The "r squared" value for each respondent does not contain any correction for degrees of freedom.  If the number of observations is equal to the number of parameters being estimated (levels - attributes + 1), then the regression fits the data exactly and the r squared value will be unity.


If the design is deficient -- containing either too few observations to permit estimation or insufficient information for a particular attribute level -- then a message to that effect will appear on the screen and utilities will not be estimated for that respondent.


If there are degrees of freedom available for error, then descriptive data will be written to a log file with information about the precision of estimation.


A statistic ("rms cor") is provided for each respondent, which describes the amount of correlation among the independent variables.  It is the "root mean square" of off-diagonal elements of the correlation matrix for the n-k independent variables.  Subtle relationships among variables can easily lead to faulty designs that would not be detected, and therefore we caution against paying much attention to this statistic.  We include it only because our users may be accustomed to similar statistics in other software packages.


In an orthogonal design with no missing data, this value will be either zero or a small positive number.  (Even orthogonal designs have nonzero correlations among the variables within an attribute.)  Orthogonal designs are sometimes altered to eliminate "nonsense" questions, and this compromises orthogonality.  Also, some design procedures (CVA's questionnaire design module, for example) produce well-balanced but not perfectly orthogonal designs.
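The following Python sketch shows one way such a statistic could be computed from the definition given above (root mean square of the off-diagonal correlations).  It is not CVA's code; the function names and the small full-factorial design are assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length columns (neither constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rms_cor(columns):
    """Root mean square of the off-diagonal correlations among columns.

    The matrix is symmetric, so averaging over the upper triangle gives
    the same result as averaging over all off-diagonal elements.
    """
    squares = [pearson(columns[i], columns[j]) ** 2
               for i in range(len(columns))
               for j in range(i + 1, len(columns))]
    return math.sqrt(sum(squares) / len(squares))


# Hypothetical 3 x 2 full factorial, single-concept coding, with the
# first level of each attribute omitted.
a2 = [0, 0, 1, 1, 0, 0]  # attribute one, level 2
a3 = [0, 0, 0, 0, 1, 1]  # attribute one, level 3
b2 = [0, 1, 0, 1, 0, 1]  # attribute two, level 2
print(round(rms_cor([a2, a3, b2]), 3))  # 0.289
```

In this example the only nonzero correlation (-0.5) is between the two columns of the same attribute, which is why the statistic is not zero even though the design is orthogonal.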


For each individual, standard errors of utility estimates are provided (within a log file), except for the first level of each attribute, which is assumed to have utility of zero.  These standard errors may also be regarded as standard errors of differences between each level and the first level of that same attribute.  These standard errors can be of diagnostic value.  Attribute levels with large standard errors should be given more attention in questionnaire design.  They may appear in too few questions, or they may occur in patterns that compromise the level of independence necessary for good estimation.


For more information on utility estimation, see Avoiding Linear Dependency.




Monotone (Nonmetric) Regression


This option for calculating utilities uses a method similar to that described by Richard M. Johnson in "A Simple Method of Pairwise Monotone Regression", Psychometrika, 1975, pp. 163-168.


The method is iterative, finding successive solutions for utility values that fit the data increasingly well.  An initial solution is developed, either randomly or using information in the experimental design.  Two measures of goodness of fit are reported: theta and tau.




Suppose the conjoint questionnaire presented concepts one at a time and asked for a rank order of preference.  Although there would have been many concepts in the questionnaire, consider just four of them, concepts P, Q, R, and S.  Suppose the respondent ranked these concepts 7, 9, 13, and 17, respectively, and at some intermediate stage in the computation, utilities for these concepts are estimated as follows:


           Estimated  Preference
Concept    Utility    Rank

  P           4.5      7
  Q           5.6      9
  R           1.2     13
  S          -2.3     17


We want to measure "how close" the utilities are to the rank orders of preference.


One way to measure this is to consider all of the possible pairs of concepts, and to ask for each pair whether the member with the more favorable rank also has the higher utility.  Since these are rank orders of preference, smaller ranks indicate preference, so we know that:


 Preference            Utility      Squared
                       Difference   Difference

 P is preferred to Q   -1.1           1.21
 P is preferred to R    3.3          10.89
 P is preferred to S    6.8          46.24
 Q is preferred to R    4.4          19.36
 Q is preferred to S    7.9          62.41
 R is preferred to S    3.5          12.25
                                    ------
                        Total       152.36


Of the six pairs, five have utility differences with the correct signs (the preferred product has the higher utility), and one pair has a utility difference with the wrong sign.


Kendall's tau is a way of expressing the amount of agreement between the preferences and the estimated utilities.  It is obtained by subtracting the number of "wrong" pairs from the number of "right" pairs, and then dividing this difference by the total number of pairs.   In this case,


tau = (5 - 1) / 6 = .667


A tau value of 1.000 would indicate perfect agreement in a rank order sense.  A tau of 0 would indicate complete lack of correspondence, and a tau of -1.000 would indicate a perfect reverse relationship.
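The pair counting can be reproduced with a short Python sketch.  The function name is hypothetical, and treating tied pairs as neither "right" nor "wrong" is an assumption of this sketch rather than something stated for CVA.

```python
from itertools import combinations


def kendall_tau(utilities, ranks):
    """(right pairs - wrong pairs) / total pairs; smaller rank = preferred."""
    right = wrong = 0
    for i, j in combinations(range(len(ranks)), 2):
        # Positive when the concept with the better (smaller) rank
        # also has the higher utility.
        agreement = (utilities[i] - utilities[j]) * (ranks[j] - ranks[i])
        if agreement > 0:
            right += 1
        elif agreement < 0:
            wrong += 1
    total_pairs = len(ranks) * (len(ranks) - 1) // 2
    return (right - wrong) / total_pairs


# Concepts P, Q, R, S from the table above.
tau = kendall_tau([4.5, 5.6, 1.2, -2.3], [7, 9, 13, 17])
print(round(tau, 3))  # 0.667
```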


Tau is a convenient way to express the amount of agreement between a set of rank orders and other numbers, such as utilities for concepts.  However, it is not very useful as a measure on which to base an optimization algorithm.  As a solution is modified to fit increasingly well, its tau value will remain constant and then suddenly jump to a higher value.  Some other measure is required that is a continuous function of the utility values.




For this purpose we use the statistic "theta."  Theta is obtained from the squared utility differences in the last column of the table above.  We sum the squares of those utility differences that are in the "wrong order," divide by the total of all the squared utility differences, and then take the square root of the quotient.  Since there is only one difference in the wrong direction,


theta = square root(1.21/152.36) = .089


Theta can be regarded as the percentage of information in the utility differences that is incorrect, given the data.  The best possible value of theta is zero, and the worst possible value is 1.000.
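The same four concepts can be used to check the theta calculation in Python.  This is a sketch of the definition given above, not CVA's implementation; the function name is hypothetical, and skipping tied ranks is an assumption.

```python
import math
from itertools import combinations


def theta(utilities, ranks):
    """sqrt(sum of squared wrong-order differences / sum of all squared differences)."""
    wrong = total = 0.0
    for i, j in combinations(range(len(ranks)), 2):
        if ranks[i] == ranks[j]:
            continue  # assumption: tied pairs carry no order information
        # Orient each pair as (preferred, other); smaller rank = preferred.
        pref, other = (i, j) if ranks[i] < ranks[j] else (j, i)
        diff = utilities[pref] - utilities[other]
        total += diff ** 2
        if diff < 0:  # preferred concept has the lower utility
            wrong += diff ** 2
    return math.sqrt(wrong / total)  # undefined if all utilities are equal


# Concepts P, Q, R, S from the table above.
print(round(theta([4.5, 5.6, 1.2, -2.3], [7, 9, 13, 17]), 3))  # 0.089
```

Unlike tau, this value changes smoothly as the utilities change, which is what makes it usable as an optimization target.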


Now that we have defined theta, we can describe the nature of the computation.


The process is iterative.  It starts with random values as estimates of the partworths.  In each iteration a direction of change (a gradient vector) is found which is most likely to yield an improvement in the partworths.  A number of small changes are made in that direction, which continue as long as theta improves.   Each iteration has these steps:


1. Obtain the value of theta for the current estimates of partworths and a direction (gradient) in which the solution should be modified to decrease theta most rapidly.


2. Try a small change of the partworths in the indicated direction, which is done by subtracting the gradient vector from the partworth vector and renormalizing the partworth estimates so as to have a sum of zero within each attribute and a total sum of squares equal to unity.  Each successive estimate of utilities is constrained as indicated by the a priori settings or additional utility constraints.


3.  Re-evaluate theta.  If theta is smaller than before, the step was successful, so we accept the improved estimates and try to obtain further improvement using the same procedure again, by returning to step (2).  If theta is larger than before, we have gone too far, so we revert to the previous estimate of partworths and begin a new iteration by returning to step (1).


If any iteration fails to improve theta from the previous iteration, or if theta becomes as small as 1e-10, the algorithm terminates.   A maximum of 50 iterations are permitted, and within any iteration a maximum of 50 attempts at improvement are permitted.  In theory, the iterations could continue almost indefinitely with a long series of very small improvements in theta.  For this reason it is useful to place a limit on the number of iterations.
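The control flow of steps 1 to 3 and the stopping rules can be sketched in Python as follows.  This is only an outline under stated assumptions: the objective, gradient, and normalization functions are supplied by the caller and are hypothetical placeholders, the utility constraints mentioned in step 2 are left out, and the usage example substitutes a simple sum of squares for theta so that the sketch stays self-contained.

```python
def improve(theta_fn, gradient_fn, normalize, start,
            max_iterations=50, max_attempts=50, tolerance=1e-10):
    """Repeat steps 1-3 until theta stops improving or is small enough."""
    current = normalize(start)
    best = theta_fn(current)
    for _ in range(max_iterations):
        if best <= tolerance:
            break
        gradient = gradient_fn(current)              # step 1
        improved = False
        for _ in range(max_attempts):
            trial = normalize([w - g for w, g in zip(current, gradient)])  # step 2
            value = theta_fn(trial)                  # step 3
            if value < best:
                current, best, improved = trial, value, True
            else:
                break  # went too far: keep the previous estimate
        if not improved:
            break      # this iteration did not improve theta
    return current, best


# Stand-in objective for demonstration only (NOT theta).
def toy_theta(w):
    return sum(x * x for x in w)


def toy_gradient(w):
    return [0.2 * x for x in w]


def identity(w):
    return list(w)


solution, value = improve(toy_theta, toy_gradient, identity, [1.0, -2.0])
print(value < 1e-10)  # True
```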


To avoid the possibility of stumbling into a bad solution due to a poor starting point, the process is repeated 5 separate times from different starting points.  For each respondent, the weighted average of the five resulting vectors of partworths is computed (weighted by tau, where any negative tau is set to an arbitrarily small positive number).  A weighted tau is also reported with this final estimate of partworth utilities.
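A minimal Python sketch of the weighted averaging is shown below.  The function name and the floor value of 1e-6 are assumptions (the text says only "arbitrarily small"), and the sketch also applies the floor to a tau of exactly zero.  How the reported weighted tau is computed is not specified here, so it is not included.

```python
def combine_solutions(solutions, taus, floor=1e-6):
    """Average partworth vectors, weighting each by its tau.

    solutions: equal-length partworth vectors, one per starting point.
    taus: the tau obtained for each solution.
    floor: hypothetical small positive weight used for non-positive taus.
    """
    weights = [t if t > 0 else floor for t in taus]
    total = sum(weights)
    n = len(solutions[0])
    return [sum(w * s[i] for w, s in zip(weights, solutions)) / total
            for i in range(n)]


# Two hypothetical solutions with equal tau average to their midpoint.
print(combine_solutions([[1.0, -1.0], [3.0, -3.0]], [0.5, 0.5]))  # [2.0, -2.0]
```

A solution with a negative tau therefore contributes almost nothing to the final estimate.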




How CVA Utilities Are Scaled


Monotone regression


CVA's monotone regression utility calculator scales utilities in a way that is easy to describe and to understand.  For each respondent, the values for each attribute have a mean of zero, and their sum of squares across all attributes is unity.  Here is an example, assuming two attributes, one with 3 levels and one with 2 levels:


                        utility   square

Attribute One Level 1     .50      .25
              Level 2     .10      .01
              Level 3    -.60      .36

Attribute Two Level 1     .44      .19
              Level 2    -.44      .19
                         ----     ----
Total                    0.00     1.00
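This scaling can be expressed as a short Python sketch: center the values within each attribute, then divide everything by the square root of the total sum of squares.  The function name and the starting values are hypothetical.

```python
import math


def normalize_partworths(partworths):
    """Zero mean within each attribute, unit sum of squares overall.

    partworths: one list of values per attribute (not all constant).
    """
    centered = [[v - sum(attr) / len(attr) for v in attr]
                for attr in partworths]
    scale = math.sqrt(sum(v * v for attr in centered for v in attr))
    return [[v / scale for v in attr] for attr in centered]


# Hypothetical unscaled values for a 3-level and a 2-level attribute.
scaled = normalize_partworths([[5.0, 3.0, 1.0], [2.0, 0.0]])
print([[round(v, 2) for v in attr] for attr in scaled])
print(round(sum(v * v for attr in scaled for v in attr), 6))  # 1.0
```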


OLS regression


CVA's OLS utility calculator scales utilities in a way that depends upon the data, and upon the researcher's use of the recode capabilities.  The calculation has these steps:


1.  If automatic recoding was specified, then the data are automatically recoded.  If no recode was specified, the values in the data file are used without modification.


2.  An array of "dummy" variables is constructed with a row for each conjoint question and a column for each attribute level.  Each cell of this array has a value of 1, 0, or -1, depending on the experimental design.  For single-concept presentation, the values are either 1 or 0.  If the level appears in the concept it is coded as 1, and if absent it is coded as 0.  For pairwise presentation, a level that appears in the left-hand concept is coded as -1, and a level that appears in the right-hand concept is coded as 1.  If an attribute level does not appear in either profile, then the corresponding array element is 0.


3.  The first level (column) for each attribute is omitted temporarily from the design, which avoids technical problems of indeterminacy in the solution. (See Avoiding Linear Dependency)


4.  OLS regression is used to predict the transformed data values from the surviving columns of the array (variables).  A regression coefficient is computed for each variable, as well as a single intercept.  The regression coefficients for the omitted variables are assumed to be zero.  


5.  The intercept is divided by the number of attributes, and the quotient is added to every regression coefficient, including those previously assumed to be zero.  The resulting values are reported as utilities for the attribute levels.  (The intercept is handled in this way to make it easy to calculate total utilities for products during simulations.  Since each product to be simulated will have exactly one level from each attribute, the simulator will be able to include the intercept automatically just by adding the utilities of its attribute levels.)
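Steps 2 through 5 can be sketched end to end in Python for the single-concept case.  This is an illustration under stated assumptions, not CVA's code: the function names, the 0-based level indices, the use of the normal equations with Gauss-Jordan elimination, and the tiny data set are all choices made for the example, and recoding (step 1) is assumed to have been done already.

```python
def solve(a, b):
    """Solve the square system a x = b (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("deficient design: utilities cannot be estimated")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_utilities(concepts, ratings, levels_per_attribute):
    """concepts: one list of 0-based level indices per question."""
    # Steps 2-3: dummy columns without the first level of each attribute;
    # the leading 1.0 is the column for the intercept.
    rows = []
    for concept in concepts:
        row = [1.0]
        for attr, n_levels in enumerate(levels_per_attribute):
            row += [1.0 if concept[attr] == lvl else 0.0
                    for lvl in range(1, n_levels)]
        rows.append(row)
    # Step 4: least squares via the normal equations (X'X) b = X'y.
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(p)]
    beta = solve(xtx, xty)
    # Step 5: add an equal share of the intercept to every level,
    # including the omitted first levels (coefficient 0).
    share = beta[0] / len(levels_per_attribute)
    utilities, pos = [], 1
    for n_levels in levels_per_attribute:
        utilities.append([share] + [b + share
                                    for b in beta[pos:pos + n_levels - 1]])
        pos += n_levels - 1
    return utilities


# Hypothetical data: two 2-level attributes, four rated concepts.
concepts = [[0, 0], [0, 1], [1, 0], [1, 1]]
ratings = [5.0, 6.0, 7.0, 8.0]
print(ols_utilities(concepts, ratings, [2, 2]))  # [[2.5, 4.5], [2.5, 3.5]]
```

In this example the intercept is 5, so each first level receives 2.5, and the utilities of any concept sum to its rating (for instance 4.5 + 3.5 = 8).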


As can be seen from this explanation, with the OLS calculator the scaling of the utilities is completely under the control of the researcher.  Other things being equal, if the data are collected on a 100-point scale, the utilities will be about ten times as large as they would be if the data were collected on a 10-point scale.


