Sawtooth Software: The Survey Software of Choice

Conference Nuggets: A Summary of Key Take-Aways

We've distilled some of the most important points from each of the talks. Obviously, we cannot do justice to each presentation in a few paragraphs. The full text of the papers will be available in the 1997 Sawtooth Software Conference Proceedings. If you haven't yet ordered your copy, please consider adding this valuable reference to your shelf. An order form is available on page 7 of this newsletter.

Overcoming the Problems of Special Interviews on Sensitive Topics: Computer Assisted Self-Interviewing Tailored for Young Children and Adolescents (de Leeuw, Hox, Kef, Hattum): The authors presented results from two studies: the first examined bullying in elementary schools; the second surveyed blind adolescents and young adults. Key findings were:

  • Respondents were more likely to share sensitive information under CASI.
  • CASI resulted in fewer missing values and tighter standard deviations than paper.
  • Counting all costs, CASI was significantly less expensive than the paper-based implementation.
  • Interviewers and respondents alike were generally pleased and comfortable with computerized interviews.

Best Practices in Interviewing Via the Internet (Karlan Witt): The rapid growth of the Internet has opened up faster, less costly ways of collecting data. "The Internet brings with it a host of unique limitations that impact any research effort in this area," Karlan explained. She placed the incidence of Internet access in the U.S. at 16% as of Q2 1996, and reported the relative incidence of browsers for Q4 1996: Microsoft Internet Explorer 6%, Netscape Navigator 37%, and the AOL browser 10%.

Karlan offered a great deal of advice regarding the use of this new medium, which we unfortunately cannot cover here due to space limitations. Two important points were:

  • Ensure that potential respondents have access to the Internet and are comfortable navigating to the desired web site and using their browser to take the survey.
  • Internet surveys must be tested under different platforms and browsers to ensure that the survey performs properly.

Karlan predicted that low barriers to entry will cause overuse of the Internet for conducting surveys. She warned that "Overuse and general abuse will likely lead to a backlash of potential respondents, similar to that currently seen in the telephone arena."

An Alternative Approach to Brand Price Trade-Off (Ray Poynter): While CBC is considered the tool of choice for pricing research in the U.S., Europe is still fond of the BPTO method. The traditional BPTO method focuses on just two attributes: brand and price. Respondents choose from a set of concepts (cards) with all brands starting at the lowest price. When a card is chosen, it is replaced by the same brand at a slightly higher price, while the non-chosen brands remain the same for the next task. Traditional BPTO has been faulted for encouraging patterned and unrealistic behavior.
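
As a point of reference, here is a minimal sketch of the traditional BPTO task flow just described. The brand list, price ladder, stopping rule, and function names are illustrative assumptions, not details from Ray's paper.

  # A minimal sketch of the traditional BPTO task flow: all brands start at the
  # lowest price, and only the chosen brand's price moves up for the next task.
  # The price ladder and stopping rule below are illustrative assumptions.

  PRICE_LADDER = [1.99, 2.49, 2.99, 3.49, 3.99]   # ascending price points

  def run_bpto(ask_choice, brands, max_tasks=20):
      """Present repeated choice tasks, raising the chosen brand's price each time.

      ask_choice(offer) receives a {brand: price} dict and returns the chosen
      brand, or None if the respondent would buy nothing.
      """
      price_index = {brand: 0 for brand in brands}   # everyone starts at the bottom rung
      history = []
      for _ in range(max_tasks):
          offer = {b: PRICE_LADDER[i] for b, i in price_index.items()}
          chosen = ask_choice(offer)
          if chosen is None:
              break                                  # respondent opts out; end the exercise
          history.append((offer, chosen))
          if price_index[chosen] + 1 < len(PRICE_LADDER):
              price_index[chosen] += 1               # only the chosen brand becomes more expensive
          else:
              break                                  # chosen brand already at the top of the ladder
      return history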

Ray showed creative ways to break the patterned behavior by:

  • Using more realistic starting prices for the first task,
  • Increasing the price for the chosen concept and simultaneously reducing the price for the non-chosen items,
  • Randomly removing brands from each choice set.

Ray programmed the improved BPTO task in Ci3, but commented that it takes a good Ci3 programmer. He also described how to calculate PEPs (Purchase Equilibrium Prices), the dollar amounts that make a respondent indifferent between two brands, and how to incorporate this information in a first-choice simulator.
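
The summary above does not spell out Ray's calculation, but the core indifference idea can be sketched as follows under the simplifying assumption of a linear price utility; the utility values and function name are hypothetical.

  # A hedged sketch of the purchase-equilibrium-price idea, assuming utility
  # falls linearly with price.  The numbers are illustrative, not from the paper.

  def purchase_equilibrium_premium(u_brand_a, u_brand_b, price_slope_per_dollar):
      """Return the extra price brand A can charge before the respondent is
      indifferent between A and B (the first-choice switching point)."""
      if price_slope_per_dollar >= 0:
          raise ValueError("price utility should decrease as price rises")
      # indifference: u_a + slope * premium == u_b  =>  premium = (u_b - u_a) / slope
      return (u_brand_b - u_brand_a) / price_slope_per_dollar

  # Example: brand A is worth 1.2 utiles more than brand B, and each extra dollar
  # costs 0.8 utiles, so A can charge about $1.50 more before losing this
  # respondent in a first-choice simulation.
  print(round(purchase_equilibrium_premium(2.0, 0.8, -0.8), 2))   # 1.5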

Creating End User Value with Multi-Media Interviewing Systems (Dirk Huisman): Dirk showed examples of how multi-media technology can enhance the realism of surveys. Making interviews better reflect the real world may result in better data. Dirk reported the results of a split-sample ACA interview in which some of the attributes were shown using multi-media. Interestingly, he found little difference between the utilities calculated from the text-based ACA and those from the multi-media ACA.

A Comparison of Full- and Partial-Profile Best/Worst Conjoint Analysis (Keith Chrzan, Ritha Fellerman): Best-Worst is a questioning technique that displays a product described on multiple attributes and asks respondents to identify the features that make them most and least want to purchase the product. "The most unique strength of best/worst conjoint analysis," the authors stated, ". . . is that it eliminates the arbitrariness of the scale origins of the individual attributes."

The authors presented results from a study comparing full- and partial-profile best/worst experiments. They noted that full and partial profile best/worst models may result in different estimates of preference structure and concluded: "Apparently there is something specific to best/worst measurement that makes it not work with partial profiles."

Efficient Experimental Designs Using Computerized Searches (Warren Kuhfeld): Warren introduced the concept of design efficiency and argued that orthogonality in conjoint experiments is less necessary today. Orthogonality was important in the days when computers were not widely available: if an orthogonal design was used, relatively simple formulas permitted ANOVA computations by hand or calculator. Today, general linear models such as OLS do not require orthogonality for the unbiased estimation of effects.

Warren explained the principles of orthogonality and balance, introduced the measure of D-efficiency, and compared two computerized search routines for finding efficient experimental designs: SAS's PROC OPTEX procedure and Sawtooth Software's CVA designer. For the size of designs commonly used in conjoint experiments, Warren found that the CVA routine produced designs about 97% as efficient as those from OPTEX, and that CVA's designs tended to be more balanced. He also found CVA easier to use than OPTEX. Warren concluded: "For small problems like you would typically encounter in a full-profile conjoint study, CVA seems to do an excellent job. However, for larger and more difficult problems, it often fails to find more efficient designs that can be found with PROC OPTEX."
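
For readers who want to see the measure in action, here is a minimal sketch of how D-efficiency is commonly computed for a coded design matrix, using the convention that an orthogonal, balanced design scores 100%. The tiny two-level example is our own illustration, not a design from the paper.

  # D-efficiency sketch: 100 * |X'X|^(1/p) / N, which equals 100 for a design
  # coded so that X'X = N * I (orthogonal and balanced).
  import numpy as np

  def d_efficiency(X):
      """D-efficiency of a coded design matrix X (rows = runs, cols = parameters)."""
      n_runs, n_params = X.shape
      info = X.T @ X                                   # the information matrix X'X
      return 100.0 * np.linalg.det(info) ** (1.0 / n_params) / n_runs

  # Three two-level attributes coded -1/+1, plus an intercept column.
  full_factorial = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
  X = np.column_stack([np.ones(8), full_factorial])
  print(round(d_efficiency(X), 1))      # 100.0 -- the full factorial is orthogonal and balanced
  print(round(d_efficiency(X[:6]), 1))  # lower -- an arbitrary 6-run subset loses efficiency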

Practical Ways to Minimize the IIA-bias in Simulation Models (Rainer Paffrath): Many conjoint simulations suffer from IIA problems, which can sometimes cause less-than-satisfactory results. Rainer reviewed the oft-cited red-bus/blue-bus example, which demonstrates how placing nearly identical products together in a conjoint simulator inflates the net share of like products. He pointed out weaknesses in Model 3 of the ACA, CBC and CVA simulators. Rainer contended that corrections for product similarity should be customized and usable at the individual level, and that each individual's importance structure should be taken into account.
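
The red-bus/blue-bus problem is easy to reproduce numerically. The small sketch below uses made-up utilities (not data from the paper) to show how a share-of-preference (logit) rule lets a near-duplicate product inflate the combined share of the pair.

  # Illustration of IIA share inflation under a logit (share-of-preference) rule.
  import math

  def logit_shares(utilities):
      expu = [math.exp(u) for u in utilities]
      total = sum(expu)
      return [e / total for e in expu]

  # Car vs. red bus, equally attractive: shares split 50/50 as expected.
  print([round(s, 2) for s in logit_shares([1.0, 1.0])])       # [0.5, 0.5]

  # Add a blue bus identical to the red bus.  IIA forces 1/3 each, so the two
  # buses together now claim 2/3 of the market instead of the expected 1/2.
  print([round(s, 2) for s in logit_shares([1.0, 1.0, 1.0])])  # [0.33, 0.33, 0.33]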

Design Considerations in Choice and Ratings-Based Conjoint (Jon Pinnell, Sherry Englert): Jon presented results from three choice studies in which the number of concepts (alternatives) was varied within and between respondents. He pointed out that a choice task with just two alternatives (A vs. B) yields only one inferred inequality (if A is chosen, A>B), whereas a first choice from a task with six concepts yields five inequalities (if A is chosen, A>B, A>C, A>D, A>E, A>F). Jon showed that the additional time required to answer more complex tasks is small relative to the value of the additional information gained.

After comparing part-worths from choice sets of 2, 4, and 7 alternatives, Jon concluded: ". . . our findings caution against the use of pairs. Our data show that pairs are processed differently, have lower predictive validity, are less stable, and don't save much time relative to larger tasks."

Extensions to the Analysis of Choice Studies (Tom Pilon): Tom presented some additional types of analysis that can be done using standard CBC data. He reported results from a beer study, and showed how cross-elasticities for brands could be calculated (by regressing the log of choice volume on the log of price) and incorporated into a market simulator.
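
The paper contains the details; as a rough sketch of the regression step, one brand's elasticities can be estimated along the following lines (the array layout and function name are our assumptions, not Tom's code).

  # Hedged sketch of the log-log regression step: regress the log of one brand's
  # choice volume on the logs of all brands' prices.  The coefficient on the
  # brand's own log price is its own-price elasticity; the coefficients on the
  # other brands' log prices are cross-elasticities.
  import numpy as np

  def price_elasticities(log_volume_a, log_prices):
      """log_volume_a: (n_scenarios,) log choice volumes for one brand
      log_prices:   (n_scenarios, n_brands) log prices in each simulated scenario
      Returns one row of the elasticity matrix for that brand."""
      X = np.column_stack([np.ones(len(log_volume_a)), log_prices])
      coefs, *_ = np.linalg.lstsq(X, log_volume_a, rcond=None)
      return coefs[1:]                                 # drop the intercept

Running the simulator over a grid of price scenarios and calling this once per brand would fill out the full matrix of own- and cross-elasticities.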

Tom argued that the standard logit simulator, which assumes constant cross-elasticities across brands, is not entirely realistic for the beer market. A cross-elasticity simulator lets brands that compete closely (i.e., are perceived as close substitutes) take relatively more share from one another as a result of price changes than from brands not perceived to be as substitutable. Tom also demonstrated how to convert a cross-elasticity matrix into a "brand similarities matrix" for use in an MDS perceptual map; brands that competed closely appeared near one another on the map.
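
The summary does not say exactly how the cross-elasticity matrix was converted to similarities, so the sketch below simply assumes one plausible rule: symmetrize each pair's cross-elasticities, treat stronger substitution as greater similarity, and hand the resulting dissimilarities to an off-the-shelf MDS routine. Treat it as an illustration of the workflow, not as Tom's method.

  # Assumed conversion: average each pair's cross-elasticities, map larger
  # similarity to smaller distance, and fit a 2-D MDS configuration.
  import numpy as np
  from sklearn.manifold import MDS

  def map_from_cross_elasticities(cross_elast, brand_names):
      """cross_elast[i, j] = elasticity of brand i's volume w.r.t. brand j's price."""
      sim = (cross_elast + cross_elast.T) / 2.0        # symmetrize the matrix
      np.fill_diagonal(sim, sim.max())                 # a brand is most similar to itself
      dissim = sim.max() - sim                         # larger similarity -> smaller distance
      coords = MDS(n_components=2, dissimilarity="precomputed",
                   random_state=0).fit_transform(dissim)
      return dict(zip(brand_names, coords))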

Respondents' Behavior in Complex Choice Tasks; A Segmentation-Based and Individual Approach (C.M. (Marco) Hoogerbrugge): Marco spoke of the superiority of individual-level models over aggregate models. Even though choice modeling has received more attention than traditional conjoint of late, most choice modeling is still done at the aggregate level. Marco compared two methods for segmenting choice data: Latent Class and K-Logit.

Latent Class segments the data based on choices, and each respondent receives a probability of membership in each group. K-Logit is much like cluster analysis in that it finds segments and assigns each respondent to one and only one segment. Marco reported that K-Logit is much faster than Latent Class, but that its results are less robust.

Individual Utilities from Choice Data (Rich Johnson): Rich presented a new method for calculating individual-level utilities from CBC data. He explained that Latent Class assumes each individual belongs to one group or another, with probabilities of membership summing to 100%. In the past, some researchers have calculated individual utilities by multiplying probabilities of membership by class utilities. Rich graphically demonstrated that probability weighting assumes all respondents lie between the Latent Class groups. Such solutions may fit average respondents well, but may improperly represent most cases. By recognizing that individual-level utilities can be calculated as a linear combination of group utilities in which the weights can be both positive and negative, his method captures more heterogeneity and better reflects individuals' positions.
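
Rich's estimation procedure is described in the paper; the rough sketch below only illustrates the central idea under our own simplifying assumptions: express one respondent's part worths as an unconstrained weighted sum of the latent class utility vectors, and pick the weights that best explain that respondent's own choices under a logit rule.

  # Rough sketch (not Rich's actual algorithm): fit signed weights over the
  # latent class utility vectors by maximizing one respondent's logit likelihood.
  import numpy as np
  from scipy.optimize import minimize

  def individual_utilities(class_utils, tasks, choices):
      """class_utils: (n_classes, n_params) latent class part worths
      tasks:       list of (n_alts, n_params) design matrices, one per choice task
      choices:     index of the alternative chosen in each task
      Returns the fitted individual-level part-worth vector."""
      n_classes = class_utils.shape[0]

      def neg_log_like(weights):
          utils = weights @ class_utils                   # linear combination of class utilities
          nll = 0.0
          for X, chosen in zip(tasks, choices):
              v = X @ utils                               # utility of each alternative in the task
              v = v - v.max()                             # for numerical stability
              nll -= v[chosen] - np.log(np.exp(v).sum())  # multinomial logit log-likelihood
          return nll

      start = np.full(n_classes, 1.0 / n_classes)         # begin at equal weighting
      result = minimize(neg_log_like, start, method="BFGS")
      return result.x @ class_utils                       # weights are free to be negative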

Rich compared results from Monte Carlo simulations and real data sets. His new method performed better than probability weighting in terms of R-squared with known utilities and hit rates for holdout choices. Rich commented that Hierarchical Bayes methods are probably the best overall approach for representing individual utilities from choice, but pointed out that computers are still too slow to make these useful in practice. Rich's method computes much more quickly and can be a practical solution for now.

He concluded, "One of the problems . . . with choice data, is that of predicting the market's response to complex combinations of interactions, differential cross effects, and varying similarities among products. It seems likely that all of these problems will be diminished when modeled at the individual level."

Assessing the Validity of Conjoint Analysis--Continued (Bryan Orme, Mark Alpert, Ethan Christensen): Bryan pointed out that despite over 20 years of conjoint research, very little actual evidence has been published about conjoint's ability to predict real world decisions. He suggested that holdout tasks commonly used in survey research may not be realistic, and that they may better gauge respondent consistency than validity.

Bryan presented the results of a pilot study in which respondents received both regular holdout choice tasks and a more intensive 10-minute exercise he termed the "Super Holdout Task." The attribute importances did not appear to differ between the two types of holdout tasks. Importances for traditional full-profile conjoint and CBC were shown to be more extreme than those for ACA. Bryan speculated that the Super Holdout Task might not have been realistic enough to accurately reflect the real world, and challenged the attendees to publish validation studies with actual purchase data.

Solving the Number of Attribute Levels Problem in Conjoint (Dick Wittink, Bill McLauchlan, P.B. Seetharaman): Dick is credited with discovering the number of attribute levels effect in conjoint, and is probably the leading expert on that topic. In his presentation, he demonstrated that researchers can dramatically increase the importance that attributes receive by simply increasing the number of levels on which attributes are described, and provided quantitative evidence on how dramatic this effect can be. He commented that ACA is less susceptible to the number of levels effect than full-profile. Dick argued that the source of the effect in ACA can be largely attributed to two factors:

  • the lack of perfect utility balance in the pairs design,
  • the propensity of respondents to answer toward the middle of the scale even though the predicted response would be more extreme.

When respondents "split the difference" between the predicted value and the midpoint in paired comparison ratings, it tends to increase the importance of the attribute defined on more levels.

A customized version of ACA was developed to achieve better utility balance by expanding the number of levels for more important attributes. A split-sample study compared standard ACA with the customized version; holdout hit rates were higher for the customized version.

What We Have Learned from 20 Years of Conjoint Research: When to Use Choices vs. Graded-Pairs vs. Full Profile Ratings (Joel Huber): Joel pointed out that respondents adopt different strategies for answering different types of conjoint questions. Researchers should understand these simplification strategies and match the right method to the context of actual marketplace decisions. He summarized the strengths of the methods as follows:

  • Self-explicated models are best in the case of many attributes, where expectations about levels and associations among attributes are stable. They work better in predicting decisions about independent alternatives than for competitive contexts.
  • Paired comparisons are most appropriate for modeling markets in which alternatives are explicitly compared with one another, approximating a deeper search of a broad range of attributes, and where within-attribute value steps are smooth and approximately linear.
  • Full-Profile works best when it is desirable to abstract from short-run beliefs, when market choices reflect simplification toward the most important variables, and when the decision focus is within each alternative rather than on explicit side-by-side comparisons between options.
  • Choice is most appropriate for simulating immediate response to competitive offerings, when decisions are made based on relatively few attributes with substantial aversion to the worst levels of each attribute, and when consumers make decisions based on comparative differences among attributes.

In contrast to the growing consensus regarding the superiority of choices, Joel cautioned that choices may not always work better than more traditional approaches.

Perceptual Mapping for the Current Millennium (John Fiedler, Tom Wittenschlager): John reviewed why he prefers discriminant analysis (DA) based perceptual mapping over Correspondence Analysis. Among many reasons, he listed:

  • DA produces more interpretable relationships between attribute vectors and product points than Correspondence Analysis does.
  • DA is more efficient at packing a large amount of information into a low-dimensional space.

John showed a perceptual map he created and the various refinements it underwent to reflect the market in the most meaningful way for his client. He argued that the APM software package is elegant in its approach, but that the software is outdated. He provided SPSS code and steps for creating DA maps using APM's method.
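
John's SPSS code is in the paper; purely as an illustration of the general DA-mapping idea, a rough Python sketch might look like the following. The function, data layout, and attribute-vector convention are our assumptions, not John's procedure.

  # Not John's SPSS code -- a rough sketch of DA-based mapping: brand is the
  # grouping variable, attribute ratings are predictors, and brands are plotted
  # as centroids in the space of the first two discriminant functions.
  # Assumes at least three brands so that two discriminant functions exist.
  import numpy as np
  from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

  def da_map(ratings, brand_labels, attribute_names):
      """ratings: (n_rows, n_attributes), one row per respondent-brand rating
      brand_labels: the brand rated in each row"""
      lda = LinearDiscriminantAnalysis(n_components=2)
      scores = lda.fit_transform(ratings, brand_labels)        # discriminant scores per row

      labels = np.array(brand_labels)
      centroids = {b: scores[labels == b].mean(axis=0) for b in sorted(set(brand_labels))}

      # attribute vectors: correlation of each rating with the two discriminant axes
      vectors = {name: (np.corrcoef(ratings[:, j], scores[:, 0])[0, 1],
                        np.corrcoef(ratings[:, j], scores[:, 1])[0, 1])
                 for j, name in enumerate(attribute_names)}
      return centroids, vectors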

John recommended that respondents not rate brands only on the stated most important attributes. "Restricting ratings to 'most important' attributes may overlook attributes critical to marketplace differentiation," he argued. To maximize the value of each respondent's contribution toward a meaningful and discriminating map, John recommended that each respondent rate more products at the expense of attributes. He maintained that "It is a waste to have a respondent rate only one or two brands on dozens of attributes when he or she could rate five or six brands on seven or eight attributes."

Obtaining Product-Market Maps from Brand Preferences (Terry Elrod): Terry applied a different mapping technique to the same data set used by John Fiedler. John's map had been based entirely on ratings of brands on attributes, whereas Terry's map was based entirely on brand preferences. Terry's technique is a maximum likelihood method that assumes a continuous distribution of individual preferences and finds the brand locations and preference distribution that together best fit the data. Terry noted that his map appeared similar to John's, but that John's had required several re-computations to incorporate client reactions, whereas his own was based on the data alone. Terry's method is so computationally intensive that it was not feasible until recently, but it may become a more useful approach as computer speeds continue to improve.

An Integrated Choice Likelihood Model (Carl Finkbeiner): Carl noted that several different kinds of data are useful in studying respondent preferences, although current methods usually employ only one type of data at a time. However, there may be benefit in being able to combine data of several types to estimate part worths for each respondent. Carl described an "Integrated Choice Likelihood Model" which does this by integrating self-explicated ratings of attributes, full- (or partial-) profile conjoint choice likelihood ratings, and choice or constant sum ratings. The model can also be used to estimate choice probabilities for new products, and is not subject to IIA difficulties.

Neural Networks and Statistical Models (Tony Babinec): Tony discussed neural networks, stating that he preferred to regard them as a flexible form of regression or discriminant analysis rather than "simulated biological intelligence." He observed that "In reality, as in conventional statistical modeling, one must invest a lot of 'sweat equity' and think through one's problem when applying neural nets." He recommended their use when (see the brief sketch following this list):

  • The functional form relating input variables to the response variable is not known or well understood, but is not thought to be linear.
  • There is a large sample of data.
  • A premium exists for better prediction that makes it worth the added effort to fit a well-tuned neural network.
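
To make the "flexible form of regression" framing concrete, here is a small synthetic illustration using our own made-up data (nothing from Tony's talk): the same fitting interface estimates an ordinary linear regression and a small multilayer perceptron, and on a deliberately nonlinear response the network recovers structure the linear model cannot.

  # Synthetic illustration of a neural net as flexible regression.
  import numpy as np
  from sklearn.linear_model import LinearRegression
  from sklearn.neural_network import MLPRegressor

  rng = np.random.default_rng(0)
  X = rng.uniform(-3, 3, size=(2000, 2))
  y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.1, size=2000)   # nonlinear response

  linear = LinearRegression().fit(X[:1500], y[:1500])
  net = MLPRegressor(hidden_layer_sizes=(25, 25), max_iter=2000,
                     random_state=0).fit(X[:1500], y[:1500])

  print("linear R^2: ", round(linear.score(X[1500:], y[1500:]), 2))   # near zero
  print("network R^2:", round(net.score(X[1500:], y[1500:]), 2))      # substantially higher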