
SS Winter 2007


Randomized First Choice Simulations: Strengths, Weaknesses, and a New Refinement in SMRT v4.14

Reflections

We’ve now had about eight years of experience working with the Randomized First Choice (RFC) simulation model. It’s the default method for market simulation within the SMRT platform. In general, it has worked well. But as might be expected, we’ve learned a few things along the way, both about how to improve RFC and about some of its weaknesses.

Eight years ago, most Sawtooth Software customers were using aggregate models: logit or latent class. RFC provided a clear benefit in these cases. Lately, most Sawtooth Software customers are using part-worths estimated under HB. According to our March 2006 customer survey, 68% of Sawtooth Software users typically use HB-estimated part-worth utilities for their final market simulation models. Once one obtains individual-level utility estimates, RFC typically provides only modest improvements. Thus, the popularity and effectiveness of HB have in turn reduced the impact that RFC has in our industry.

“IIA Meltdown”

However, there may be situations in which RFC simulations offer significant benefit over standard logit-rule (Share of Preference) simulations. At the 2004 Sawtooth Software Conference, Allenby et al. pointed out that standard HB models can face what they termed “IIA Meltdown” when the choice design includes very many alternatives (such as 84 alternatives in a beverage category, or even more in the automobile category). Although they proposed a model different from RFC, their finding that standard HB simulators face greater IIA troubles with large numbers of alternatives suggests that RFC may be even more useful in these cases.

Weakness with RFC and Price

We have also noted a weakness with RFC simulations. One of the problems with simple Share of Preference (logit rule) simulations is that they make no correction for product similarity, even though some such correction is often useful. A strength of the RFC model is that such corrections occur automatically. The simple RFC model assumes that all attributes involve a correction for product similarity. However, it is not clear that this is always useful. Price, for instance, is an attribute for which it isn’t clear that a correction for product similarity should be made in the way it is for attributes like brand and form factor.
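To make the product-similarity problem with the logit rule concrete, here is a minimal sketch in Python. The function name and the utilities are hypothetical and chosen only for illustration; the logit rule itself is the standard exp(u) / sum of exp(u) formula.

```python
import math

def share_of_preference(utilities):
    """Logit-rule shares: exp(u) / sum(exp(u)), computed stably."""
    m = max(utilities)
    exp_u = [math.exp(u - m) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Two equally attractive products (hypothetical utilities).
two_products = share_of_preference([1.0, 1.0])

# Add an exact duplicate of the second product. The logit rule has no
# notion of similarity, so it treats the duplicate as a brand-new option.
with_duplicate = share_of_preference([1.0, 1.0, 1.0])

print(two_products)    # each product gets one half
print(with_duplicate)  # each product gets one third
```

Under the logit rule the first product falls from one half to one third even though its only new competitor is a copy of a product that was already there. Intuitively, the two identical products should mostly split the share the original already held, which is the kind of correction RFC is designed to supply.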

Many analysts like to derive demand curves via sensitivity analysis within choice simulators. Under RFC, if all products are first aligned on the average price (and the “test” product systematically varied across all price levels), an unwanted kink can occur in the demand curves around the point that was artificially chosen as the average (reference) price. The kink is often slight and harmless when price sensitivity is strong and few products are in the simulation. But, as the number of products in the simulation increases and/or price has less impact, the kink becomes more noticeable and problematic.

In the example below, sixteen products were included in the simulation. We derived an estimated demand curve for the first product by holding the remaining 15 products constant at price 3 and systematically varying the price for just product 1. We plot the share of preference (under both RFC and “Share of Preference” simulation models) for product 1 below, as the price varies from level 1 to level 5.

When product 1 is changed from the average (price 3) to either the next lower or higher price points, it receives an extra boost in share due to becoming unique with respect to price. The penalty for product similarity results in a dip in share when product 1 is aligned with the competitors at price 3. Sometimes, the dip is deep enough that it can lead to a “reversal,” where increasing the price from level 3 to level 4 results in an apparent share increase.

Setting all products to price 3 for purposes of sensitivity simulations creates an unrealistic degree of similarity with respect to price. In the real world, we wouldn’t expect all products to carry the same price. This issue is especially problematic for RFC when conditional pricing is used, and thus level 3 of price really refers to different price values, depending on the brand (or other conditional attribute). When using conditional pricing, it clearly would make sense to turn off RFC’s correction for similarity with respect to the price attribute, but to retain it for other attributes.

RFC’s “automatic” correction for product similarity is produced by adding random error to respondents’ part-worths for each “draw,” with the same error values applying to alternatives that share the same level of an attribute. To turn off similarity correction for an attribute like price, we simply apply independent error (rather than correlated “attribute type” error) to the price part-worths for all alternatives in the simulation. This leads to a more sophisticated RFC simulator, where some attributes involve correlated error (when product alternatives share the same levels of those attributes) and other attributes involve uncorrelated errors (when product alternatives receive independent random error draws for these attributes, irrespective of shared levels).
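The mechanism described above can be sketched in a few lines of Python. This is not the SMRT implementation: it simulates a single respondent, draws normally distributed errors with a common standard deviation, and uses hypothetical part-worths and product definitions. All of those are assumptions made only to show the difference between correlated ("attribute type") error and independent error.

```python
import random

def rfc_shares(products, partworths, correlated, n_draws=20000, sd=1.0, seed=0):
    """First-choice shares for one respondent under a simplified RFC rule.

    products:   list of dicts mapping attribute name -> level
    partworths: dict mapping (attribute, level) -> utility
    correlated: dict mapping attribute name -> True for shared
                ("attribute type") error, False for independent error
    """
    rng = random.Random(seed)
    wins = [0] * len(products)
    for _ in range(n_draws):
        shared = {}  # one error per (attribute, level), reused across products
        best, best_u = 0, float("-inf")
        for i, product in enumerate(products):
            u = 0.0
            for attr, level in product.items():
                if correlated[attr]:
                    # Products sharing this level receive the same error.
                    key = (attr, level)
                    if key not in shared:
                        shared[key] = rng.gauss(0.0, sd)
                    err = shared[key]
                else:
                    # Fresh error for every product, even if levels match.
                    err = rng.gauss(0.0, sd)
                u += partworths[(attr, level)] + err
            if u > best_u:
                best, best_u = i, u
        wins[best] += 1  # first choice: the highest utility wins the draw
    return [w / n_draws for w in wins]

# Hypothetical part-worths and products, for illustration only.
partworths = {
    ("brand", "A"): 0.5, ("brand", "B"): 0.0,
    ("price", 3): 0.0, ("price", 4): -0.4,
}
products = [
    {"brand": "A", "price": 3},
    {"brand": "A", "price": 4},
    {"brand": "B", "price": 3},
]

all_correlated = rfc_shares(products, partworths, {"brand": True, "price": True})
price_independent = rfc_shares(products, partworths, {"brand": True, "price": False})
print(all_correlated)
print(price_independent)
```

In the second call, brand keeps its similarity correction while price does not, which mirrors the per-attribute choice described above. The sketch breaks exact ties in favor of the earliest product, so it should not be used with identical products when every attribute is set to correlated error.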

We are now offering the ability to treat attributes differently with respect to correction for similarity (correlated error) within SMRT v4.14 (under the Method Settings… button on the Scenario Specification dialog). This is a free upgrade that may be downloaded from our website at www.sawtoothsoftware.com.

Maintaining Precision

We should also note that as the number of alternatives in the simulated choice scenario increases, the number of draws used in RFC should also be increased. Otherwise, the sampling error associated with the simulated shares of preference under RFC may be uncomfortably large relative to the signal associated with some relatively tiny product shares. By default, RFC simulations use 100,000 iterations (draws). As the shares of preference for some products drop to quite small numbers (especially below 5%), researchers should consider increasing the number of iterations significantly to avoid loss of precision.
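A rough way to judge how many draws are enough is to treat each draw as an independent first-choice outcome and use the binomial standard error. That independence assumption is a simplification (it ignores how draws are allocated across respondents), and the function names below are hypothetical, so treat the result as a guide rather than as SMRT's own precision calculation.

```python
import math

def share_standard_error(share, n_draws):
    """Binomial standard error of a simulated share, given the number of draws."""
    return math.sqrt(share * (1.0 - share) / n_draws)

def draws_for_relative_error(share, rel_error):
    """Draws needed so the standard error is rel_error times the share."""
    return math.ceil((1.0 - share) / (share * rel_error ** 2))

default_draws = 100_000
for share in (0.20, 0.05, 0.01):
    se = share_standard_error(share, default_draws)
    print(f"share={share:.2f}  se={se:.5f}  relative={se / share:.1%}")

# Draws needed to estimate a 1% share with a 1% relative standard error.
needed = draws_for_relative_error(0.01, 0.01)
print(needed)
```

The absolute standard error shrinks as the share gets smaller, but the error relative to the share grows, which is why tiny shares call for more draws.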

For Power Users

Analysts have another, more powerful opportunity to improve RFC modeling, though it involves extra data processing because the interface does not support it automatically. Some conjoint/choice designs involve many alternatives; beverages and automobiles are good examples. Suppose we had conducted a choice study with 200 automobile makes, including trucks, minivans, sedans, and coupes, and had treated the makes (for part-worth estimation) as independent levels of a 200-level attribute. We know that these 200 makes fall into four clear categories, and that makes within a category should compete more strongly with one another than with makes in other categories. One could add a new attribute with four levels (truck, minivan, sedan, and coupe), each level with a utility of zero, to which attribute-type error is applied under RFC in choice simulations. Because makes in the same category would then share the same error draw, their simulated utilities would be correlated, and they would draw share more heavily from one another.
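The zero-utility category attribute can be illustrated with a small simulation. This sketch assumes a single respondent, normal errors with equal standard deviations, and three hypothetical makes with equal utilities; none of these choices comes from SMRT itself.

```python
import random

def category_shares(makes, category_of, make_utility, n_draws=50000, sd=1.0, seed=1):
    """First-choice shares when a zero-utility category attribute carries
    shared error and each make carries its own independent error."""
    rng = random.Random(seed)
    wins = {m: 0 for m in makes}
    for _ in range(n_draws):
        category_error = {}  # one draw per category, shared by its makes
        best, best_u = None, float("-inf")
        for m in makes:
            c = category_of[m]
            if c not in category_error:
                category_error[c] = rng.gauss(0.0, sd)
            # The category contributes zero utility plus its shared error.
            u = make_utility[m] + rng.gauss(0.0, sd) + category_error[c]
            if u > best_u:
                best, best_u = m, u
        wins[best] += 1
    return {m: w / n_draws for m, w in wins.items()}

# Hypothetical makes with equal utilities, for illustration only.
category_of = {"truck_a": "truck", "truck_b": "truck", "sedan_a": "sedan"}
make_utility = {m: 0.0 for m in category_of}

before = category_shares(["truck_a", "sedan_a"], category_of, make_utility)
after = category_shares(["truck_a", "truck_b", "sedan_a"], category_of, make_utility)
print(before)
print(after)
```

Without the category error, the new truck would pull all three makes to one third each. With it, the sedan should keep roughly 38% while each truck ends up near 31%, so the second truck takes more of its share from the truck that was already there.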



Call for Papers: Sawtooth Software Conference 2007

We are pleased to announce that the 2007 Sawtooth Software Conference will be held October 15-19, 2007 at the Hyatt Vineyard Creek Hotel & Spa, in Santa Rosa, California. Our US-based research conference is held just once every 18 months. 170 people attended the last meeting.

Our research conference brings together some of the best minds in our industry to talk about practical issues in online interviewing and quantitative market research. It is not a sales-oriented event for our software, but a chance to exchange ideas and receive education from a variety of sources and perspectives. Papers presented at our previous Sawtooth Software Conferences are cited frequently in journal articles. We're looking for exceptionally strong papers. If you'd like to be on the program, please respond promptly (by March 16, email: bryan@sawtoothsoftware.com) with a one-page abstract describing your proposed paper, with special attention to the findings and what the audience will "take away" from the presentation. You must also include a 50-word description of your paper to include in the conference brochure, should your abstract be accepted.

We are interested in papers on a variety of subjects, including Web interviewing, market segmentation, scale development, customer satisfaction modeling, conjoint/choice analysis, MaxDiff, perceptual mapping, hierarchical Bayes methods, forecasting, pricing research, market simulations, and case studies. These papers need not involve Sawtooth Software's programs or approach.

In an effort to provide more balance to the program, we are encouraging papers that are not about conjoint/choice modeling. For all topics, we are eager to see evidence of managerial relevance, external validity, profit impact, etc.

Presenters receive a complimentary conference registration. To be accepted, a paper must show promise of being sufficiently practical to be of use to the least sophisticated members of the audience, while having enough substance to be of interest to the most sophisticated members. In addition to standard presentation slides, authors are required to submit a journal-quality written paper for publication in the Conference Proceedings.

We strive for the highest quality in our conferences. If your abstract is accepted, a member of the steering committee will review early drafts of your presentation and offer suggestions. Authors are expected to consider these suggestions conscientiously and rework their presentations as needed. Sawtooth Software reserves the right to remove from the program or proceedings any author who fails to meet deadlines or produce high-quality work.

Sawtooth Software Conference 2007 Steering Committee Members are:

  • Bryan Orme, Sawtooth Software
  • Karlan Witt, Cambia Research Group
  • John Wurst, SDR/University of Georgia
  • Ken Deal, McMaster University



Writing Better CBC Questionnaires: An Empirical Test

How can we motivate respondents to give reliable, thoughtful, and, most importantly, realistic responses to CBC (Choice-Based Conjoint) questionnaires? This is especially a concern as the research community fields a greater number of projects among respondents who complete multiple surveys per month. Internet surveys have accelerated the pace of research projects, and respondents seem to have accelerated their processing speed as well. There are certainly cost benefits to both trends, but overall quality may suffer.

Study Design

In August 2006, we fielded a CBC study over the internet using GMI (Global Market Insite) sample. The subject matter was laptop computers, and we screened respondents for interest in the category and basic knowledge about laptop features. The CBC questionnaire consisted of nine choice tasks designed for utility estimation and six holdout choice tasks (identical in appearance to the utility estimation tasks) interspersed among them: three holdouts appeared early in the questionnaire, and the same three were repeated very late in the questionnaire, with concept positions rotated.

After deleting respondents who were clearly speeding through the interview or providing clearly inconsistent responses (test-retest reliability check), we had 379 respondents for analysis.

Every respondent received an instructional screen prior to the first choice task. The core text (that all respondents received) for the instructional screen was:

We want to learn what aspects are important to you when purchasing a laptop (notebook) computer. To do this, we are going to ask you a series of tradeoff questions.

In each question, the computer running this survey will come up with three different laptop PCs to choose from. Sometimes, these notebooks will have very good features at very reasonable prices. Sometimes, they will have poor features at relatively high prices.

If you really don't like any of the options, you can say that you wouldn't purchase any of them. That's just fine.

We tested six questionnaire writing elements (all as binary factors) that we felt might have a positive effect on the quality of the CBC part-worths:

  1. Explain Reason for Repeated CBC Tasks. In the instruction screen just prior to the first CBC task, we either included or excluded the following additional paragraph:

    We need to ask you repeated questions to learn how you make sometimes complex tradeoffs. If we see how you select laptops in many different situations, we'll do a much better job at learning about your preferences. So, what may seem to you like repetition is very useful to our research study.

  2. Appeal to Respondent to Provide Realistic Answers. In the instruction screen just prior to the first CBC task, we either included or excluded the following additional paragraph:

    We're really depending on you to answer realistically and carefully. Please imagine you were actually shopping for a laptop and really were going to pay for a laptop you might choose during our survey today. Thank you for your effort!

  3. Progress Bar. A Progress Bar was shown (or not) in the footer of the questionnaire.

  4. Instructional Screen after First Choice Task. After the first CBC question, we either displayed (or not) a screen that said:

    Now, the computer running this survey is going to rearrange the brands and features of laptops. This next question is going to have the same layout as the previous one, but the combinations of features for the laptops will be different.

    Again, each time you answer a choice scenario, the computer will scramble the features for the next question.

  5. Countdown after Every Third Task. After each third task, we included (or not) a label like the following at the upper left-hand corner of the choice task:

    (3 of 15 choice scenarios)

  6. Rest Screen after 8th Task. After the 8th choice task, we either displayed (or not) a separate screen that said the following:

    Thank you for your work so far. We know that some of these tradeoffs are challenging. We hope you also find it interesting to consider what features are important when considering a laptop.

    The answers you give to these choice scenarios are very important to help us understand your opinions and preferences. Keep up the good work!

Findings:

We estimated individual-level part-worths using our CBC/HB system. Using multiple regression, we tested whether these binary factors had any effect on the following ten dependent variables:

  • Interview Length
  • Test-Retest Reliability for Holdouts
  • Internal Fit (RLH) of Calibration Tasks from HB
  • Number of Utility “Reversals” for Ordered Attribute Levels
  • Respondent’s Qualitative Feedback on the Questionnaire: 5-point Likert scales reporting whether the respondent found the questionnaire...
    • enjoyable
    • confusing
    • easy
    • made them feel like clicking answers just to get done
    • made them feel like they could express their opinions
  • Importance of Price

Only one effect was significant at p<0.01. Given the number of significance tests involved (10 regression models x 6 independent variables = 60 coefficients), this effect, and especially the few others that emerged at p<0.05, may have been due to chance alone.
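The scale of the multiple-testing concern can be checked with a short calculation. This sketch assumes the 60 tests are independent and that no true effects exist. Neither assumption holds exactly here (several of the dependent variables are likely correlated), so the numbers are only indicative. The function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of chance 'significant' results if no effects exist."""
    return n_tests * alpha

def prob_at_least_one(n_tests, alpha):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

n_tests = 10 * 6  # regression models x binary factors

for alpha in (0.01, 0.05):
    print(
        f"alpha={alpha}: expected={expected_false_positives(n_tests, alpha):.1f}, "
        f"P(at least one)={prob_at_least_one(n_tests, alpha):.3f}"
    )
```

Under these assumptions, there is roughly a 45% chance of seeing at least one effect at p<0.01 purely by chance, and about three chance effects would be expected at p<0.05.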

Here is the one significant effect at p<0.01: The average interview time was 302 seconds (just over 5 minutes). If the “Countdown” was provided on every 3rd task, respondents took an extra 60 seconds to complete the survey (p<0.01). Why they took extra time is an interesting question. Perhaps the Countdown kept respondents more engaged by giving them a clear indication of a finite number of choice tasks, so they could maintain focus and be less likely to become discouraged (not knowing if/when the end was in sight). Perhaps respondents then gave more attention to each choice task and put forth more effort? The effect of Countdown on test-retest reliability and reversals also suggested it improved data quality (but both effects were non-significant).

So, we are left with evidence suggesting that including the Countdown leads to respondents going slower, and that the Countdown potentially leads to better quality data.

Comments:

Why didn’t more of the experimental factors have a significant effect on the dependent variables we examined? For example, the Appeal to Respondent to Provide Realistic Answers might have led to an increase in price sensitivity, yet it didn’t have a significant effect on price importance (though the coefficient had the expected sign).

Perhaps respondents in online panels are so practiced (and quick) at doing survey research that they are predisposed to commit a certain level of cognitive effort and are not easily swayed by instructional text or visual cues meant to influence their behavior. This reminds us of the finding that price sensitivity increased with later tasks, as reported by Sawtooth Software’s Rich Johnson and Bryan Orme after examining about 20 commercial CBC datasets collected in the early- to mid-1990s. Jon Pinnell (MarketVision Research) has looked at this same issue multiple times using online panel data and has not confirmed this result. One hypothesis is that there were stronger learning effects among respondents to (mostly disk-by-mail) CBC interviews in the early 1990s than there are among online panelists post-2000. There is no doubt that we are dealing with a larger pool of respondents in these online panels who bring to each new survey the experience gained from frequent survey taking.

This research effort was just one modest experiment using a single sample source. Much more should be done in this area before reaching firm conclusions regarding how questionnaire writing elements affect CBC results. Based on our experience (and suggestions from others, such as Jordan Louviere, who has long advocated a task countdown), the six elements we tested above should lead to higher data quality (the exact wording of the elements might be improved upon, though we think our effort approximates best practice). The results from this single test suggest only one of the elements we tested (the “Countdown”) leads to better data quality, but this certainly should not be taken as the final word on this subject.

(Editor's Note, March 2007: A follow-up study analyzed in February, 2007 found that Countdown had no effect on interview time. Therefore, the finding reported here is questionable. Even so, we recommend using a Countdown in CBC surveys.)

Future Research

We at Sawtooth Software have become increasingly concerned about data quality in CBC surveys. We suspect that researchers are not getting as much (and the right kind of) information at the individual level from standard CBC surveys as they had supposed. Taking a complex CBC interview with many attributes can be a monotonous and mind-numbing experience, so perhaps it’s mostly our fault as researchers if data quality suffers. We’re investigating new ways to address these issues and hope to have useful results to report at the upcoming Sawtooth Software conference in October.



Feedback on Product Plans 2007

Thank you to all users who completed the questionnaire we launched at the beginning of the year. The survey asked respondents to evaluate 13 possible product upgrade ideas for 2007. We used your feedback to drive the discussion in our January strategy planning meeting. We designed the questionnaire to examine both the overall use of our different product components and the enthusiasm that users of each component express toward seeing suggested improvements. Your open-end comments were also extremely helpful, as they help new ideas surface and gain traction.

For your information, the ideas that seemed to capture the most interest among our users were:

  • Excel-based market simulator
  • Rewrite of market simulator within the SSI Web platform
  • An upgrade to CBC/HB

We also took the opportunity to ask users which of the more advanced features within CBC/Web and CiW they use. Here are the results:

Which of the following features in CBC/Web do you currently use (at least occasionally)?

66% Prohibitions
59% Conditional Pricing
54% Display attribute label at left of task
47% Alternative-Specific Design
46% Partial-Profile
29% Conditional Graphics
26% Custom CBC Layout using Free Format
18% Shelf-Display

Which of the following features in SSI Web do you currently use (at least occasionally)?

39% Constructed Lists
29% SSI Scripting
29% Quota Control
25% Free Format
19% SSI Web CAPI (standalone interviewing, not connected to web)
15% Double-byte character support (e.g. Chinese, Japanese, etc.)
14% Unverified Perl

© 2010 Sawtooth Software, Inc. All rights reserved.