
Dual-Response MaxDiff for Anchored Scaling

MaxDiff (best-worst scaling) has been successful and popular among Sawtooth Software users. It is a powerful and flexible scaling technique for scoring multiple items. In each question, respondents are typically shown four or five items and are asked to mark which item is best and which is worst (or most important/least important). It takes longer to complete a MaxDiff questionnaire than a ratings grid on the same items. But, the results typically are worth the extra effort.

Despite all the benefits of MaxDiff, an issue that sometimes leads to concern is that the scores are placed on a relative (ipsative) scale rather than an absolute scale. For each respondent, we obtain a priority ordering of items on an interval scale. But, each question only allows respondents to say which item is best and which is worst. Respondents cannot tell us that most or all of the items are really good, or that most or all of the items are really bad. This means that the scale is not anchored to any meaningful reference point, such as a point indicating zero preference or a threshold marking the boundary between good and bad items.

Some fairly sophisticated models that fuse traditional ratings data with MaxDiff choices have been proposed and presented at our conferences. These are challenging to implement, and they also rely on the problematic ratings question (such as the 5-point scale).

The inventor of MaxDiff, Jordan Louviere, made a suggestion at the last Sawtooth Software Conference regarding the relative scaling quandary. We followed up with him regarding the details, and validated his approach with a methodological study (“Anchored Scaling in MaxDiff Using Dual-Response,” available in our Technical Papers Library at www.sawtoothsoftware.com).

Jordan’s idea is very straightforward. Rather than trying to resolve the relative scaling issue with a sophisticated model fusing rating scale data with choice data, he simply adds another (a “dual”) choice question to each MaxDiff task.

After asking which item is most and least important, Jordan suggests that we ask another question (directly below the standard MaxDiff question):

Considering just these 4 features...
        O   All 4 are important
        O   None of these 4 are important
        O   Some are important, some are not

Our research suggests that this dual-response add-on question takes only about 3 additional seconds, on average, per MaxDiff question.

The dual-response gives us the data to establish an anchor point for the scaling of items: the utility threshold between items deemed important vs. not important. For example, a respondent can express that most of the items are unimportant or most of the items are important. The “zero-point” in the resulting scores indicates the boundary between unimportant and important items. Items judged not important carry negative scores, and those that are important carry positive scores.
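
To make the anchoring mechanics concrete, here is a minimal sketch (in Python, using hypothetical names; the exact coding rules are spelled out in the white paper) of how each dual-response answer can be translated into paired comparisons against an "anchor" pseudo-item whose utility is fixed at zero:

    # Hypothetical illustration only: translate one dual-response answer into
    # (winner, loser) comparisons against an "anchor" pseudo-item that marks
    # the boundary between important and unimportant items.
    def anchor_comparisons(shown_items, best, worst, dual_answer):
        ANCHOR = "ANCHOR"  # pseudo-item; its utility is fixed at zero
        if dual_answer == "all":
            # Every shown item is judged important: each one beats the anchor.
            return [(item, ANCHOR) for item in shown_items]
        if dual_answer == "none":
            # No shown item is important: the anchor beats each of them.
            return [(ANCHOR, item) for item in shown_items]
        # "Some are important, some are not": at minimum, the best item
        # clears the threshold and the worst item falls below it.
        return [(best, ANCHOR), (ANCHOR, worst)]

Under a coding like this, the anchor behaves like an extra item held at zero utility, so items estimated above it come out with positive (important) scores and items below it with negative (unimportant) scores.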

With existing SSI Web software, it’s very easy to add a dual-response question to MaxDiff tasks. But, analysis requires modifying the data file prior to submitting the data for Latent Class or HB estimation. The details for doing this are described in the white paper we referenced earlier in this article.
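
As a rough illustration of the kind of data-file expansion involved (a sketch only; it assumes a simplified CSV layout with hypothetical column names and reuses the anchor_comparisons helper from the sketch above, whereas the actual SSI Web file format and coding rules are documented in the white paper):

    # Sketch: append anchor comparison records to each MaxDiff task before
    # Latent Class / HB estimation. Column names and layout are hypothetical.
    import csv

    def expand_with_anchor(in_path, out_path):
        with open(in_path, newline="") as f_in, open(out_path, "w", newline="") as f_out:
            reader = csv.DictReader(f_in)
            fields = reader.fieldnames + ["anchor_winner", "anchor_loser"]
            writer = csv.DictWriter(f_out, fieldnames=fields)
            writer.writeheader()
            for row in reader:
                writer.writerow(row)  # keep the original best/worst record
                shown = row["shown_items"].split("|")  # e.g. "3|7|12|15"
                for winner, loser in anchor_comparisons(
                        shown, row["best"], row["worst"], row["dual_answer"]):
                    extra = dict(row)
                    extra["anchor_winner"], extra["anchor_loser"] = winner, loser
                    writer.writerow(extra)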

We are generally pleased with this approach, and plan to offer it as an option in a future version of MaxDiff software.