# Randomized First Choice

The Randomized First Choice (RFC) method combines many of the desirable elements of the First Choice and Share of Preference models.  As the name implies, the method is based on the First Choice rule, and it can be made essentially immune to IIA (independence of irrelevant alternatives) difficulties.  As with the Share of Preference model, the overall scaling (flatness or steepness) of the shares of preference can be tuned.

RFC, suggested by Orme (1998) and later refined by Huber, Orme and Miller (1999), was shown to outperform all other Sawtooth Software simulation models in predicting holdout choice shares for a data set they examined.  The holdout choice sets for that study were designed specifically to include product concepts that differed greatly in terms of similarity within each set.

Rather than use the utilities as point estimates of preference, RFC recognizes that there is some degree of error around these points.  The RFC model adds unique random error (variability) to the utilities and computes shares of preference in the same manner as the First Choice method.  Each respondent is sampled many times to stabilize the share estimates.  The RFC model results in a correction for product similarity due to correlated sums of errors among products defined on many of the same attributes.  To illustrate RFC and how correlated errors added to product utilities can adjust for product similarity, consider the following example:

Assume two products: A and B.  Further assume that A and B are unique.  Consider the following product utilities for a given respondent:

| Product | Avg. Product Utility |
|---------|----------------------|
| A       | 10                   |
| B       | 30                   |

If we conduct a first choice simulation, product B captures 100% of the share:

| Product | Avg. Product Utility | Share of Choice |
|---------|----------------------|-----------------|
| A       | 10                   | 0%              |
| B       | 30                   | 100%            |

However, let's assume that random forces can come to bear on the decision for this respondent.  Perhaps he is in a hurry one day and doesn't take the time to make the decision that optimizes his utility.  Or, perhaps product B is temporarily out-of-stock.  Many random factors in the real world can keep our respondent from always choosing B.

We can simulate those random forces by adding random values to A and B.  If we choose large enough random numbers so that it becomes possible for A to be sometimes chosen over B, and simulate this respondent's choice a great many times (choosing new random numbers for each choice iteration), we might observe a distribution of choices as follows:

| Product | Avg. Product Utility | Share of Choice |
|---------|----------------------|-----------------|
| A       | 10                   | 25.0%           |
| B       | 30                   | 75.0%           |

(Note: the simulation results in this section are for illustration, to provide an intuitive example of RFC modeling.  For the purposes of this illustration, we assume shares of preference are proportional to product utilities.)

Next, assume that we add a new product to the mix (A'), identical in every way to A.  We again add random variability to the product utilities so that it is possible for A and A' to be sometimes chosen over B, given repeated simulations of product choice for our given respondent.  We might observe shares of preference for the three-product scenario as follows:

| Product | Avg. Product Utility | Share of Choice        |
|---------|----------------------|------------------------|
| A       | 10                   | 20.0%                  |
| A'      | 10                   | 20.0% (A + A' = 40.0%) |
| B       | 30                   | 60.0%                  |

Because unique (uncorrelated) random values are added to each product, A and A' have a much greater chance of being preferred to B than either one alone would have had.  (When a low random error value is added to A, A' often compensates with a high random error value).  As a simple analogy, you are more likely to win the lottery with two tickets than with one.
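The two- and three-product illustrations above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the error distribution (Gumbel) and its scale (`SCALE`, chosen so that A versus B lands near 25%/75%) are not given in the text for this example, and the helper names `gumbel` and `first_choice_shares` are hypothetical.

```python
import math
import random


def gumbel(rng, scale):
    """Draw one Gumbel (extreme value) error via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); rng.random() can return exactly 0.0
        u = rng.random()
    return -scale * math.log(-math.log(u))


def first_choice_shares(utilities, scale, n_draws=200_000, seed=1):
    """Add an independent error to every product and tally first choices."""
    rng = random.Random(seed)
    wins = dict.fromkeys(utilities, 0)
    for _ in range(n_draws):
        noisy = {name: u + gumbel(rng, scale) for name, u in utilities.items()}
        wins[max(noisy, key=noisy.get)] += 1
    return {name: count / n_draws for name, count in wins.items()}


# Assumed error scale, picked so that A vs. B lands near 25% / 75%.
SCALE = 20 / math.log(3)

two_products = first_choice_shares({"A": 10, "B": 30}, SCALE)
three_products = first_choice_shares({"A": 10, "A'": 10, "B": 30}, SCALE)

print(two_products)    # roughly A 0.25, B 0.75
print(three_products)  # roughly A 0.20, A' 0.20, B 0.60
```

Because every product receives its own draw, the duplicate A' takes share from B as well as from A, which is the share inflation described above.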

Given what we know about consumer behavior, it doesn't make sense that A alone captures 25.0% of the market, yet adding an identical product to the competitive scenario raises the net share for A and A' from 25.0% to 40.0% (the classic Red Bus/Blue Bus problem).  It doesn't seem right that the identical products A and A' should compete as strongly with one another as with B.

If, rather than adding uncorrelated random error to A and A' within each choice iteration, we add the same (correlated) error term to both A and A', but add a unique (uncorrelated) error term to B, the shares computed under the first choice rule would be as follows:

| Product | Avg. Product Utility | Share of Choice        |
|---------|----------------------|------------------------|
| A       | 10                   | 12.5%                  |
| A'      | 10                   | 12.5% (A + A' = 25.0%) |
| B       | 30                   | 75.0%                  |

(We have randomly broken the ties between A and A' when accumulating shares of choice).  Since the same random value is added to both A and A' in each repeated simulation of purchase choice, A and A' have less opportunity of being chosen over B as compared to the previous case when each received a unique error component (i.e. one lottery ticket vs. two). The final utility (utility estimate plus error) for A and A' is always identical within each repeated first choice simulation, and the inclusion of an identical copy of A therefore has no impact on the simulation result. The correlated error terms added to the product utilities have resulted in a correction for product similarity.
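The correlated-error case can be sketched the same way. The code below assumes Gumbel errors with a scale chosen so that A versus B comes out near 25%/75%; neither assumption comes from the text, and the function names are hypothetical. A single error draw is shared by A and A', B gets its own draw, and ties between A and A' are broken at random.

```python
import math
import random


def gumbel(rng, scale):
    """Draw one Gumbel (extreme value) error via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -scale * math.log(-math.log(u))


def shares_with_shared_error(n_draws=200_000, seed=2):
    """First-choice shares when A and A' receive the same error term."""
    rng = random.Random(seed)
    scale = 20 / math.log(3)  # assumed scale giving A vs. B near 25% / 75%
    wins = {"A": 0, "A'": 0, "B": 0}
    for _ in range(n_draws):
        u_a = 10 + gumbel(rng, scale)   # one draw, applied to both A and A'
        u_b = 30 + gumbel(rng, scale)   # B gets its own draw
        if u_b > u_a:
            wins["B"] += 1
        else:
            # A and A' have identical final utilities: break the tie at random.
            wins[rng.choice(["A", "A'"])] += 1
    return {name: count / n_draws for name, count in wins.items()}


correlated = shares_with_shared_error()
print(correlated)  # roughly A 0.125, A' 0.125, B 0.75
```

B keeps about the share it had against A alone, and A and A' split the remainder.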

Let's assume that each of the products in this example was described by five attributes.  Consider two new products (C and C') that are not identical but are very similar: they are defined in the same way on four out of five attributes.  If we add random variability to the part-worths (at the attribute level), four-fifths of the accumulated error between C and C' is the same, and only one-fifth is unique.  Those two products in an RFC simulation model would compete very strongly against one another relative to other less similar products included in the same simulation.  When C receives a particularly large positive error term added to its utility, chances are very good that C' also receives a large positive error term (since four-fifths of the error is identical) and a large overall utility.
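The four-fifths figure can be checked numerically. The sketch below assumes that each attribute level receives an independent, equal-variance normal error; the text does not specify the distribution, so treat that as an illustrative choice. It measures the correlation between the accumulated errors of C and C'.

```python
import math
import random

N_SHARED = 4       # attributes on which C and C' have the same level
N_DRAWS = 100_000
rng = random.Random(3)

err_c, err_c_prime = [], []
for _ in range(N_DRAWS):
    # Errors on the four shared levels are common to both products.
    shared = sum(rng.gauss(0, 1) for _ in range(N_SHARED))
    err_c.append(shared + rng.gauss(0, 1))        # C's level on attribute 5
    err_c_prime.append(shared + rng.gauss(0, 1))  # C' differs on attribute 5


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


error_correlation = pearson(err_c, err_c_prime)
print(error_correlation)  # close to 0.8, i.e. four-fifths
```

Under these assumptions the correlation equals the fraction of shared attribute levels, which is why products that overlap more compete more directly with each other.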

## RFC Model Defined

We can add random variability at both the attribute and product level to simulate any similarity correction between the IIA model and a model that splits shares for identical products:

$$U_i = X_i(\beta + E_{\text{attribute}}) + E_{\text{product}}$$

where:

- $U_i$ = utility of product $i$ for an individual or homogeneous segment at a moment in time
- $X_i$ = row of the design matrix associated with product $i$
- $\beta$ = vector of part-worths
- $E_{\text{attribute}}$ = variability added to the part-worths (same for all products)
- $E_{\text{product}}$ = variability (Gumbel) added to product $i$ (unique for each product)

Repeated draws are made to achieve stability in share estimates, computed under the First Choice rule.  In RFC, the more variability added to the part-worths, the flatter the simulations become.  The less variability added to the part-worths, the steeper the simulations become.

Under every possible amount of attribute variability (with no product variability applied and all attributes set to receive correlated error), shares are split exactly for identical products, resulting in no "inflation" of net share.  However, there may be many market scenarios in which some share inflation is justified for similar products.  A second unique variability term (distributed as Gumbel) added to each product utility sum can tune the amount of share inflation, and also has an impact on the flatness or steepness of the overall share results.

It can be shown that adding only product variability (distributed as Gumbel) within the RFC model is identical to the familiar logit model (Share of Preference model), given enough draws (iterations).  Therefore, any degree of scaling or pattern of correction for product similarity ranging between the First Choice model and Share of Preference can be specified with an RFC model by tuning the relative contribution of the attribute and product variability.
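The model definition can be turned into a minimal simulation sketch. This is not the market simulator's implementation. The normal distribution for the attribute error, the toy design matrix, the part-worth values, and the function name `rfc_shares` are all assumptions made for illustration; only the Gumbel product error follows the definition above.

```python
import math
import random


def gumbel(rng, scale):
    """Draw one Gumbel (extreme value) error via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -scale * math.log(-math.log(u))


def rfc_shares(X, beta, attr_sd, prod_scale, n_draws=100_000, seed=4):
    """First-choice shares for U_i = X_i (beta + E_attribute) + E_product."""
    rng = random.Random(seed)
    wins = [0] * len(X)
    for _ in range(n_draws):
        # One attribute-error vector per draw, shared by every product.
        noisy_beta = [b + rng.gauss(0, attr_sd) for b in beta]
        utils = []
        for row in X:
            u = sum(x * b for x, b in zip(row, noisy_beta))
            if prod_scale > 0:
                u += gumbel(rng, prod_scale)  # unique error for this product
            utils.append(u)
        best = max(utils)
        tied = [i for i, u in enumerate(utils) if u == best]
        wins[rng.choice(tied)] += 1  # identical products tie; break at random
    return [w / n_draws for w in wins]


# Toy setup: two attributes, two levels each, one-hot columns [a1, a2, b1, b2].
beta = [0.0, 1.0, 0.0, 0.5]
A = [1, 0, 1, 0]
B = [0, 1, 0, 1]

base = rfc_shares([A, B], beta, attr_sd=1.0, prod_scale=0.0)
split = rfc_shares([A, A, B], beta, attr_sd=1.0, prod_scale=0.0)
inflated = rfc_shares([A, A, B], beta, attr_sd=1.0, prod_scale=1.0)

print("A alone:", base[0])
print("A + A', attribute error only:", split[0] + split[1])
print("A + A', with product error:", inflated[0] + inflated[1])
```

With attribute variability only, the net share of the two identical products should match the share of A alone, up to sampling noise. Adding product variability should raise it, which illustrates how the two error terms trade off between share splitting and share inflation.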

The exponent can also play a role in RFC, one that is similar, but not identical, to that of product variability.  Decreasing the exponent (multiplying the utility estimates by a value less than unity) decreases the variance of the utility estimates relative to the variance of the random variation added within RFC simulations, in turn making simulated shares flatter.  There is a subtle difference between increasing product variability and lowering the exponent, though both result in a flattening of shares.  If only attribute variation is being used in an RFC simulation, decreasing the exponent flattens the shares, but the overall model still does not reflect the IIA property.  Adding product variability, however, flattens the shares and causes the RFC model to reflect at least some degree of IIA behavior.  Though the exponent is not required to simulate different patterns of correction for product similarity and scaling, in the next section we show that it is useful to retain the exponent adjustment from an operational point of view.
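The flattening effect of the exponent can be shown in closed form for a simple case. The sketch below is a worked example under assumptions not found in the text: two products that differ on every level, attribute error only, and normally distributed errors. The utility gap of 1.5, the error standard deviation, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_of_a(exponent, utility_gap=1.5, attr_sd=1.0, n_differing_levels=4):
    """P(A beats B) with attribute error only, for a two-product toy case.

    The exponent multiplies the utility gap; the error difference between the
    products is the sum of n_differing_levels independent normal draws.
    """
    error_sd = attr_sd * math.sqrt(n_differing_levels)
    return normal_cdf(-exponent * utility_gap / error_sd)


for exponent in (1.0, 0.5, 0.25):
    # The share of the weaker product A moves toward 50% as the exponent shrinks.
    print(exponent, round(share_of_a(exponent), 3))
```

Lowering the exponent shrinks the utility gap relative to a fixed amount of error, so shares flatten. Because the error here is still attribute-level only, identical products would continue to split share exactly.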

The RFC model is very computationally intensive.  With the suggested minimum of 250,000 total sampling iterations for a conjoint data set, the results tend to be fairly precise.  However, if you have dozens of products in the simulation scenario, some product shares can become quite small, and greater precision is needed.  The market simulator makes some automatic adjustments to increase the number of sampling iterations as you increase the number of products in your simulation scenario.  You can also increase the precision by manually increasing the number of sampling iterations.

The RFC model is appropriate for all types of conjoint simulations, based on either aggregate- or individual-level utilities.  It provides the greatest benefits in the case of aggregate (logit and Latent Class) models, which are more susceptible to IIA difficulties than individual-level models.

If you plan to use the RFC model with individual-level utilities to compute reasonably stable estimates of share at the individual level, you should sample each respondent at least 5,000 times, and preferably more.  This can take a good deal of computing time.
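A rough sense of how precision depends on the number of sampling iterations comes from the binomial standard error of a simulated share. This is an approximation that treats every draw as independent, not a description of how the market simulator computes precision, and the function name is hypothetical.

```python
import math


def share_standard_error(share, n_draws):
    """Binomial standard error of a share estimated from n_draws draws."""
    return math.sqrt(share * (1.0 - share) / n_draws)


# Aggregate simulation with 250,000 total iterations.
for share in (0.25, 0.01):
    se = share_standard_error(share, 250_000)
    print(f"share={share:.2f}  se={se:.5f}  relative={se / share:.1%}")

# One respondent sampled 5,000 times, worst case of a 50% share.
per_respondent_se = share_standard_error(0.5, 5_000)
print(f"per-respondent se={per_respondent_se:.4f}")
```

The absolute error shrinks for small shares, but the error relative to the share grows, which is why scenarios with many small-share products call for more iterations.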

The greatest operational complexity of the RFC model is that, to maintain comparable scaling of shares of preference, the magnitude of the attribute variability multiplier must be adjusted whenever the number of products, or the number of attributes on which products differ, changes across simulations.  Our implementation of the RFC model includes an auto-calibrating attribute variability multiplier (developed through Monte Carlo simulations) so that this issue is transparent to the user.