
# Sparse Designs and Large Samples

With typical CVA studies, one assumes OLS estimation at the individual level and therefore must ask each respondent to complete at least as many questions as there are parameters to be estimated, equal to (N-n+1), where N is the total number of levels in the study and n is the total number of attributes.  However, with HB estimation, it is possible to achieve better results than OLS with the same number of (or even fewer) questions.  Because HB borrows information across the sample, it stabilizes parameters (part-worth utility estimates) even when the data are very sparse at the individual level (perhaps involving fewer questions than parameters to estimate).

Sometimes, researchers have the luxury of very large sample sizes (such as 1000+), and may be willing to sacrifice precision at the individual level to dramatically shorten the survey.  For example, if a researcher were comfortable using a standard CVA questionnaire with 18 conjoint tasks and 500 respondents, it would seem reasonable that quite similar overall results could be obtained using 1000 respondents who each answered 9 tasks, or 1500 respondents who each answered 6 tasks.  In all three cases, there are 9,000 answers to conjoint tasks, and sampling error is reduced as the sample size increases.  And HB has proven to do a creditable job of estimating useful part-worth utilities for conjoint studies with as few as 6 tasks per person.
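The tradeoff above can be checked with a quick sketch (an illustration in Python, using the respondent and task counts from the example):

```python
# Total conjoint answers collected under each design: tasks per respondent x sample size.
designs = [(18, 500), (9, 1000), (6, 1500)]

for tasks, n in designs:
    print(f"{tasks} tasks x {n} respondents = {tasks * n} total answers")
# Each design yields the same 9,000 total answers; the larger samples
# trade individual-level precision for reduced sampling error.
```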

CVA allows you to randomly skip a subset of the conjoint questions for each respondent.  Of course, if the number of tasks answered by the respondent is fewer than the number of parameters to be estimated, then OLS estimation cannot be used. We only recommend this approach for situations where you have very large sample sizes and where market simulation results are the key output, rather than individual-level classification.  If you want to purposefully use sparse designs (within each individual), we suggest developing the questionnaire using standard rules (generate 2x to 3x as many tasks as parameters to estimate, and multiple design versions).

Aggregate Estimation via OLS:  The simplest method of analysis is to compute the mean ratings for the conjoint questions across the population (or subgroups of the population).  Then, use the paper-and-pencil methodology to import those means from a text (.csv) file into CVA for OLS estimation.  Rather than having each record in the .csv file represent a unique person, format a single record to represent the total population (or, for subgroup analysis, one record for each subgroup to be analyzed).  The rating for each card is the mean rating across the population (the paper-and-pencil import supports decimal places of precision).
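The aggregation step can be sketched as follows (a Python illustration only; the column layout and filename here are hypothetical, not CVA's required paper-and-pencil format):

```python
import csv
from statistics import mean

# Illustrative individual-level data: each inner list is one respondent's
# ratings of the same four conjoint cards.
ratings = [
    [7, 3, 9, 5],
    [6, 4, 8, 6],
    [8, 2, 9, 4],
]

# Collapse to a single "respondent" whose rating on each card is the
# population mean (decimal precision is preserved).
card_means = [mean(card) for card in zip(*ratings)]

# Write one record representing the total population.
with open("aggregate.csv", "w", newline="") as f:
    csv.writer(f).writerow(["TOTAL"] + [round(m, 4) for m in card_means])
```

For subgroup analysis, the same pattern applies with one row per subgroup instead of a single TOTAL row.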

HB Estimation: HB estimation can be used even with sparse data.  If the data are quite sparse at the individual level relative to the number of parameters to be estimated, we would recommend decreasing the Prior Variance assumption (and potentially increasing the strength of the prior assumptions by increasing the Degrees of Freedom) in the Advanced Settings so that much more "shrinkage" to the population mean occurs.  Otherwise, individual respondents may display wild deviations from reasonable individual-level utilities.  You may also need to significantly increase the number of initial and used iterations to obtain convergence.
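The effect of lowering the Prior Variance can be pictured with a stylized shrinkage formula (a Python illustration of the shrinkage idea under a normal-normal model, not CVA's actual HB computation): an individual's estimate is pulled toward the population mean, and the pull is stronger when the prior variance is smaller.

```python
def shrink(individual_est, pop_mean, data_var, prior_var):
    """Stylized normal-normal shrinkage: the posterior mean is a
    precision-weighted average of the individual estimate and the
    population mean.  Smaller prior_var means more shrinkage."""
    w = (1 / prior_var) / (1 / prior_var + 1 / data_var)
    return w * pop_mean + (1 - w) * individual_est

# An individual whose raw estimate (2.0) deviates far from the population mean (0.5):
print(shrink(2.0, 0.5, data_var=1.0, prior_var=2.0))  # mild shrinkage -> 1.5
print(shrink(2.0, 0.5, data_var=1.0, prior_var=0.2))  # strong shrinkage -> 0.75
```

With sparse individual-level data, this stronger pull toward the population mean is what prevents the "wild deviations" described above.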

Example:

Assume each version of your questionnaire has 21 CVA questions.  Further assume you wanted each respondent to complete a random 6 out of 21 tasks (you plan to use HB estimation).

First, make sure each of your CVA questions (CVA1_1 to CVA1_21) is located on its own page.

Randomize the pages by clicking Randomize... | Pages and selecting your first and last CVA questions (CVA1_1 and CVA1_21) as the range.

To randomly skip a subset of questions, add a new HTML/Filler question directly beneath CVA1_1 (on the same page as CVA1_1).  Call that new question something like Skip1.  Within Skip1, click the Skip Logic tab, add a skip (Post Skip, with the destination being the question directly following the last CVA question), and specify the following skip logic:

```perl
Begin Unverified Perl
if (
   SHOWN("CVA1_1")  + SHOWN("CVA1_2")  + SHOWN("CVA1_3")  +
   SHOWN("CVA1_4")  + SHOWN("CVA1_5")  + SHOWN("CVA1_6")  +
   SHOWN("CVA1_7")  + SHOWN("CVA1_8")  + SHOWN("CVA1_9")  +
   SHOWN("CVA1_10") + SHOWN("CVA1_11") + SHOWN("CVA1_12") +
   SHOWN("CVA1_13") + SHOWN("CVA1_14") + SHOWN("CVA1_15") +
   SHOWN("CVA1_16") + SHOWN("CVA1_17") + SHOWN("CVA1_18") +
   SHOWN("CVA1_19") + SHOWN("CVA1_20") + SHOWN("CVA1_21") >= 6
)
{
   return 1;
}
else
{
   return 0;
}
End Unverified
```

Finally, copy the Skip1 question directly beneath CVA1_2, CVA1_3, etc.  It will automatically be renamed Skip2, Skip3, etc.  Test your survey to ensure it works properly.
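The combined effect of page randomization plus the skip logic can be sanity-checked outside the survey tool.  This Python sketch (a stand-in for the survey engine, not Sawtooth code) shuffles the 21 task indices and stops showing tasks once 6 have been shown, mirroring the SHOWN(...) >= 6 test:

```python
import random

NUM_TASKS = 21   # CVA1_1 .. CVA1_21
QUOTA = 6        # tasks each respondent should answer

def tasks_answered(rng):
    """Simulate one respondent: pages arrive in random order, and the
    Skip question fires as soon as QUOTA tasks have been shown."""
    order = list(range(1, NUM_TASKS + 1))
    rng.shuffle(order)                  # Randomize... | Pages
    shown = []
    for task in order:
        shown.append(task)              # respondent sees and answers this task
        if len(shown) >= QUOTA:         # SHOWN("CVA1_1") + ... >= 6
            break                       # Post Skip jumps past the remaining tasks
    return shown

rng = random.Random(7)
for _ in range(3):
    print(sorted(tasks_answered(rng)))  # a different random 6-of-21 each time
```

Because the pages are in random order, taking the first 6 shown is equivalent to drawing a random 6 of the 21 tasks for each respondent.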