I would like to define a solid sample size formula for Express MaxDiff.

I'm wondering whether the following would be sufficient, or whether we need somewhat more (e.g. +20%) to ensure robust results, given the typical amount of missing data in the model:

General Population:

n = (total_number_of_items / number_of_items_per_respondent) * 200

Target Groups:

n = (total_number_of_items / number_of_items_per_respondent) * 150

Example:

total_number_of_items = 100

number_of_items_per_respondent = 40

number of exposures of each item per respondent = 3

number of items per task = 5

number of tasks per respondent = 24

sample size (GP) = 100 / 40 * 200 = 500

sample size (TG) = 100 / 40 * 150 = 375
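
To make the arithmetic easy to re-run with other designs, here is a minimal Python sketch of the rule proposed above. The function names and the `buffer_pct` parameter are my own (hypothetical) and only illustrate the optional +20% margin; the rule itself is the proposal I am asking about, not an established formula. The second helper only checks that the example design is internally consistent (items per respondent * exposures / items per task = tasks).

```python
import math
from fractions import Fraction


def proposed_sample_size(total_items, items_per_respondent, base_n, buffer_pct=0):
    """Proposed rule: n = (total_items / items_per_respondent) * base_n.

    base_n is 200 for General Population and 150 for Target Groups in the
    proposal; buffer_pct is an optional safety margin (e.g. 20 for +20%).
    Fractions keep the arithmetic exact before rounding up.
    """
    n = Fraction(total_items, items_per_respondent) * base_n
    n *= Fraction(100 + buffer_pct, 100)
    return math.ceil(n)


def tasks_per_respondent(items_per_respondent, exposures, items_per_task):
    """Tasks needed so that each of a respondent's items is shown `exposures` times."""
    return math.ceil(Fraction(items_per_respondent * exposures, items_per_task))


print(proposed_sample_size(100, 40, 200))  # GP: 500
print(proposed_sample_size(100, 40, 150))  # TG: 375
print(tasks_per_respondent(40, 3, 5))      # 24
```

Passing `buffer_pct=20` gives the +20% variant of the same figures.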

Kindly share your recommendation, ideally with a rationale or a link to a paper covering the details.

Regards

Piotr

Thank you very much for the prompt reply.

Our reason for using Express MaxDiff is that, on the one hand, we sometimes need to measure more than 60 items and, on the other, we don't want to lengthen the MaxDiff module excessively (which is a risk with Sparse MaxDiff). We are investigating whether we could implement Bandit MaxDiff for our needs, but for now we might need to use Express MaxDiff.

Based on what you wrote, I should modify the formula and the example I shared to obtain the minimum and preferable sample sizes as follows.

Minimum

n = 500 * number_of_exposures_per_respondent / (total_number_of_items / number_of_items_per_respondent)

Preferable

n = 1000 * number_of_exposures_per_respondent / (total_number_of_items / number_of_items_per_respondent)

The example

total_number_of_items = 100

number_of_items_per_respondent = 40

number of exposures of each item per respondent = 3

number of items per task = 5

number of tasks per respondent = 24

minimum sample size (GP) = 500 * 3 / (100 / 40) = 600

preferable sample size (GP) = 1000 * 3 / (100 / 40) = 1200

For TG I would assume 75% of the above values.
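
The revised calculation, including the 75% assumption for TG, can be sketched in Python as follows. This simply encodes the formula exactly as I wrote it above, so it reproduces my arithmetic and does not confirm that the formula itself is right. The function name `revised_sample_size` and the `tg_share` parameter are hypothetical names used only for this illustration.

```python
import math
from fractions import Fraction


def revised_sample_size(target, exposures, total_items, items_per_respondent,
                        tg_share=Fraction(3, 4)):
    """Formula as written in this post:
    n = target * exposures / (total_items / items_per_respondent).

    target is 500 (minimum) or 1000 (preferable); tg_share is the assumed
    75% factor for Target Groups. Returns (gp_n, tg_n), both rounded up.
    """
    ratio = Fraction(total_items, items_per_respondent)
    gp_n = math.ceil(Fraction(target * exposures) / ratio)
    tg_n = math.ceil(tg_share * gp_n)
    return gp_n, tg_n


print(revised_sample_size(500, 3, 100, 40))   # (600, 450)
print(revised_sample_size(1000, 3, 100, 40))  # (1200, 900)
```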

Please correct or confirm.

Thank you

Piotr