Identifying "Bad" Respondents

Fit Statistic and Identifying Random Responders

MaxDiff displays an average "fit statistic" to the screen during HB estimation and also writes to file the individualized fit statistic along with each respondent's scores on the items. This fit statistic ranges from a low of 0 to a high of 1000. It characterizes the internal consistency for each respondent. In technical terms, the fit statistic is the Root Likelihood (RLH) times 1000.
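As a rough illustration of what this statistic measures, the sketch below computes RLH as the geometric mean of the probabilities that a set of scores assigns to the choices a respondent actually made, then multiplies by 1000. The function name and its input are hypothetical and are not part of the MaxDiff software; the sketch assumes the predicted probabilities have already been obtained from the estimated scores.

```python
import math

def fit_statistic(choice_probs):
    """Return RLH x 1000 for one respondent (illustrative helper, not a MaxDiff API).

    choice_probs: predicted probabilities, each in (0, 1], of the alternatives
    the respondent actually chose, one entry per choice made.
    """
    if not choice_probs:
        raise ValueError("at least one choice probability is required")
    # Geometric mean computed in log space to avoid underflow with many choices.
    mean_log = sum(math.log(p) for p in choice_probs) / len(choice_probs)
    return 1000.0 * math.exp(mean_log)
```

For example, a respondent whose every choice is predicted with probability 0.25 would receive a fit statistic of about 250.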

It is natural to ask what minimum fit statistic distinguishes thoughtful responses from purely random data. The fit statistic can help you determine whether a respondent has answered randomly and should be discarded.

The fit statistic largely depends on how many items were shown in each set. If four items are shown in each set, then any random set of scores should be able to predict respondents' answers correctly with 25% likelihood (fit statistic=250). If two items are shown per set (paired comparison approach), then the likelihood given random data is 50% (fit statistic=500). Thus, in general, the chance-level fit statistic is 1/c × 1000, where c is the number of items displayed per set, and this is the lowest value we should expect. We hope that respondents don't provide random answers and that the estimated scores perform considerably better than chance. However, the score estimation algorithm used in MaxDiff (HB) attempts to fit the respondent's choices, even if they are just random. Thus, the fit we actually observe, even from random data, is usually above the chance rate.
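The chance-level baseline described above can be written as a one-line calculation. This is a minimal sketch; the function name is hypothetical.

```python
def chance_fit(items_per_set):
    """Chance-level fit statistic: (1 / c) x 1000 for c items shown per set."""
    if items_per_set < 2:
        raise ValueError("a set needs at least two items")
    return 1000.0 / items_per_set

# Baselines for the set sizes discussed in this section.
for c in range(2, 7):
    print(c, round(chance_fit(c), 1))
```

Because HB fits even random choices, these values are a floor and not a usable cutoff on their own.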

The table below displays a suggested minimum fit statistic assuming we want to achieve 95% correct classification of random responders. In developing this table, we assumed a 20-item study wherein respondents are shown each item either two or four times across all sets.
 
Suggested Minimum Fit Statistic to Identify Random Responders with 95% Correct Classification

| Items per Set | Suggested Minimum Fit (Each Item Shown Four Times to Each Respondent) | Suggested Minimum Fit (Each Item Shown Twice to Each Respondent) |
| 2 | 574 | 634 |
| 3 | 371 | 402 |
| 4 | 282 | 305 |
| 5 | 227 | 247 |
| 6 | 191 | 208 |
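One way to apply the table is to look up the suggested minimum for a study design and flag respondents whose fit falls below it. The sketch below is illustrative only: the dictionary and function names are hypothetical, the cutoffs are copied from the table above, and they assume a design similar to the one used to build it (about 20 items, with both "best" and "worst" answers collected).

```python
# Cutoffs copied from the table: (items per set, times each item shown) -> minimum fit.
SUGGESTED_MIN_FIT = {
    (2, 4): 574, (2, 2): 634,
    (3, 4): 371, (3, 2): 402,
    (4, 4): 282, (4, 2): 305,
    (5, 4): 227, (5, 2): 247,
    (6, 4): 191, (6, 2): 208,
}

def flag_random_responders(fits, items_per_set, times_shown):
    """Return IDs of respondents whose fit statistic is below the suggested minimum.

    fits: dict mapping respondent ID -> fit statistic (RLH x 1000).
    """
    cutoff = SUGGESTED_MIN_FIT[(items_per_set, times_shown)]
    return [rid for rid, fit in fits.items() if fit < cutoff]

# Example with made-up fit statistics for a 4-items-per-set, 4-exposure design.
suspects = flag_random_responders({"r1": 612, "r2": 250, "r3": 282}, 4, 4)
print(suspects)
```

A respondent whose fit equals the suggested minimum is kept here; whether to treat the boundary as inclusive is a judgment call.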
 

Technical Notes: We simulated 1000 respondents answering randomly to each questionnaire, then estimated the scores using HB: prior variance=2, d.f.=5. Both "best" and "worst" answers were simulated for each respondent (with the exception of the two-item case). For each simulated data set, we sorted the respondents' fit statistics and recorded the 95th percentile fit, the value below which 95% of the random responders fall. If only "bests" are asked, the amount of information for each respondent is reduced, so the Suggested Minimum Fit would be higher.
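The percentile step in these notes can be sketched as follows. The sketch assumes the fit statistics for the simulated random responders have already been produced by HB estimation (not shown here) and uses a nearest-rank percentile; the exact percentile convention used to build the table is not stated, and the function name is hypothetical.

```python
def percentile_cutoff(random_fits, pct=95):
    """Nearest-rank percentile of fit statistics from simulated random responders.

    random_fits: fit statistics (RLH x 1000) from respondents known to be random.
    pct: integer percentile; about pct% of the values are at or below the result.
    """
    if not random_fits:
        raise ValueError("need at least one fit statistic")
    ordered = sorted(random_fits)
    # Ceiling of pct * n / 100 in integer arithmetic, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]
```

Using this value as the minimum fit means that roughly 95% of purely random responders would fall at or below the cutoff.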