Dear Sawtooth Team,
I have created a CBC study that uses only pictures, with no verbal descriptions. My master's thesis applies CBC to an unusual field, namely student schedules. Because the many dependencies lead to many prohibitions, I allow the pictures to violate the attribute levels (which are defined identically for all days) to a small extent on some days, which might cause problems with the goodness of fit. I have 15 tasks (10 random, 2 reliability holdouts, 3 validity holdouts), and each task compares two concepts (no None option is available). I have 5 attributes with 2 levels each. When I run HB, I get the following average goodness-of-fit values:
Question 1: With two concepts per task, the chance-level RLH is 0.5, so I assume that 0.7 is acceptable. In one study, Orme showed that Pct-Cert. should be at least 0.6, so should I assume that there are problems with my study design?
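To make my reasoning in Question 1 concrete, here is a minimal sketch of how I understand RLH and Pct-Cert. to relate. It assumes that RLH is the geometric mean of the predicted choice probabilities and that Pct-Cert. is the improvement of the log-likelihood over a chance model, relative to a perfect model. The function names are my own and are not part of any Sawtooth software, and since the reported values are averages across respondents, the relation need not hold exactly for them.

```python
import math

def chance_rlh(n_concepts):
    """RLH of a model that picks among n_concepts at random."""
    return 1.0 / n_concepts

def pct_cert_from_rlh(rlh, n_concepts):
    """Percent certainty implied by a given RLH (assumed definitions).

    log(rlh) is the average log-likelihood per task; the chance model
    has log(1 / n_concepts) per task; a perfect model has 0.
    """
    ll_model = math.log(rlh)
    ll_chance = math.log(1.0 / n_concepts)
    return (ll_model - ll_chance) / (0.0 - ll_chance)

print(chance_rlh(2))                        # chance level for two concepts
print(round(pct_cert_from_rlh(0.7, 2), 3))  # Pct-Cert. implied by RLH = 0.7
```

Under these assumptions, an RLH of 0.7 with two concepts corresponds to a Pct-Cert. of roughly 0.49, which is why I am unsure whether 0.7 is really acceptable.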
I therefore removed the participants whom the holdout tasks identified as unreliable, and the values improved as follows:
Since this was still not satisfactory, I also eliminated participants who took less than 2 minutes (an appropriate threshold for my study, as only images are compared). The values improved as follows:
I repeated the same with a 3-minute threshold:
I also tested the orthogonality of my design with the preliminary counting test and the logit efficiency test and got optimal values. However, an aggregate logit test showed that two attributes have a standard error of about 0.6.
Question 2: Can I conclude from this that my images may not represent the levels well (this is certainly the case for some images because, as mentioned above, violations were allowed) and that my approach of allowing deviations is therefore problematic?
Question 3: What are optimal values for Avg. Variance and RMS? Are they as important as Pct-Certainty?
Question 4: In the next step, I will calculate the hit rate. Am I right to assume that, based on my findings, the hit rate might be low?
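For clarity, this is roughly how I plan to compute the hit rate: for each validity holdout, I sum the individual-level part-worths of each concept, predict the concept with the higher total utility, and count how often that matches the actual choice. The sketch below uses made-up utilities and choices for a single hypothetical respondent (5 attributes with 2 levels each, as in my study); the data layout and function names are my own assumptions, not a Sawtooth export format.

```python
def predict_choice(utilities, task):
    """Return the index of the concept with the highest total utility.

    utilities: one list of part-worths per attribute (one value per level).
    task: list of concepts; each concept is a list of level indices.
    Ties are resolved in favor of the first concept.
    """
    totals = [
        sum(utilities[attr][level] for attr, level in enumerate(concept))
        for concept in task
    ]
    return totals.index(max(totals))

def hit_rate(respondents):
    """Share of holdout tasks in which the predicted choice was chosen."""
    hits = 0
    total = 0
    for r in respondents:
        for task, chosen in zip(r["holdouts"], r["choices"]):
            hits += predict_choice(r["utilities"], task) == chosen
            total += 1
    return hits / total

# Hypothetical respondent: values invented for illustration only.
example = [{
    "utilities": [[0.8, -0.8], [0.3, -0.3], [-0.5, 0.5], [0.1, -0.1], [0.0, 0.0]],
    "holdouts": [
        [[0, 0, 1, 0, 0], [1, 1, 0, 1, 1]],
        [[1, 0, 0, 1, 0], [0, 1, 1, 0, 1]],
    ],
    "choices": [0, 0],  # index of the concept actually chosen in each holdout
}]

print(hit_rate(example))
```

With two concepts per task, a hit rate of 50% would be expected by chance alone, so I would compare my result against that baseline.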
Thank you very much!