Regarding ACBC, measuring internal validity via the internal fit from HB is not as clear-cut as it is for CBC or CVA.

With CBC, respondents are always shown the same number of concepts per task. Let's say it's 4 concepts plus an additional None concept. That's 5 possible choices per question. CBC/HB reports root likelihood and percent certainty. With 5 concepts per task, the null root likelihood (the fit expected from purely random choices) is 1/5 = 0.2. Percent Certainty is a pseudo R-squared that reports how well the utilities fit the data relative to the null (naive) solution.
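The relationships above can be sketched in a few lines. This is a minimal illustration (the chosen-concept probabilities are hypothetical, not output from any real estimation): RLH is the geometric mean of the probabilities the model assigns to the chosen alternatives, and Percent Certainty measures how far the fitted log-likelihood moves from the null log-likelihood toward a perfect fit (log-likelihood of 0).

```python
import math

def rlh(probs):
    """Root likelihood: geometric mean of the probabilities
    the model assigns to the alternatives actually chosen."""
    ll = sum(math.log(p) for p in probs)
    return math.exp(ll / len(probs))

def percent_certainty(probs, n_alts_per_task):
    """Pseudo R-squared: 1 - LL_fit / LL_null, i.e. the fraction
    of the distance from chance to perfect fit that the model covers."""
    ll_fit = sum(math.log(p) for p in probs)
    ll_null = sum(math.log(1.0 / k) for k in n_alts_per_task)
    return 1.0 - ll_fit / ll_null

# CBC example: 5 alternatives per task (4 concepts + None),
# so the null RLH is 1/5 = 0.2.
chosen_probs = [0.55, 0.40, 0.60, 0.35]   # hypothetical fitted probabilities
print(rlh([0.2] * 4))                     # chance-level fit, ~0.2
print(percent_certainty(chosen_probs, [5, 5, 5, 5]))
```

Note that the null RLH of 0.2 only works because every task has the same number of alternatives; that assumption is exactly what breaks down for ACBC.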

With CVA under OLS estimation, you get an R-squared, which reports what percentage of the variance in the dependent variable is explained by the regression weights applied to the independent variables.
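For comparison, here is a minimal one-predictor OLS R-squared in pure Python (the ratings data are made up for illustration; real CVA would have coded attribute levels as predictors):

```python
# Hypothetical ratings (ys) against a single coded predictor (xs).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# OLS slope and intercept from the normal equations.
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# R-squared: 1 minus the ratio of residual to total sum of squares.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

Because OLS has a closed-form null (predicting the mean), R-squared is directly interpretable without any of the per-respondent complications that arise below for ACBC.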

But with ACBC, the logistic regression is a mixture of choice sets with differing numbers of concepts per task:

1. For the BYO section, if an attribute has 6 levels, then it is coded as a choice of a single level among six alternatives. If an attribute has 2 levels, it's coded as a choice among two alternatives. Further complicating matters, if you use constructed lists per attribute (to only bring the relevant levels forward to the BYO questions), then the number of concepts per BYO "task" can be different across people.

2. For the Screeners, each choice is a binary choice (a possibility or not a possibility).

3. For the Choice Tournament, each choice is typically a choice among three alternatives (though the software allows you to ask this section as pairs rather than triples). And, the number of choice tasks is customized to the respondent depending on how many concepts are marked "a possibility" from the Screener section.

And, the response error is different across the three sections.

So, you can see that internal fit statistics are no longer so clear to interpret for ACBC! The number of concepts per task and the number of tasks per section can vary depending on the respondent's previous answers, and this affects the scaling of the RLH.
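To make the scaling issue concrete, here is a hedged sketch of what a per-respondent null RLH would look like for ACBC's mixed tasks: the geometric mean of the chance probabilities 1/k across all tasks the respondent saw. The task sizes below are hypothetical (BYO choices among each attribute's levels, binary Screener judgments, triples in the Choice Tournament); a real computation would have to pull each respondent's actual task structure.

```python
import math

def null_rlh(task_sizes):
    """Null RLH for one respondent: geometric mean of the
    chance probabilities 1/k across their tasks."""
    ll_null = sum(math.log(1.0 / k) for k in task_sizes)
    return math.exp(ll_null / len(task_sizes))

byo = [6, 4, 2, 5]       # one BYO choice per attribute, among its levels
screeners = [2] * 8      # "a possibility" vs. "not a possibility"
tournament = [3] * 5     # choices among triples
print(null_rlh(byo + screeners + tournament))
```

Because the number of Screener and Tournament tasks (and even the BYO alternatives, with constructed lists) differ across people, each respondent has a different chance level, so a single overall null RLH for the study does not exist without this kind of per-respondent bookkeeping.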

- Regarding RLH: without considerable effort put into computing the null solution, it is hard to interpret for ACBC.

- Regarding Percent Certainty: could this be interpreted, since it is already relative to a null (naive) solution?

- Which Percent Certainty would I use? The average across all iterations?

Or would you even say that, given the high computational effort and the probably low additional information these measures provide, predictive validity is enough to assess the goodness of fit of my model (I am not comparing it to other models)?

Thanks in advance!

Best,

Felix