I imagine you have been using the standalone "HB Model Explorer" tool, which runs as a series of Command Prompt windows that "take over your computer" for multiple hours. This Explorer tool uses jackknife sampling: it systematically holds out one or a few of your CBC tasks to validate the model and estimates the model on the remaining tasks.
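To make the jackknife idea concrete, here is a minimal sketch of the hold-one-task-out loop and the hit-rate metric. Everything here is illustrative: the data are simulated, the array shapes are arbitrary, and the per-respondent "estimator" is a crude stand-in for the much richer HB estimation the real tool performs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 respondents, 10 CBC tasks, 3 concepts per task,
# 5 part-worth parameters per concept (shapes and names are illustrative).
n_resp, n_tasks, n_concepts, n_params = 100, 10, 3, 5
X = rng.normal(size=(n_resp, n_tasks, n_concepts, n_params))  # coded designs
true_beta = rng.normal(size=(n_resp, n_params))               # true utilities
utils = np.einsum("rtcp,rp->rtc", X, true_beta)
choices = utils.argmax(axis=2)                                # observed picks

def hit_rate(est_beta, X, choices, task):
    """Fraction of respondents whose held-out choice is predicted correctly."""
    u = np.einsum("rcp,rp->rc", X[:, task], est_beta)
    return (u.argmax(axis=1) == choices[:, task]).mean()

# Jackknife-style loop: hold out each task in turn, estimate on the rest.
# The "estimator" here simply averages the chosen concepts' design codes,
# purely as a placeholder for real utility estimation.
rates = []
for held_out in range(n_tasks):
    keep = [t for t in range(n_tasks) if t != held_out]
    est = X[:, keep].reshape(n_resp, -1, n_params)[
        np.arange(n_resp)[:, None],
        np.arange(len(keep)) * n_concepts + choices[:, keep],
    ].mean(axis=1)
    rates.append(hit_rate(est, X, choices, held_out))

print(f"mean jackknife hit rate: {np.mean(rates):.3f}")
```

The point of the loop is that every task gets a turn as the validation task, so all of your collected choice data contribute to both estimation and validation, rather than sacrificing data to a small fixed holdout.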
You are saying that using this "HB Model Explorer" tool improves the holdout hit rate very little.
Now, unless you have a very large number of true holdout tasks (say, 7 or more), I doubt you will do as well as or better than the HB Model Explorer does by exploiting (via its jackknife holdout procedure) the choice tasks you've collected. The 2 holdout tasks the software suggests by default are rarely enough to provide validation as robust as the HB Model Explorer's jackknife approach.
However, if you told me that you had true out-of-sample holdout tasks, wherein an entire group of respondents (separate from the group used to estimate the HB utilities) completed a large number of fixed holdout tasks (held constant across those out-of-sample respondents), then I would say you should leverage those true out-of-sample holdouts and do more work to validate against them. But hardly anybody has the luxury of collecting an entire separate group of respondents to serve as holdout respondents.
And, to answer your other question: if you are using our software to estimate utilities and you have included interaction effects in that utility run, then when our simulator uses that run it will automatically use the interaction effects as well as the main effects in producing the share of preference predictions for your simulation scenario.
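As a toy illustration of what "automatically using the interactions" means, here is a sketch of the arithmetic a share-of-preference simulator performs: each product's total utility is the sum of its main-effect part-worths plus the applicable interaction part-worth, and the utilities are then run through the logit share rule. All part-worth values, attribute names, and levels below are hypothetical, not from any real study.

```python
import numpy as np

# Hypothetical part-worths for two attributes (Brand and Price)
# plus a Brand x Price interaction (all numbers invented for illustration).
main_brand = {"A": 0.40, "B": -0.40}
main_price = {"low": 0.60, "high": -0.60}
interaction = {("A", "low"): 0.20, ("A", "high"): -0.20,
               ("B", "low"): -0.20, ("B", "high"): 0.20}

def total_utility(brand, price):
    # Sum main effects plus the matching interaction term.
    return main_brand[brand] + main_price[price] + interaction[(brand, price)]

def shares(products):
    """Logit (share of preference) rule over the products in a scenario."""
    u = np.array([total_utility(b, p) for b, p in products])
    e = np.exp(u - u.max())          # subtract max for numerical stability
    return e / e.sum()

scenario = [("A", "low"), ("B", "low"), ("A", "high")]
print(shares(scenario))
```

Because the interaction term is simply added into each product's total utility before the shares are computed, any interactions present in the utility run flow through to the predicted shares without extra steps on your part.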