Within SMRT, I'm assuming you are using the default method to estimate the shares of preference (Randomized First Choice), right?
The easiest way to gauge how well your model fits the two holdout tasks is the Mean Absolute Error (MAE). For each concept, take the absolute difference between the predicted share and the actual share observed in the holdout. Sum those absolute misses across concepts and across holdout tasks, then divide by the total number of concepts to get the average miss.
For example, for task #1, concept #1, your absolute miss is 5.92 percentage points. Across all six concepts (three in each of the two holdout tasks), your average miss (MAE) is 4.24. In my experience, for holdouts that involve 3 concepts, this is a good result. That said, a lot depends on sample size.
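If it helps to see the arithmetic spelled out, here is a minimal Python sketch of the MAE calculation. The share values below are made-up illustrative numbers, not your data, and the function name `mean_absolute_error` is just my own label. Shares are in percent, one inner list per holdout task.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute miss across all concepts in all holdout tasks."""
    misses = [
        abs(p - a)
        for task_pred, task_act in zip(predicted, actual)
        for p, a in zip(task_pred, task_act)
    ]
    return sum(misses) / len(misses)


# Hypothetical shares (in %) for two holdout tasks with three concepts each.
predicted_shares = [[30.0, 45.0, 25.0], [50.0, 20.0, 30.0]]  # simulator output
actual_shares = [[35.0, 40.0, 25.0], [44.0, 26.0, 30.0]]     # observed holdout choices

mae = mean_absolute_error(predicted_shares, actual_shares)
print(f"MAE: {mae:.2f} percentage points")
```

You would paste in the simulated shares for the predicted values and the observed holdout percentages for the actual values.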
Hit rates are an individual-level measure of predictive validity. For each respondent, look at the simulation results and check whether the concept with the highest predicted probability is also the one the respondent actually chose in the holdout task. If it is, count a hit. If not, count a miss. The hit rate is the percentage of hits.
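The hit rate logic can be sketched the same way. Again, this is only an illustration under my own assumptions: the probabilities and choices are invented, `hit_rate` is a hypothetical helper, and I assume you have exported one row of predicted probabilities per respondent for a given holdout task, with concepts indexed from 0.

```python
def hit_rate(predicted_probs, choices):
    """Percentage of respondents whose top predicted concept matches their choice."""
    hits = 0
    for probs, chosen in zip(predicted_probs, choices):
        # Index of the concept with the highest predicted probability.
        top_concept = max(range(len(probs)), key=lambda i: probs[i])
        if top_concept == chosen:
            hits += 1
    return 100.0 * hits / len(choices)


# Hypothetical data: four respondents, one holdout task with three concepts.
predicted_probs = [
    [0.60, 0.30, 0.10],
    [0.20, 0.50, 0.30],
    [0.10, 0.20, 0.70],
    [0.40, 0.35, 0.25],
]
choices = [0, 2, 2, 0]  # concept each respondent actually picked

rate = hit_rate(predicted_probs, choices)
print(f"Hit rate: {rate:.1f}%")
```

With more than one holdout task, you could run this per task and average the results, or pool all respondent-by-task rows into one list.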