This question is actually a bit complicated, so I'll try to address the main issues.
If there were only 20 people in total in your marketplace and you could interview all 20, then you would get quite good results with MaxDiff for making inferences about that population of 20 people. (This assumes you showed each item to each respondent about 3 or 4 times, to reduce measurement error at the individual level when dealing with such tiny samples.) If you took these same data and looked at the standard errors of the estimates, you would think the precision was pretty poor, but that's only because you have just 20 total respondents, due to the micro size of the market. The standard errors treat those respondents as a sample from a much larger population rather than as the entire market.
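To put a rough number on that point, here is a minimal sketch using the textbook finite population correction for a simple mean. It is not what SSI Web computes for MaxDiff logit scores; the function name and the example values are just illustrative assumptions. It only shows how the sampling error shrinks to zero as the respondents approach the whole market.

```python
import math


def fpc_standard_error(sample_sd, n, population_size):
    """Standard error of a sample mean with the finite population correction.

    Illustrative only: a simple mean, not an aggregate logit estimate.
    """
    if n <= 0 or n > population_size:
        raise ValueError("n must be between 1 and population_size")
    naive_se = sample_sd / math.sqrt(n)  # assumes an effectively infinite population
    if population_size == 1:
        return 0.0  # one person, fully observed
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return naive_se * fpc


# Same 20 respondents, two different assumptions about the market size.
se_census = fpc_standard_error(sample_sd=1.0, n=20, population_size=20)
se_large_market = fpc_standard_error(sample_sd=1.0, n=20, population_size=1_000_000)
print(se_census, se_large_market)
```

With everyone in the market interviewed, the correction drives the sampling error to zero, and what remains is the individual-level measurement error that the repeated item exposures are meant to reduce.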
Regarding your specific issue, and considering the merits of 400 respondents: you can do statistical testing in the latest versions of SSI Web pretty easily, which gives you aggregate logit standard errors (similar to the way you test a CBC design using Advanced Design Test with aggregate logit on dummy respondent data). For MaxDiff, just use SSI Web's ability to generate random respondent data (Test + Generate Data...).
Then, download the test data as if it were real respondent data (Test + Download data). Next, analyze the results using Analysis + Estimate MaxDiff Scores - Other Methods... and choose Logit. The third section in that output report (Raw Scores) gives you the standard errors. If you subscribe to the same rules of thumb that we teach in our training regarding CBC analysis, you want the standard errors from aggregate logit to be around 0.05 or less, if you can get them that low.
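If the standard errors from the random test data come out above 0.05, a quick back-of-the-envelope calculation can suggest how many respondents you would need. The sketch below assumes standard errors shrink roughly in proportion to 1/sqrt(n) when every respondent completes the same kind of design. That is a general statistical approximation, not a feature of SSI Web. The function name and the 0.07 figure are made up for illustration.

```python
import math


def respondents_needed(current_n, current_se, target_se=0.05):
    """Approximate respondents needed to reach target_se.

    Assumes SE scales as 1/sqrt(n), so n_needed = n * (se / target)^2.
    """
    if current_n <= 0 or current_se <= 0 or target_se <= 0:
        raise ValueError("all inputs must be positive")
    required = current_n * (current_se / target_se) ** 2
    # Subtract a tiny tolerance so floating-point noise does not round up.
    return math.ceil(required - 1e-9)


# Hypothetical: 400 simulated respondents gave a largest standard error of 0.07.
print(respondents_needed(400, 0.07))  # 784
```

Treat the result as a planning estimate only. Random test respondents answer without any real preferences, so the standard errors from real data will differ somewhat.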