The equations are in the help documents. But, since you ask for an intuitive explanation, I'll try to do that.
Both approaches use the logit equation to convert raw utilities into probabilities of choice that sum to 100 across the items.
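To make the logit step concrete, here is a minimal Python sketch (my own illustration, not code from the help documents). The function name `logit_shares`, the `exponent` parameter, and the utility values are all made up for the example.

```python
import math

def logit_shares(utilities, exponent=1.0):
    """Convert one respondent's raw utilities into shares that sum to 100.

    `exponent` is the scale factor; 1.0 leaves the utilities as estimated.
    """
    scaled = [exponent * u for u in utilities]
    top = max(scaled)  # subtract the max before exponentiating, for numerical stability
    expu = [math.exp(s - top) for s in scaled]
    total = sum(expu)
    return [100.0 * e / total for e in expu]

# Hypothetical raw utilities for four items (illustrative values only).
utilities = [1.2, 0.3, -0.5, -1.0]
shares = logit_shares(utilities)
print([round(s, 1) for s in shares], round(sum(shares), 6))
```

Whatever utilities go in, the shares that come out are positive, keep the same order as the utilities, and sum to 100.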
The key difference is the response error (scale factor) each approach assumes: whether respondents are treated as if they were evaluating all items at once, or only as many items at a time as were shown in each set of the MaxDiff questionnaire.
The market simulator can take all items as inputs and run a simulation as if respondents were seeing and choosing among every item in the list (which they never actually did). This yields shares of choice that sum to 100%. But it assumes that the response error with which respondents evaluated sets of items in the questionnaire (typically 4 or 5 items at a time) would also apply if they were viewing all items at once. That could be seen as too rosy a scenario: the response error is understated, so the differences among the scores are accentuated and made more extreme.
The other approach, averaging the rescaled probability scores (which sum to 100 for each respondent), keeps the response error at the level of the sets actually shown in the questionnaire, so the scores come out less accentuated.
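A rough Python sketch of the two approaches side by side may help. Treat it as an illustration under assumptions, not as the software's implementation. In particular, I am assuming the per-respondent probability score takes the form e^u / (e^u + a - 1), with zero-centered utilities u and a = the number of items shown per set; please check the help documents for the exact equation. The function names and the utilities are invented for the example.

```python
import math

def logit_shares(utilities):
    """Simulator-style shares: logit across ALL items at once, summing to 100."""
    top = max(utilities)
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [100.0 * e / total for e in expu]

def rescaled_probability_scores(utilities, items_per_set):
    """ASSUMED form of the rescaled probability scores (verify against the help docs):
    e^u / (e^u + a - 1) on zero-centered utilities, then rescaled to sum to 100."""
    mean_u = sum(utilities) / len(utilities)
    centered = [u - mean_u for u in utilities]
    raw = [math.exp(u) / (math.exp(u) + items_per_set - 1) for u in centered]
    total = sum(raw)
    return [100.0 * r / total for r in raw]

def average_columns(rows):
    """Average item by item across respondents."""
    return [sum(col) / len(rows) for col in zip(*rows)]

# Hypothetical utilities: two respondents, six items, sets of 4 shown at a time.
respondents = [
    [2.0, 1.0, 0.0, 0.0, -1.0, -2.0],
    [1.5, 2.5, -1.0, 0.0, -1.0, -2.0],
]
simulator_avg = average_columns([logit_shares(u) for u in respondents])
rescaled_avg = average_columns(
    [rescaled_probability_scores(u, items_per_set=4) for u in respondents]
)
print("simulator:", [round(s, 1) for s in simulator_avg])
print("rescaled: ", [round(s, 1) for s in rescaled_avg])
```

Both sets of averages sum to 100, but with these made-up numbers the simulator-style shares are spread over a wider range than the averaged rescaled scores.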
This is the same issue as the "Exponent" (scale factor) adjustment that can be tuned in the market simulator for conjoint data. Whether you use a tuning that leads to steeper (more accentuated) scores or flatter scores depends on how much response error you assume should be reflected in your predictions.
Within each respondent, making the exponent (scale factor) larger or smaller does not change the rank order of preference for the items. But when the results are averaged across respondents, the rank order of items can change, and often does slightly, because the scale factor enters the logit equation as a non-linear transformation.
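Here is a small Python sketch of that last point, using three hypothetical respondents and three items (A, B, C). The utilities and the exponent values 0.2 and 3.0 are invented purely to make the effect easy to see; they are not recommended settings.

```python
import math

def logit_shares(utilities, exponent=1.0):
    """Logit shares summing to 100, with an exponent (scale factor) on the utilities."""
    scaled = [exponent * u for u in utilities]
    top = max(scaled)  # subtract the max for numerical stability
    expu = [math.exp(s - top) for s in scaled]
    total = sum(expu)
    return [100.0 * e / total for e in expu]

def rank_order(scores):
    """Item indices sorted from highest score to lowest."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def average_shares(respondents, exponent):
    """Compute shares per respondent, then average item by item."""
    per_resp = [logit_shares(u, exponent) for u in respondents]
    return [sum(col) / len(per_resp) for col in zip(*per_resp)]

# Hypothetical utilities for items A, B, C (index 0, 1, 2).
# Two respondents put A first; one puts C first; B is a close second for everyone.
respondents = [
    [3.0, 2.5, 0.0],
    [3.0, 2.5, 0.0],
    [0.0, 2.5, 3.0],
]

flat_avg = average_shares(respondents, exponent=0.2)
steep_avg = average_shares(respondents, exponent=3.0)

for u in respondents:
    # Within a respondent, the order is identical at both exponents.
    print(rank_order(logit_shares(u, 0.2)), rank_order(logit_shares(u, 3.0)))
print("flat  average order:", rank_order(flat_avg))
print("steep average order:", rank_order(steep_avg))
```

With the flat exponent, B (everyone's close second) comes out on top of the averages. With the steep exponent, A does, because each respondent's share piles up on his or her first choice. No individual respondent's order changed.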
Hope that helps.