Very interesting question.
We found that the reproducibility of the first step (the cluster analysis on cluster membership indicators across the ensemble of segmentation solutions) was the better approach for examining reproducibility: how well the number of groups the researcher requested seems to reflect structure actually present in the data. Please note that this is the initial step that takes the ensemble (which usually contains all sorts of different partitions of the data: some 2-group, some 3-group, some 12-group, and so on) and, across 30 different replicates of k-means clustering, tries to form a single segmentation solution of the size the researcher requested.
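A minimal sketch of what that first step could look like, with several assumptions: the toy data, the range of ensemble partition sizes, and the use of mean pairwise adjusted Rand index as the reproducibility measure are all illustrative stand-ins, not the exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Toy data: 200 respondents in two latent groups (hypothetical stand-in
# for real segmentation data).
X = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(3, 1, (100, 4))])

# Build an ensemble of partitions with varying group counts (2..12).
ensemble = [KMeans(n_clusters=k, n_init=5, random_state=k).fit_predict(X)
            for k in range(2, 13)]

# Indicator (one-hot) coding of every partition's memberships, stacked
# column-wise: each respondent becomes a row of 0/1 membership flags.
indicators = np.hstack([np.eye(p.max() + 1)[p] for p in ensemble])

# First step: 30 k-means replicates on the indicator matrix, at the
# dimensionality the researcher requested (here, 2 groups).
requested_k = 2
replicates = [KMeans(n_clusters=requested_k, n_init=1,
                     random_state=r).fit_predict(indicators)
              for r in range(30)]

# Reproducibility: mean pairwise adjusted Rand index across replicates.
# High agreement suggests the requested group count reflects structure
# in the ensemble; low agreement suggests it does not.
pairs = [(i, j) for i in range(30) for j in range(i + 1, 30)]
reproducibility = np.mean([adjusted_rand_score(replicates[i], replicates[j])
                           for i, j in pairs])
print(round(float(reproducibility), 3))
```

With well-separated toy data like this the replicates agree almost perfectly; on messier real data, lower agreement at a given requested size is the diagnostic signal.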
But the final solution reported as the consensus solution is based on the subsequent repetitions of clustering on cluster membership, where we keep boiling the dimensionality down to the size the researcher requested (in the indicator coding matrix). Once a repetition of clustering on cluster membership fails to reclassify anyone, we have converged. But (and this may now be obvious to you) if you were to measure reproducibility at that very last iteration of clustering on cluster membership (the step just before breaking out with convergence, where we again tried 30 different replicates), it would come out near 100% every time, because people no longer moving between groups was precisely our signal to break out with convergence!
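The repetition-until-convergence loop could be sketched roughly as follows. Again, the data, the partition sizes, and the convergence test (adjusted Rand index of 1 between consecutive consensus solutions, which is invariant to label swapping) are my assumptions for illustration, not the author's exact code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)

# Hypothetical stand-in for real segmentation data and its ensemble.
X = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(3, 1, (100, 4))])
ensemble = [KMeans(n_clusters=k, n_init=5, random_state=k).fit_predict(X)
            for k in range(2, 8)]
indicators = np.hstack([np.eye(p.max() + 1)[p] for p in ensemble])

requested_k = 2
consensus = KMeans(n_clusters=requested_k, n_init=10,
                   random_state=0).fit_predict(indicators)

# Repeat clustering on cluster membership: run 30 replicates on the
# current indicator matrix, rebuild the indicator coding from those
# replicate partitions, and stop once the new consensus no longer
# reclassifies anyone. Measuring replicate agreement at this final
# step would trivially sit near 100%: agreement IS the stopping rule.
for _ in range(50):
    replicates = [KMeans(n_clusters=requested_k, n_init=1,
                         random_state=r).fit_predict(indicators)
                  for r in range(30)]
    indicators = np.hstack([np.eye(requested_k)[p] for p in replicates])
    new_consensus = KMeans(n_clusters=requested_k, n_init=10,
                           random_state=0).fit_predict(indicators)
    converged = adjusted_rand_score(consensus, new_consensus) >= 0.999
    consensus = new_consensus
    if converged:
        break
```

This makes the circularity concrete: once the loop exits, consecutive solutions are identical by construction, so any "reproducibility" computed at that last iteration says nothing about whether the requested group count fit the data.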