Re: Unigram distance curiosity
Here is another result to add to your pondering.
Earlier I posted the outliers in my unigram distance sum per symbol tests:

Z408 has 13 outliers and Z340 has 25 outliers.
Compared to 1000 shuffles, are the outlier counts unusual?
Result:
1) Z408's outlier count is 1.72 sigma below the mean outlier count observed during shuffles.
2) Z340's outlier count is 0.68 sigma above the mean outlier count observed during shuffles.
So, I suppose we can conclude that Z408 shows fewer unigram distance outliers than expected, and Z340 shows slightly more than expected, at least compared to randomizations. Neither count is more than 2 sigma from the shuffle mean, though.
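
For anyone who wants to try the same kind of comparison, here is a rough sketch of scoring a count against shuffles. It is not my actual test code: shuffle_z_score and count_adjacent_repeats are hypothetical names, and the statistic is only a stand-in for the unigram distance outlier count, so swap in your own outlier counter.

```python
import random
import statistics


def shuffle_z_score(symbols, statistic, n_shuffles=1000, seed=0):
    """Score statistic(symbols) against the same statistic on shuffled copies.

    Returns (observed, shuffle_mean, shuffle_stdev, z), where z is the number
    of standard deviations the observed value sits above (+) or below (-)
    the shuffle mean.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    observed = statistic(symbols)
    pool = list(symbols)
    samples = []
    for _ in range(n_shuffles):
        rng.shuffle(pool)  # in-place shuffle keeps symbol frequencies intact
        samples.append(statistic(pool))
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    # If every shuffle gives the same value there is no spread to measure.
    z = (observed - mean) / stdev if stdev > 0 else 0.0
    return observed, mean, stdev, z


def count_adjacent_repeats(symbols):
    """Stand-in statistic (an assumption, NOT the outlier test from this thread):
    number of positions where a symbol equals the one right before it."""
    return sum(1 for a, b in zip(symbols, symbols[1:]) if a == b)


if __name__ == "__main__":
    cipher = "AAAABBBBCCCC"  # toy sequence, not a real cipher transcription
    observed, mean, stdev, z = shuffle_z_score(cipher, count_adjacent_repeats)
    print(f"observed={observed} mean={mean:.2f} stdev={stdev:.2f} z={z:+.2f}")
```

A negative z means the real text has fewer of whatever you are counting than the shuffles do, and a positive z means more. Shuffling in place preserves the symbol counts, so only the ordering is being tested.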
