FBI analysis (by Dan Olson) of Z340 might be wrong
Now that I got your attention, here's my reasoning.
More specifically, Dan Olson of the FBI says:
"Lines 1-3 and 11-13 contain a distinct higher level of randomness than lines 4-6 and 14-16. This appears to be intentional and indicates that lines 1-3 and 11-13 contain valid ciphertext whereas lines 4-6 and 14-16 may be fake."
http://zodiackillerciphers.com/wiki/ind ... _Dan_Olson
The first part is certainly true: lines 1-3 and 11-13 do not contain any repeats, so they are "more random". But I believe the conclusion he draws is without merit. I've tried constructing multiple ciphertexts similar to Z340 that contain a valid plaintext taken from other Zodiac letters, using a simple homophonic substitution like the one used in Z408. It turns out to be trivial to get the same characteristics of no repeats on some lines and several repeats on others. In fact, you can do it to pretty much any line without having to alter your plaintext in any way. First, you need multiple homophones for each plaintext letter, which is certainly the case for Z340, with a whopping 63 cipher symbols for the 26 letters of a presumably English alphabet. Second, you need to switch between sequential and random use of homophones from line to line. Here's why: if you pick homophones strictly sequentially, and you are lucky enough that your plaintext doesn't have too many repeating rare letters, you won't have any repeats for a long time. If you pick homophones for the same letters randomly, you are bound to get more repeats. So if Zodiac used sequential homophone selection for lines 1-3, then switched to random for lines 4-10, then went back to sequential for lines 11-13 and back to random for the rest of the cipher, you'd have the exact same "unevenness of randomness" exhibited by Z340 for pretty much any plaintext. Why would he do that? To make deciphering harder, to break up the homophone cycles.
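For anyone who wants to try the sequential-versus-random idea themselves, here is a rough Python sketch. It is not the exact procedure I used, and everything in it is an assumption made for illustration: the letter frequency order, the way 63 symbols are dealt out to the 26 letters, the placeholder plaintext, and the function names `build_key`, `encipher` and `repeats_per_line`. Lines listed in `sequential_lines` take each letter's homophones in cyclic order; all other lines pick a homophone at random. A "repeat" here means a symbol that already appeared earlier on the same line.

```python
import random
from collections import defaultdict

# Letters in rough descending order of English frequency (an assumption for this sketch).
FREQ_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def build_key(n_symbols=63):
    """Deal n_symbols integer 'cipher symbols' to letters, round-robin in frequency order."""
    key = defaultdict(list)
    for sym in range(n_symbols):
        key[FREQ_ORDER[sym % 26]].append(sym)
    return dict(key)


def encipher(plaintext, key, sequential_lines, width=17, seed=1):
    """Encipher letters only. Lines whose 0-based index is in sequential_lines
    cycle through each letter's homophones in order; other lines choose randomly."""
    rng = random.Random(seed)
    next_idx = defaultdict(int)  # per-letter position in its homophone cycle
    letters = [c for c in plaintext.upper() if c in key]
    out = []
    for pos, ch in enumerate(letters):
        homophones = key[ch]
        if pos // width in sequential_lines:
            out.append(homophones[next_idx[ch] % len(homophones)])
            next_idx[ch] += 1
        else:
            out.append(rng.choice(homophones))
    return out


def repeats_per_line(symbols, width=17):
    """For each line, count symbols that already occurred earlier on that line."""
    rows = [symbols[i:i + width] for i in range(0, len(symbols), width)]
    return [len(row) - len(set(row)) for row in rows]


if __name__ == "__main__":
    key = build_key()
    # Placeholder text, NOT taken from any Zodiac letter.
    plaintext = "THIS IS ONLY A PLACEHOLDER MESSAGE USED TO FILL A FEW LINES OF CIPHERTEXT " * 4
    symbols = encipher(plaintext, key, sequential_lines={0, 1, 2})
    for line_no, count in enumerate(repeats_per_line(symbols), start=1):
        print(line_no, count)
```

Try different `seed` values and different sets of sequential lines. With so few homophones per letter the sequential lines will not always come out at zero repeats, but they should tend to show fewer repeats than the random ones.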
But in fact, Zodiac didn't even do that. If you look more carefully, only lines 1-3 are "special", and lines 11-13 just happen to look "more random" because the ciphertext has 17 columns. Don't believe me? Go to the excellent Webtoy tool:
http://www.oranchak.com/zodiac/webtoy/stats.html
and change the layout of Z340 from the default "20x17" grid to double the number of columns (click on "10x34"). This combines each pair of original rows into one long row. You'll now see that only row 1 (original rows 1-2) has just a few repeats, which is expected for a sequential homophonic substitution cipher. Row 6 (starts with U+R), which is a combination of original rows 11-12, now has a huge number of repeats, as does row 7 (original rows 13-14). Row 8 has a surprisingly low number of repeats, but original rows 15-16 also had low repeats, and it could just be due to the plaintext being non-repeating in that section.
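If you'd rather check the effect of the grid width without the Webtoy, a small Python sketch like the one below does the same kind of count. The function name `repeats_by_row` and the toy symbol sequence are made up for illustration; this is not the Z340 transcription, and I'm assuming a "repeat" means a symbol that already appeared earlier in the same row.

```python
def repeats_by_row(symbols, columns):
    """Split symbols into rows of the given width and count, for each row,
    how many symbols already occurred earlier in that row."""
    rows = [symbols[i:i + columns] for i in range(0, len(symbols), columns)]
    return [len(row) - len(set(row)) for row in rows]


if __name__ == "__main__":
    # Toy sequence: two 5-symbol rows that each look "perfectly random" alone.
    toy = list("ABCDEABCDE")
    print(repeats_by_row(toy, 5))   # narrow grid
    print(repeats_by_row(toy, 10))  # rows joined in pairs
```

In the narrow grid neither row has a repeat, yet the joined row is half repeats. A row can look clean simply because the row boundary happens to fall in a convenient place, so per-row counts say as much about the chosen width as about the cipher. To run it on the real thing, pass your own transcription of Z340 as `symbols` with `columns=17` and then `columns=34`.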
So you see, rows 11-13 in the original ciphertext are *not* that special, and the whole idea of the cipher being split in two and then the left half placed on top of the right half, which is what Dan Olson suggested on a few occasions, doesn't seem likely.
I know that I'm contradicting what a well-known and respected FBI cryptanalyst has said, and I have a total of a couple of months of experience and knowledge trying to break Z340, compared to decades of experience on Dan's part, so I could be totally off in my conclusions. But the logic above suggests otherwise. What do you think, am I missing something?
Now, the fact that rows 1-2, even when combined into one, have a much lower number of repeats than the rest of the ciphertext is a very good sign! It does suggest that we are looking at a homophonic substitution, and that Z likely started with sequential assignment of homophones. It could also be because the plaintext has few repeats in the first two lines, but that is less likely.