Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 2:42 pm
by Jarlve
Norse wrote:
Jarlve wrote:We may not know the full story of their analysis. What we know about this is that 9 rows in the 340 have no repeats and 6 of them happen to be 1,2,3,11,12,13. They may have tested this to be somewhat uncommon perhaps?

I think it's certainly a possibility that the 340 is in an untested language.


That's certainly possible, yes. But there has to be some twist in addition to that, no?

The 340 can't simply be the 408 all over again *, only with a non-English language as the plain text - or am I wrong?

* In terms of the method he used.


I believe it's a possibility. :)

Or even in a broader sense, something that daikon may be hinting at, that something at the language level differs from what is expected, for English.

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 3:29 pm
by daikon
Jarlve,
You are very perceptive. :) That's exactly what I'm trying to test/confirm/disprove: whether Z340 *is* plain English and *is* a straight homophonic substitution cipher, but the plaintext is just in some way very different from the expected properties of English text. Z408 was already quite a bit "unusual" due to its liberal use of the letter "L", for example, so its chi2 stat was way off the value expected for normal English text. Which is also why I think ZKD v1.2 is a step in the wrong direction: unlike v1.0, it appears to use chi2 by default as one of the factors to score the solved plaintext. It doesn't hurt too much overall though, as IoC and 4/5-gram stats will still pull the solver in the right direction.
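For readers who haven't met the two stats mentioned above, here is a minimal sketch of how IoC and chi2 are usually computed on a candidate plaintext. It is not taken from ZKD or AZD; the function names and the `expected_freq` table are placeholders I made up for illustration. For reference, normal English text has an IoC of roughly 0.066, against about 0.038 for uniformly random letters.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters drawn at random from text are equal."""
    counts = Counter(text)
    n = len(text)
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def chi_squared(text, expected_freq):
    """Chi-squared distance between observed letter counts and the counts
    expected from expected_freq (letter -> relative frequency, a table
    the caller must supply; none is hard-coded here)."""
    counts = Counter(ch for ch in text if ch in expected_freq)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum(
        (counts[ch] - freq * n) ** 2 / (freq * n)
        for ch, freq in expected_freq.items()
    )

print(index_of_coincidence("AABB"))                 # 4/12
print(chi_squared("AAAB", {"A": 0.5, "B": 0.5}))    # 0.5 + 0.5
```

A plaintext that overuses one letter, as Z408 does with "L", inflates the chi2 term for that letter while barely moving the IoC, which is why a chi2-weighted score can penalize a correct solution.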

So I thought maybe Z340 uses a lot of somewhat rare words. What would they be for someone calling himself "Zodiac"? The obvious first thought - something about stars, constellations and zodiac signs. That hunch paid off right away as I was able to come up with a simple plaintext encoded with a 63-symbol straight homophonic substitution which fooled the current auto-solvers (ZKD+AZD): viewtopic.php?p=37034#p37034
The version of the excellent AZD that you currently have in development was able to crack it though. Do you plan to release this new, improved version soon? I hope so.

So in what other ways can an English text be so different from "normal" English texts as to fool auto-solvers? Misspellings obviously don't matter, as Z408 was solved with a sizable number of those, and even several encryption errors on Zodiac's part. I have another hunch, but I just don't have the time right now to come up with a meaningful non-random text for another test. Maybe tomorrow.

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 4:57 pm
by Norse
Hm. Let's say the plain text is written in a language which uses a non-Latin alphabet, then - would that qualify?

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 6:05 pm
by Holmes201
People used slide rules in those days.

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 6:45 pm
by daikon
Norse wrote:Hm. Let's say the plain text is written in a language which uses a non-Latin alphabet, then - would that qualify?


If you mean that it's an English text that uses different letters (different font?) then that would make no difference, as the substitution would be exactly the same, it would just map to different letters. If you mean it's in a completely different language, other than English, I would love to test this hypothesis, and it should be fairly easy by using the correct IoC and 4/5-gram corpus for the corresponding language. The only problem is you need to be fluent in that language to see if the resulting plaintext means something, since you'll be parsing a continuous stream of letters without word breaks. Personally, I'm only capable of doing that in English, and even then barely enough to read Z408. :)

I can see Zodiac writing a message in, say, Spanish. Or maybe even Latin. What other languages might have been popular in the '60-'70s in the US? Esperanto maybe?

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 7:11 pm
by Norse
daikon wrote:If you mean it's in a completely different language, other than English...


Yes, that's what I have in mind: Hebrew, for instance - or Russian, where a completely different alphabet is in play.

One of my first theories about the Zodiac case was that the 340 plain text is Old Norse. I've moved pretty far away from that by and by, though, for various reasons - but that language would fit the bill in two possible ways:

a) If the plain text is - basically - runes, then you'd deal not only with a non-Latin alphabet but also potentially (depending on whether an older or a newer rune alphabet is used) one which consists of significantly fewer letters/signs than English.

b) If the text is non-runic Norse there would be several unusual letters/symbols in play in addition to the familiar Latin letters you get in modern European languages.

It's an intriguing possibility, I guess - but to be honest I do think it's a bit far fetched.

If this idea has merit, I'd say the likeliest candidate for a non-Latin alphabet plain text is something a fairly regular American guy might be able to produce because he learned the language as a child: possibly Hebrew. Or - that would be my first choice - some Slavic language or other which uses the Cyrillic alphabet. I doubt Z was someone who had mastered a dead language like Old Norse, or any other language he had merely studied, to the required degree.

Problem is indeed, as you say, that you'd need to be pretty fluent in any test language - on top of knowing what you're doing crypto wise.

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Wed Jul 08, 2015 9:06 pm
by daikon
The more I think about the possibility of Z340 being in a different language, the less likely it seems. Let's assume for a second that Zodiac was a smart guy. Judging by his clever ways of avoiding capture, that seems quite certain (such as applying plastic model glue to his fingertips to avoid leaving fingerprints, which is much better than wearing gloves in a number of ways). He also clearly didn't want to be caught, so he'd want to hide any personal details about himself. Which is why, I think, he made so many spelling mistakes in his letters - to make them significantly different from his normal writing style. Which by the way also makes me think he was either a published writer or a journalist of some sort, so he was afraid of being identified by his writing style. So even if he was fluent in a foreign language, he'd want to keep that information to himself as much as possible, so as not to give any more clues to the police. Otherwise, if it were discovered that Z340's plaintext was in Hebrew or Russian, the police could start looking closely at people of that descent, or at attendees of the corresponding language courses. I don't think you can learn fluent Hebrew or Russian on your own, if I'm not mistaken? So that makes me think it's either in a language that's common in California alongside English, such as Spanish, or in a language that someone can learn from a book on their own, such as Latin (a dead language) or Esperanto (a constructed one). But that's just a guess at this point.

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Thu Jul 16, 2015 2:06 am
by daikon
After some more experiments with constructing ciphers similar to Z340, I've realized that there is one more common assumption that can be thrown out of the window because it is incorrect: the one about reading Z340 by rows, and not columns. It is based on the fact that there are far fewer repeats in the rows than in the columns, as WebToy clearly shows (see the "Repeated symbols by row" stat). The mistake here is that, yes, the low repeat counts by rows tell us that the *cipher* was constructed by rows, and likely left to right as well. However, and this is the key, they tell us nothing about how the *plaintext* (i.e. the original message) was written into the rows/columns before it was encrypted.

Here's an example.

I submit to you the following cipher:
Code:
K+bGHTm8qIC9DR4Q0
15jEOAS2pZ6Wa3iBo
UjCfN9JT4L+esK7HP
rRnhVQAIFJSWGMKfD
Hldce+U0b8iBI+TV1
2CEg563mOa40fJr1d
s2NbFsXh9R3WUj4V+
++cmSGmK+0HY1m2qm
3D4TPIiEd0QFWjeJa
piOK1gRUHLSmfIo2J
skKh3VAPMe7gNHG+4
01Q8qI2W++BJnUDbV
KW5chUfO3EdCFr9jH
67GT48+plZrVLWcAU
V0I+dJ12eB3Wf4506
78aCMKH19A2UV+ec5
WqNI++JLB6CU3M7uK
4Hsi89XPQVspADoNO
I0L56j+7+Y+85fRkM
P+NJl6K1BUH0r5+0n


Looking at WebToy stats, you'll see that it is very much like Z340. It has very few repeats in the rows, and plenty of repeats column-wise. In fact, I managed to get the first 5 rows without repeats, and the first 2 rows combined have 0 repeats (evident if you put the cipher in 34-symbol rows). It has almost the same number of repeated bigrams as Z340, and it even has one repeated 3-gram (vs. 2 in Z340). I've even mimicked the '+' symbol being twice as frequent as the next most frequent symbol, which plays no role in making my point; I just wanted to make this cipher very much like Z340 in all respects.
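The row-repeat claim is easy to check by hand or with a few lines of code. This is only a rough sketch of what a "repeated symbols by row" count could look like; I don't know how WebToy computes its stat internally, and the function names here are made up. Symbols are compared case-sensitively, so 'q' and 'Q' count as different symbols.

```python
def repeat_count(row):
    """Number of symbols in a row that duplicate an earlier symbol."""
    return len(row) - len(set(row))

def rows_without_repeats(rows):
    """1-based indices of the rows in which every symbol is distinct."""
    return [i for i, row in enumerate(rows, start=1) if repeat_count(row) == 0]

# First two rows of the cipher above, as 17-symbol rows.
first_two = ["K+bGHTm8qIC9DR4Q0", "15jEOAS2pZ6Wa3iBo"]

print(rows_without_repeats(first_two))   # both rows are repeat-free
print(repeat_count("".join(first_two)))  # and so is the combined 34-symbol row
```

Running the same two functions over the columns instead of the rows (for example with `zip(*rows)`) gives the column-wise comparison.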

And here's the kicker - I can even tell you that it's the exact beginning of Z408, which we all know so well, truncated at 340 characters and encoded using a straight homophonic substitution. And yet trying to crack this cipher using ZKD/AZD will yield absolutely no result, for one simple reason.

Because before I encrypted it, I transposed the plaintext like this:

Code:
INCOIHWTBIAAOHEREETGORAHEBDTLCEGAY
LGAMSAIHESNMKITIRVHYFLRAIOIHLOSIMO
IPUUMNLECTGAINHLEEAOFTTEWRCEEMIVEU
KESCOKDFAHELLGELNNNUWHOWINEIDEWEBW
EOEHRIGOUEROLGMICBGRIEFHLISHWMIYEI
KPIFELARSMTFSIONEEERTBIELNNAIYLOCL
ILTUFLMREOUAOVAGITTOHETNBPDVLSLUAL
LEINUIEEMAELMETETTTCASIIEAAELLNMUT
LBSINNISATALESTXIEIKGTADRRLKBAOYSR
IESTTGNTNDNTTMHPSRNSIPTIEALIEVTNEY


If you don't see it, start reading the first letters of each row, then the second letters, and so on.
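For anyone who would rather not trace it by eye, the column-wise reading can be sketched in a few lines. The grid is the 34-by-10 block above, copied verbatim; the function names are just illustrative.

```python
# The transposed plaintext from the post above: 10 rows of 34 letters.
ROWS = [
    "INCOIHWTBIAAOHEREETGORAHEBDTLCEGAY",
    "LGAMSAIHESNMKITIRVHYFLRAIOIHLOSIMO",
    "IPUUMNLECTGAINHLEEAOFTTEWRCEEMIVEU",
    "KESCOKDFAHELLGELNNNUWHOWINEIDEWEBW",
    "EOEHRIGOUEROLGMICBGRIEFHLISHWMIYEI",
    "KPIFELARSMTFSIONEEERTBIELNNAIYLOCL",
    "ILTUFLMREOUAOVAGITTOHETNBPDVLSLUAL",
    "LEINUIEEMAELMETETTTCASIIEAAELLNMUT",
    "LBSINNISATALESTXIEIKGTADRRLKBAOYSR",
    "IESTTGNTNDNTTMHPSRNSIPTIEALIEVTNEY",
]

def read_by_columns(rows):
    """Read a rectangular grid top to bottom, one column after another."""
    return "".join("".join(column) for column in zip(*rows))

def write_by_columns(text, n_rows):
    """Inverse operation: fill a grid of n_rows rows column by column."""
    return [text[i::n_rows] for i in range(n_rows)]

plaintext = read_by_columns(ROWS)
print(plaintext[:40])  # ILIKEKILLINGPEOPLEBECAUSEITISSOMUCHFUNIT
```

`write_by_columns(plaintext, 10)` reproduces the block above, which is the transposition step applied before encryption.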

There you go. We have another cipher that has the same stats as Z340, and yet it is clearly written vertically, top to bottom. I could've written it diagonally, if I wanted to. Or using any other number of "routes" or columnar transpositions. You just have to *encrypt* it horizontally, left to right, after you are done "transposing" the plaintext.
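To illustrate why encrypting left to right keeps row repeats low whatever the plaintext order was, here is a toy homophonic encoder that hands out each letter's symbols in strict rotation. This is an assumption made for the sake of the example, not a description of how the cipher above (or Z340) was actually produced, and the key in the usage line is hypothetical.

```python
from itertools import cycle

def encrypt_homophonic(plaintext, key):
    """Encrypt left to right, rotating through each letter's homophones.

    key maps a plaintext letter to a string of cipher symbols (its
    homophones). Letters missing from the key raise KeyError.
    """
    pools = {letter: cycle(symbols) for letter, symbols in key.items()}
    return "".join(next(pools[letter]) for letter in plaintext)

# Hypothetical toy key: 'A' has two homophones, 'B' has one.
print(encrypt_homophonic("ABAA", {"A": "12", "B": "3"}))  # 1321
```

Because the rotation follows the order in which letters are fed to the encoder, the low-repeat pattern shows up along the encryption direction, regardless of how the plaintext was arranged in the grid beforehand.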

Which simply means that we cannot rule out that Z340 was written "vertically", or that columnar transpositions were used, etc. I.e. this part of the FBI's analysis can be crossed out as well: "This indicates that the cipher is written horizontally and rules out any transposition patterns that are not strictly horizontal." Maybe that's why Z340 hasn't been cracked yet - nobody tried applying "transposition patterns that are not strictly horizontal"?

I might be embarrassing myself here, of course, since I'm not a professional cryptographer, so please do point out any flaws in my reasoning above. :)

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Thu Jul 16, 2015 3:05 am
by glurk
daikon wrote: Maybe that's why Z340 hasn't been cracked yet - nobody tried applying "transposition patterns that are not strictly horizontal"?

I think that the problem here is that there are a nearly infinite number of these transpositions. It would take a little while to try them all.

It's not so much that "nobody tried" - many have - but the problem of "it would take billions of years." Just my opinion.
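To put a rough number on the size of the search space, here is a small sketch that counts only the simplest case: reorderings of whole columns (or whole rows) of a 17 by 20 grid. Routes, diagonals and combined schemes come on top of this, so these figures are a lower bound on the count, not an estimate of solving time. The function name is made up for the example.

```python
import math

def line_orderings(n_lines):
    """Number of ways to reorder n_lines whole columns (or rows)."""
    return math.factorial(n_lines)

print(line_orderings(17))  # 355687428096000 orderings of 17 columns
print(line_orderings(20))  # 2432902008176640000 orderings of 20 rows
```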

-glurk

Re: FBI analysis (by Dan Olson) of Z340 might be wrong

PostPosted: Thu Jul 16, 2015 4:46 am
by Jarlve
That's what I said in my last post at: viewtopic.php?f=81&t=267&start=140

There is, however, a way to look through the encoding, kinda. Let me share some data. Note that there is some bias towards the horizontal direction just because of the encoding.

Bigram counts in percentages (undoing directional transposition):

408:
Horizontals: 44.33%
Verticals: 15.76%
Diagonals 1: 23.15%
Diagonals 2: 16.74%

340:
Horizontals: 28.42%
Verticals: 24.21%
Diagonals 1: 27.36%
Diagonals 2: 20.00%

daikon:
Horizontals: 36.19%
Verticals: 22.01%
Diagonals 1: 20.52%
Diagonals 2: 21.26%

daikon2:
Horizontals: 34.50%
Verticals: 24.79%
Diagonals 1: 19.94%
Diagonals 2: 20.75%

daikon3:
Horizontals: 39.76%
Verticals: 20.47%
Diagonals 1: 20.47%
Diagonals 2: 19.29%

daikon4: 17 by 20 (transposition)
Horizontals: 27.81%
Verticals: 21.30%
Diagonals 1: 27.81%
Diagonals 2: 23.07%

daikon4: 34 by 10 (transposition)
Horizontals: 21.30%
Verticals: 37.39%
Diagonals 1: 22.60%
Diagonals 2: 18.69%

daikon4: 17 by 20 (transposition undone)
Horizontals: 41.48%
Verticals: 18.08%
Diagonals 1: 23.40%
Diagonals 2: 17.02%

daikon4: (transposition undone)
Code:
KUHs3sKVWI+jl2DkW
0q0bCdN4K5INLGfcb
Thc+I5HNeFP3hd+6T
9+sIVUJ+jmJUXiAf1
J+8T0hEPO2L7q4b9d
M3eB+IL8R0eEB6YC+
i3Q7d3C+9eBWFgCWU
8DsIUWNFf35RK+jjH
r4Mf47T4eG957RQHV
VJ+j0uk0P1+a4H6KM
1r2+p0674P5RC+i17
8H+jnEcOQGasNEhgm
K8TCiJOV5S1q4M8lA
Q6GgI8K96SA3mR2+H
XK2ImKUWp1P1pFO+H
+l9QBZJa0L+ZAVU6S
4HSBr2sHWW0YmJVUp
0aGf1fnLVAr3MJmIU
W+D5iKr2oDceo+Bf1
q2bAcN0oDdmJVU5On
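The exact counting method behind the percentages above isn't spelled out in the post, so the following is only one possible reading: count repeated symbol pairs along each of the four directions of the grid and express each count as a share of the total. The direction names, the no-wraparound rule and the function names are all assumptions, and the numbers it produces need not match the table above.

```python
from collections import Counter

# (row step, column step) for each direction; names are illustrative.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal_1": (1, 1),
    "diagonal_2": (1, -1),
}

def repeated_bigrams(grid, step):
    """Count bigram repeats along one direction (no wrap at the edges)."""
    dr, dc = step
    pairs = Counter()
    for r, row in enumerate(grid):
        for c, symbol in enumerate(row):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < len(grid) and 0 <= c2 < len(grid[r2]):
                pairs[symbol + grid[r2][c2]] += 1
    # A bigram seen n times contributes n - 1 repeats.
    return sum(n - 1 for n in pairs.values() if n > 1)

def direction_shares(grid):
    """Each direction's repeat count as a percentage of all repeats."""
    counts = {name: repeated_bigrams(grid, step)
              for name, step in DIRECTIONS.items()}
    total = sum(counts.values())
    return {name: (100.0 * n / total if total else 0.0)
            for name, n in counts.items()}

print(direction_shares(["ABAB", "CDCD"]))
```

Passing a cipher as a list of its rows (for instance the 20 rows above) gives the four shares for that layout.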