by Quicktrader » Wed Sep 16, 2015 4:51 am
Jarlve, I do appreciate your answers as they really get to the point. For a minute, let's go a little deeper into the question of how your examine tool could help to solve the cipher. Please read this carefully and do not hesitate to ask if any questions come up:
1.) The complexity of the 340 is higher than that of the 408. The cleartext-to-homophone ratio, with unequally distributed homophones, is 5.397 (340/63) compared to 7.418 (408/55), making the 340 much harder to crack.
2.) Reaching a similar complexity ratio for the 340 would require correctly 'guessing' 18 symbols, leading to 340/(63-18) = 340/45 = 7.556.
3.) The 340 has 63^26 variations, i.e. 60,653,000,000,000,000,000,000,000,000,000,000,000,000,000,000 or 6.06E+46. The examine tool can handle about 100,000,000 variations in one run.
4.) Donald Gene and Bettye June Harden cracked the 408 by assuming certain plaintext structures ('kill', 'I like killing').
5.) Such cipher structures exist in the 340, too: double letters, two trigrams that each appear twice (although additional ones might be hidden behind different homophones), etc.
6.) Based on the number of homophones Z used (according to a separate (!) frequency table), the Bernoulli formula puts the odds of + being an 'L' at 71.1% (with 'S' second highest at 2.3%). Other letters are either expected to have more than 3 homophones (e.g. 'S', which leads to its low value) or are not expected to appear that often as a double letter (e.g. 'VV').
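The arithmetic in points 1.)-3.) can be double-checked with a few lines of Python. This just reproduces the post's own counting convention (63 symbols, 26 letters, 63^26 variations) and is not part of any tool:

```python
# Sanity-check of the ratios and variation count from points 1-3,
# using the post's own convention of 63^26 possible variations.
len_340, sym_340 = 340, 63   # 340 cipher: length and distinct symbols
len_408, sym_408 = 408, 55   # 408 cipher

ratio_340 = len_340 / sym_340               # cleartext-to-homophone ratio
ratio_408 = len_408 / sym_408
ratio_guessed = len_340 / (sym_340 - 18)    # after correctly guessing 18 symbols

variations = sym_340 ** 26                  # the post's 63^26 figure

print(f"{ratio_340:.3f} {ratio_408:.3f} {ratio_guessed:.3f}")  # 5.397 7.418 7.556
print(f"{variations:.3e}")                                     # 6.065e+46
```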
Therefore I believe that, due to the sheer number of variations (6.06E+46), such existing cipher structures should be taken into account when using the examine tool. This may enable the examine tool to crack the cipher, which it currently cannot do. Such considerations of cipher structures might be, e.g.:
- considering + to be an 'L' due to Bernoulli (or trying + as L, S, R, ...)
- considering frequent trigrams, e.g. an expected frequency >1 for the two trigrams that each appear at least twice
- considering related structures, e.g. a vowel preceding '++', or two frequent trigrams combined into a 5-gram (line 13, row 9)
Otherwise the examine tool might have to run for a very long time.
In addition to that: the hill-climbing method is the correct one, but it regularly locks into a local optimum (score of ~13,000), while we are actually trying to force it to the global optimum (the correct or nearly correct solution). To force the hill climb / examine tool to do so, it is IMO necessary to configure additional assumptions that reduce the cipher's complexity. This can be done, preferably according to the cipher structure, by various methods such as (but not limited to):
a.) Assuming + being an 'L' (or R, S,..)
b.) Assuming the first symbol of the cipher to be a vowel (or a consonant)
c.) Considering the two frequent trigrams to be statistically 'frequent' ones (i.e. not 'ZZG' or 'GQM' but rather 'THE' or 'ERE')
d.) Considering the 5-gram in line 13, row 9, to be a combination of two such frequent trigrams ('THERE' rather than 'ZZGQM')
e.) Excluding infrequent letters for a frequent symbol (e.g. X, Q or Y would most likely not appear with a frequency of 5% in the cipher, so they can be ruled out for the reversed P symbol)
f.) Considering the bigram that appears three times in the cipher to be a frequent bigram, too.
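To illustrate the local-optimum problem mentioned above, here is a toy hill climb in Python. It is a deliberately simplified sketch with a made-up fitness landscape, not Jarlve's actual examine tool:

```python
# Toy hill climb: greedy search gets stuck on a local peak; restarting
# from every position reaches the global peak. Purely illustrative.
landscape = [0, 1, 2, 3, 2, 1, 0, 2, 4, 6, 8, 6]  # local peak 3, global peak 8

def hill_climb(start):
    """Greedily move to the better neighbour until nothing improves."""
    i = start
    while True:
        best = max((j for j in (i - 1, i + 1) if 0 <= j < len(landscape)),
                   key=lambda j: landscape[j])
        if landscape[best] <= landscape[i]:
            return i  # stuck: local (or global) optimum
        i = best

stuck = landscape[hill_climb(0)]   # a single run from the left edge
best_overall = max(landscape[hill_climb(s)] for s in range(len(landscape)))
print(stuck, best_overall)  # 3 8
```

A single run from position 0 ends on the local peak (score 3); only restarts, or extra constraints that steer the search, reach the global peak (score 8). This is the same effect as the examine tool plateauing at ~13,000.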
The examine tool therefore ideally would be adaptable so that certain symbols (or symbol combinations, e.g. trigrams) can exclude or include certain cleartext letters (e.g. excluding X, Y and Q for the reversed P symbol, or using a list of frequent cleartext trigrams for the repeating, and therefore presumably frequent, cipher trigrams).
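Such include/exclude rules could be represented as simple per-symbol allow-lists. The sketch below is only a mock-up of the idea; the symbol names ("+", "reversed_P") and letter sets are illustrative, not real statistics:

```python
# Per-symbol letter constraints: a candidate key assignment is kept only
# if every constrained symbol maps to an allowed cleartext letter.
# Symbol names and letter sets are illustrative assumptions.
ALPHABET = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

constraints = {
    "+": {"L", "S", "R"},                      # include-list (Bernoulli argument)
    "reversed_P": ALPHABET - {"X", "Y", "Q"},  # exclude rare letters
}

def key_allowed(key):
    """key: dict mapping cipher symbol -> cleartext letter."""
    return all(letter in constraints[sym]
               for sym, letter in key.items() if sym in constraints)

ok = key_allowed({"+": "L", "reversed_P": "E"})   # allowed
bad = key_allowed({"+": "T"})                     # rejected: T not allowed for +
print(ok, bad)  # True False
```

A hill climber could call such a filter before scoring a candidate key, so the search never wastes time on assignments that violate the assumptions.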
I would like to give an example of how I'd love to apply the Examine tool:
1. Setting the + symbol (e.g. 'L')
2. Setting the trigrams to a list of frequent trigrams (e.g. all trigrams expected to appear at least 1.5 times in a 340-character cipher [about 44 per 10,000])
3. EASIER: Setting the 5-gram (line 13, row 9) to a list of (presumably frequent) 5-grams consisting of two frequent trigrams ('THERE', 'THANT', 'WHICH', ...)
4. Setting the bigram in line one to a list of frequent bigrams ('EN', 'TH',..)
5. Excluding certain letters for certain symbols (e.g. X, Y, Q for the reverse P symbol)
6. RUN the examine tool and let it check some 100,000,000 variations.
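Step 3's candidate list could be generated mechanically by overlapping two frequent trigrams on a shared letter (3 + 3 - 1 = 5), which is exactly how 'THERE' decomposes into 'THE' and 'ERE'. The trigram sample below is a tiny illustrative set, not a real frequency table:

```python
# Build candidate 5-grams as two frequent trigrams overlapping on one
# letter, e.g. THE + ERE -> THERE. The trigram list is only a small
# illustrative sample, not actual English statistics.
frequent_trigrams = ["THE", "ERE", "ENT", "HER", "AND", "THA", "ANT", "ICH", "WHI"]

candidates = sorted({t1 + t2[1:]              # last letter of t1 == first of t2
                     for t1 in frequent_trigrams
                     for t2 in frequent_trigrams
                     if t1[-1] == t2[0]})
print(candidates)
```

Even this nine-trigram sample already yields 'THERE', 'THANT' and 'WHICH' among its twelve candidates; with a real list of the top few hundred trigrams the resulting 5-gram list stays far smaller than 26^5.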
It is obvious that step no. 3 in particular dramatically reduces the number of variations. My expectation is that, with this approach, the examine tool should become able to crack the 340. As I said, I believe this is a necessary step to solve it (unless it is solved accidentally).
Regarding your point that such a modification would 'interrupt' the process:
The process is indeed influenced in a negative way: the solving process is now definitely slower than before (is it really?). The program can no longer place just any 5-gram; it has to use one that, e.g., has an L in the third position and a T in the fifth position. But this is exactly the solving process we actually need. The other two methods, trying all variations or solving the cipher like a newspaper puzzle, both don't work out: the first because of the huge number of variations (too high), the second because the cipher lacks structures such as more 4-grams or trigrams.
By testing millions of variations under the precondition of existing cipher structures in combination with certain frequencies, the examine tool should become successful.
Criticism:
Q: What if the reappearing trigrams or the 5-gram are not among the e.g. 2,000 most frequent ones?
A: Not very likely, but there is the possibility to also try the next most frequent 2,000 trigrams/5-grams. In the end, the 5-gram will not turn out to be ZZDZQ.
Q: What if the + is not an 'L'? Or the reverse P symbol in fact is a Y?
A: Exclusions and inclusions could be configured by anyone as he/she prefers, e.g. first trying one group of letters to be excluded for the reversed P, then a second group.
Q: What if the homophones are not distributed according to a certain frequency table?
A: The preconditions nevertheless help to reduce the number of variations to be checked.
Q: What if the cipher then still cannot be solved?
A: I don't think so. But additional steps are possible, e.g. focusing on other cipher structures. It may even be unsolvable, but I clearly doubt that.
My estimation is that the method described above reduces the present 6.06E+46 variations approximately as follows:
63
minus 5 (5-gram list)
minus 2 (bigram list)
minus 1 (+ symbol)
minus 1 (e.g. vowel before the + symbol)
minus 1 (exclusion of letters for the reversed P)
leading to an overall of 53^26 variations (still to be multiplied by the number of 5-grams, bigrams etc. used, which can effectively be one by trying one run after another). This is still a large number, but the complexity ratio has then increased to 6.415 (340/53) instead of 5.397, which is good, meaning reduced complexity. It is therefore at least closer to the value of 7.418 of the 408 cipher, which has been proven solvable by computational methods in the past.
In addition, the most frequent variations are covered instead of rare ones. This is the part I like most. A comparison of the number of variations before and after the modification shows the advantage:
340 (unsolved): 63^26 = 6.06E+46 variations
408 (solved): 55^26 = 1.78E+45 variations
340 with modification: 53^26 = 6.78E+44 variations
So the modifications in fact lead to a 340 that has even fewer variations to hill-climb than the already solved 408.
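The three key-space sizes in the comparison above can be recomputed in a few lines, again following the post's N^26 convention:

```python
# Recompute the variation counts compared above (post's N^26 convention).
counts = {label: n ** 26 for label, n in
          [("340 (unsolved)", 63), ("408 (solved)", 55),
           ("340 with modification", 53)]}

for label, c in counts.items():
    print(f"{label}: {c:.3e}")

# The modified 340 search space is smaller than that of the solved 408:
print(counts["340 with modification"] < counts["408 (solved)"])  # True
```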
Again, I greatly appreciate your work and therefore guess you will be the first person able to read the cleartext. I would love it if you shared this idea (and the solution).
QT
Last edited by Quicktrader on Wed Sep 16, 2015 5:21 am, edited 1 time in total.