Thank you for the hint. The IoC is based on the frequencies of the (alphabetical) letters. With a homophonic cipher, however, we deal with the frequencies of the homophones first, so the IoC of an encrypted message (63 homophones) is very different from that of the same message once decrypted (26-letter alphabet). Both the different frequencies and the larger overall number of homophones lead to different IoC values for the encrypted and the decrypted text.
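
To make the 'based on frequencies' part concrete, here is a minimal sketch of the usual (unnormalized) IoC formula, sum of n_i * (n_i - 1) divided by N * (N - 1). The function name `ioc` is just my choice for illustration. It only counts symbols, so it works for cipher homophones (any hashable symbols) as well as for plain letters:

```python
from collections import Counter

def ioc(symbols):
    """Unnormalized index of coincidence of a sequence of symbols."""
    counts = Counter(symbols)
    n = sum(counts.values())
    if n < 2:
        return 0.0  # not enough symbols to form a pair
    # Probability that two symbols drawn without replacement are identical
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
```

Feeding it the cipher symbols gives the 'encrypted' IoC, feeding it the decoded letters gives the 'decrypted' one.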
Going from the encrypted to the decrypted message, the IoC value might therefore be multiplied by the average number of homophones per alphabetical letter (in use). Be aware that this could be off, as homophones are not necessarily used equally often for each letter.
It is possible to test this difference between the IoCs of encrypted and decrypted messages: if you enter the encrypted message of the Z408 cipher into an IoC analysis tool, you should get an IoC value of about 0.0269 instead of 0.066 (for English language). Nevertheless, the Z408 cleartext is written in English and therefore has a higher IoC value when decrypted:
54 homophones vs. 23 alphabetical letters in use ['in use' correct?] is 2.3478 homophones per alphabetical letter, thus the cleartext should actually have an IoC of
0.0269 x 2.3478 = IoC 0.063 (Z408 expected)
(IoC English language ~ 0.066)
which is actually very close to the value for the English language, while the encrypted Z408 message has an IoC of only 0.0269. The actual Z408 cleartext has an IoC of 0.0634, thus the multiplication is a 'match'.
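
The Z408 arithmetic above as a small sketch. The function name `expected_plain_ioc` is only for illustration, and the rule itself is a rough heuristic: it would only be exact if every letter were spread evenly over the same number of homophones, which is not the case in practice.

```python
def expected_plain_ioc(cipher_ioc, n_homophones, n_letters):
    """Scale the encrypted IoC by the average number of homophones per letter."""
    return cipher_ioc * n_homophones / n_letters

# Z408 figures from the post: encrypted IoC 0.0269, 54 homophones, 23 letters in use
z408_estimate = expected_plain_ioc(0.0269, 54, 23)
print(f"Z408 expected cleartext IoC: {z408_estimate:.4f}")
```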
Same with the Z340, even more homophones, though.
It would be nice to engineer the computation process so that it decides, while computing, from the current IoC level where to continue computing..that is the problem. If there is a short Python script for that, I'd love to implement it (based on multiple IF/FOR loops, e.g. 'interrupting' or rowing back if the first loop has a bad IoC). I'd appreciate it..maybe sometime in the future.
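
One way such an 'interrupting' loop could look, as a rough sketch only and not how any existing solver works. All names (`partial_ioc`, `search`, `target`, `tol`) are made up for illustration. It relies on one fact: mapping two homophones to the same letter can only raise the IoC, never lower it. So as soon as a partial key overshoots the target IoC, the branch can be dropped and the loop rows back.

```python
from collections import Counter

def ioc(symbols):
    counts = Counter(symbols)
    n = sum(counts.values())
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def partial_ioc(cipher, key):
    # Unassigned symbols are tagged so they can never collide with a letter.
    return ioc([key[s] if s in key else ("unassigned", s) for s in cipher])

def search(cipher, letters, target=0.066, tol=0.01, key=None):
    """Yield every complete key whose decoded IoC lies within target +/- tol."""
    key = {} if key is None else key
    remaining = [s for s in dict.fromkeys(cipher) if s not in key]
    if not remaining:
        # All symbols assigned: accept only if the IoC is not too low either.
        if partial_ioc(cipher, key) >= target - tol:
            yield dict(key)
        return
    symbol = remaining[0]
    for letter in letters:
        key[symbol] = letter
        # Merging homophones can only raise the IoC, so an overshoot is final.
        if partial_ioc(cipher, key) <= target + tol:
            yield from search(cipher, letters, target, tol, key)
        del key[symbol]  # row back and try the next letter
```

For example, `list(search([1, 2, 3, 4], "AB", target=1/3, tol=0.01))` returns only the keys that split the four symbols two and two. The search is still exponential, and a matching IoC says nothing about whether the decoded text contains real words, so this could only be one filter among others.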
The problems are manifold..with a solid dictionary it is obviously possible to find 10 words of more than 4 letters (like ZDK often does). But that is not necessarily the cleartext solution. As if that wasn't bad enough, the more 'steps' are performed, the more computational effort is required. Finding three words in the cipher can be done in a second..finding more than 10-12, however, could take us into the year 3,000..
Please keep in mind that the encrypted IoC values are not 'wrong' - they just deal with a different-sized alphabet (e.g. 54...like an alien language). Of course that has to be a different situation than (cleartext) English language (with a max. of 26 letters). Most likely, a language with 54 letters actually has an IoC of less than three percent (0.03).
IoC calculator
https://planetcalc.com/7944/
IoC based on frequencies
https://pages.mtu.edu/~shene/NSF-4/Tuto ... g-IOC.html
Because of the effects above, I wouldn't rely too much on the IoC for homophone ciphers, at least not without considering the differences between encrypted and decrypted text.
Based on the previous thoughts, with 63 homophones divided by a max. of 26 letters (2.423 homophones per letter) and an English language IoC of 0.066, we can estimate whether the Z340 cleartext is English or not:
0.066 / 2.423 = 0.0272 IoC encrypted (Z340 expected).
Entering the encrypted message in the IoC calculator (considering 63 homophones!) indeed leads us to an
encrypted IoC value of 0.0277 for the Z340, which is a deviation of only 1.7% from our expectation.
Thus, within a range of <3%, the encoded message of the Z340 represents English (or similar language) cleartext. Therefore, with a probability of 98.3%, the Z340 is not a hoax either, by the way.
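
The Z340 check above can be reproduced with a few lines. This is only a sketch of the arithmetic in this post: the function names are made up, and 26 letters in use is the assumption stated above, not a known fact about the cleartext.

```python
def expected_cipher_ioc(plain_ioc, n_homophones, n_letters):
    """Divide the cleartext IoC by the average number of homophones per letter."""
    return plain_ioc * n_letters / n_homophones

def relative_deviation(measured, expected):
    """Deviation of the measured value from the expected one, as a fraction."""
    return abs(measured - expected) / expected

# Z340 figures from the post: English IoC 0.066, 63 homophones, max. 26 letters
z340_expected = expected_cipher_ioc(0.066, 63, 26)
z340_deviation = relative_deviation(0.0277, z340_expected)
print(f"expected {z340_expected:.4f}, deviation {z340_deviation:.1%}")
```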
QT