Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 3:55 am
by glurk
daikon-
Actually, I am glad you are here. The more cipher people the better; they come and go, and often they just "go" and are never seen again. I'm not mad or bitter or anything about ZKD; I just wish people better understood how to use it!
ZKD, over time, uses random restarts to find those "spikes" in the hill-climb space. I think it works, at least eventually.
What I meant to say, and should have said, is WELCOME!! Glad you are here, and the more the better as far as the ciphers!
I'm a nice person, once you get past my Ogre exterior.
-glurk
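For readers unfamiliar with the idea glurk mentions, here is a minimal sketch of hill climbing with random restarts. It is a generic illustration, not ZKD's code: the toy `LANDSCAPE`, the function names, and the restart count are all assumptions made for the example.

```python
import random

# Hypothetical toy landscape: a small bump at index 2, a narrow "spike" at 17.
LANDSCAPE = [1, 2, 3, 2, 1] + [0] * 11 + [5, 9, 5] + [0] * 11

def score(i):
    return LANDSCAPE[i]

def neighbors(i):
    # Adjacent positions that stay inside the landscape.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]

def hill_climb(score, start, neighbors):
    """Greedy ascent: move to the best neighbor until none improves."""
    current = start
    while True:
        best = max(neighbors(current), key=score, default=current)
        if score(best) <= score(current):
            return current  # local optimum (or flat region)
        current = best

def random_restart_search(score, neighbors, size, restarts, rng):
    """Run hill_climb from many random starts and keep the best result."""
    best = None
    for _ in range(restarts):
        candidate = hill_climb(score, rng.randrange(size), neighbors)
        if best is None or score(candidate) > score(best):
            best = candidate
    return best

result = random_restart_search(score, neighbors, len(LANDSCAPE), 200, random.Random(0))
```

A single climb only finds the spike if it happens to start next to it; restarting from many random positions is what makes finding it likely over time.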
Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 5:29 am
by doranchak
My ini file has: extra = *
Is that the default in the latest version (v1.2)? I don't recall changing it. Daikon, I'm not sure why mine has the wildcard and yours doesn't.
Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 5:54 am
by traveller1st
doranchak wrote:My ini file has: extra = *
Same here. I'm using 1.2
Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 9:14 am
by Jarlve
Welcome to the forum, daikon.
Thanks for your cipher! It's cyclic for the most part, correct?
I ran your cipher with AZdecrypt 0.94 on thorough and did see a 98% recovery rate (for 100 copies of your cipher). At this setting my program solves more than one cipher per second on my old i7. I also wish to dispute your statement that not all homophonic substitution ciphers can be auto-solved: from experience I strongly believe that, within reasonable limits, the opposite is true. Certainly with a strong solver like ZKDecrypto.
Does this mean the 340 is not homophonic substitution? No.
Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 1:26 pm
by daikon
Ok, here's my second attempt at creating a homophonic substitution cipher that cannot be auto-solved. Hopefully it won't be defeated as easily as the first one. :)
Code:
i76O<d6P01ZSAj6A[i8
GH2I6QK=^gR6OAL634B
]>CJ65TU96A?DG6P06k
1Vn26M:WA6A@E;Sr<XA
7Z[=FH63TU86A>BIjV_
J?iQA`]R^GkOAZPAC64
Np5[6Hm06k12DI]@KjA
<9EYJ:hA66;LA_G6k3S
n4HAgFI=er>WA7Z8M5f
ANJ69KA0GH1[?BX6:`L
^I6QAMYh2aACAd@_N6R
];DZ3KeAf6OT4<JAq6P
nALg=G5n0E6QUAmM71A
hR6OH[>NiA?8WXA6A@F
9V<jPA`]Q^IBAC6RJ:g
A66;KA_G62D6OZ7iLA[
=ME3S4n5F6P8s`9NA^H
6k0Td6QjAI]1Jr>YA:Z
The plaintext was taken from Wikipedia (with slight edits). It has an IoC of 0.0661, so no tricks there. It does make use of some less common words, but not entirely out of Zodiac's possible vocabulary. I also confirmed that if I reduce the number of unique ciphertext symbols to around 50, both ZKD and AZD do auto-solve it eventually after a long run.
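For reference, the index of coincidence (IoC) quoted above can be computed as the sum of n_i(n_i - 1) over all letters, divided by N(N - 1), where n_i is the count of letter i and N is the text length. The sketch below assumes that standard definition and ignores non-letter characters; the function name is made up for illustration. English text usually lands near 0.066, while uniformly random letters give about 1/26, or roughly 0.038.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are equal."""
    letters = [c for c in text.lower() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0  # not enough letters to form a pair
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
```

A value close to 0.066 for the plaintext suggests ordinary English letter frequencies, which is presumably what "no tricks" refers to.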
Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 3:38 pm
by Jarlve
This seems to be a good solve. It's about the constellations.
Code:
pathantheeclipticpa
ssesthroughthirteen
consteelationsthetw
elvetraditionolgadi
acconsteelationsplu
sophimchuswhichinte
rfectsbetweenscorpi
aandsagittoriustwel
vesignsofgodiacaref
irstariessecondtamr
usthirdgemininourth
concerfiftheeasifth
virgoseventhlibraei
ghthscorpioaddition
alaphimchusninthsag
ittoriustenthcapric
orneleventhadmarius
tweenthpiscesgodiac
Re: Not all homophonic substitutions can be auto-solved

Posted:
Tue Jul 07, 2015 4:24 pm
by daikon
Jarlve,
That's pretty close, congrats! The second part is still mostly unreadable, but it uses all of the actual names of zodiac signs, so that's understandable. What did it take? I'm guessing you used AZdecrypt, right? How many iterations? I tried 10,000,000,000 (i.e. 10 billion) and didn't get anything even close to your solve.
By the way, may I make a suggestion for AZdecrypt? From what I gather, you use "THE ADVENTURES OF SHERLOCK HOLMES" to build 4-gram stats and then score the plaintext based on that, correct? That probably explains why your solve above was still so far from the actual plaintext, since your corpus likely didn't use any of the zodiac sign names. You can improve the solve rate quite a bit by using more comprehensive 4-gram stats from a bigger corpus, and you can get an even bigger improvement if you switch to 5-gram stats. You can get that data from here, for example:
http://practicalcryptography.com/crypta ... equencies/
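To make the scoring idea concrete, here is a minimal sketch of n-gram log-probability scoring. It is not AZdecrypt's implementation: the function names are hypothetical, and the floor value given to unseen n-grams (log10 of 0.01 divided by the total count) is an assumption chosen for illustration.

```python
import math
from collections import Counter

def build_ngram_model(corpus, n):
    """Return (log10 probabilities per n-gram, floor for unseen n-grams)."""
    letters = "".join(c for c in corpus.upper() if c.isalpha())
    counts = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus is shorter than n letters")
    logp = {gram: math.log10(c / total) for gram, c in counts.items()}
    floor = math.log10(0.01 / total)  # assumed penalty for unseen n-grams
    return logp, floor

def score_text(text, logp, floor, n):
    """Sum of log10 probabilities of every n-gram in text; higher is better."""
    letters = "".join(c for c in text.upper() if c.isalpha())
    return sum(logp.get(letters[i:i + n], floor)
               for i in range(len(letters) - n + 1))

corpus = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG " * 3
logp, floor = build_ngram_model(corpus, 4)
english_like = score_text("THE LAZY DOG", logp, floor, 4)
gibberish = score_text("XQZ KJVW PPM", logp, floor, 4)
```

A word whose n-grams never occur in the corpus is scored entirely at the floor, which is why a corpus without zodiac sign names would pull a solver away from them.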
Re: Not all homophonic substitutions can be auto-solved

Posted:
Wed Jul 08, 2015 8:13 am
by Jarlve
Hey daikon,
I used a development version of AZdecrypt that can use up to 6-grams drawn from a new 90 megabyte corpus (Project Gutenberg books). 5-grams were used to solve your second cipher. Yes, it would probably be worthwhile to draw from an even bigger and more diverse source. 4-grams are used for all current versions of AZdecrypt out there because they provide a very good ratio between solving power and speed. I like the website Practical Cryptography a lot and have used its information since I started writing the first iteration of my solver.
I don't think the second part is mostly unreadable.
pathan the ecliptic passes through thirteen consteelations
the twelve traditionol gadiac consteelations plus ophimchus
which interfects between scorpia and sagittorius twelve signs
of godiac are first aries second tamrus third gemini nourth
concer fifth eea sifth virgo seventh libra eighth scorpio addition
alaphimchus ninth sagittorius tenth capricorn eleventh admarius
tweenth pisces godiac
Re: Not all homophonic substitutions can be auto-solved

Posted:
Wed Jul 08, 2015 6:54 pm
by daikon
Jarlve,
Fair enough, apparently I'm not too good at parsing continuous streams of letters into English words. Probably just not enough experience doing that over and over again, compared to you. :)
Using 6-grams to score solves would be awesome and I think should improve AZD even more, although I think a 90 MB corpus would be too small: the Practical Cryptography data used a 4+ GB corpus. For some reason, though, if you add up all the counts you get 4,274,127,909 for 3-grams, 4,224,127,912 for 4-grams, and 4,174,127,916 for 5-grams. Perhaps they didn't cross sentence boundaries when counting N-grams, so you get higher counts for lower Ns?
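A quick check of that hypothesis: if n-grams never cross segment boundaries, a segment of L letters contributes L - n + 1 n-grams, so going from n to n + 1 lowers the total by one for every segment that has at least n letters. The sketch below uses made-up sentences to show the effect and then applies the same subtraction to the totals quoted above; it shows the numbers are consistent with the idea, not that this is how the data was actually built.

```python
def total_ngrams(segments, n):
    """Count n-grams without letting any n-gram span two segments."""
    return sum(max(len(s) - n + 1, 0) for s in segments)

# Hypothetical segments of 16, 9, 10 and 2 letters.
segments = ["THEQUICKBROWNFOX", "JUMPSOVER", "THELAZYDOG", "HI"]
toy_totals = {n: total_ngrams(segments, n) for n in (3, 4, 5)}
# Each step from n to n+1 drops the total by 3: the segments with >= n letters.

# Totals quoted in the post above.
REPORTED = {3: 4_274_127_909, 4: 4_224_127_912, 5: 4_174_127_916}
drop_3_to_4 = REPORTED[3] - REPORTED[4]   # 49,999,997
drop_4_to_5 = REPORTED[4] - REPORTED[5]   # 49,999,996
```

Both drops are just under 50 million, which is what you would expect if the corpus had been split into roughly 50 million segments before counting.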
By the way, which algorithm are you using in AZD? Hill-climb? Simulated annealing? Genetic algorithm? Something else?
Re: Not all homophonic substitutions can be auto-solved

Posted:
Thu Jul 09, 2015 12:07 pm
by Jarlve
daikon,
This is a new solve using 5-grams from Practical Cryptography.
path of the ecliptic passes through thirteen constellations
the twelve traditional bodiac constellations plus ophimchus
which interfects between scorpio and sagittarius twelve signs
of bodiac are first aries second tamrus third gemini fourth
cancer fifth leo sinth virgo seventh libra eighth scorpio addition
alophimchus ninth sagittarius tenth capricorn eleventh atmarius
twelfth pisces bodiac
It's certainly better, and on first impression the cipher recovery rate has improved as well. Your tip has panned out, thank you. The program does lose the flexibility to manipulate n-grams at start-up, though, for instance by slightly randomizing or removing characters from the corpus, which has proven valuable for smokie_treats's wildcard hypothesis.
I don't want to go too much into program details, but it's very similar to simulated annealing (and performs about the same) and it was something I came up with myself before I even learned of SA. In general my program doesn't have much intelligence and relies on its speed.
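For comparison, here is a minimal sketch of textbook simulated annealing, the technique the solver is said to resemble. It is not AZdecrypt's algorithm: the geometric cooling schedule, the parameter values, and the toy score function are all assumptions made for the example.

```python
import math
import random

def accept(delta, temperature, rng):
    """Metropolis rule: always take improvements, sometimes take losses."""
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)

def anneal(score, start, propose, rng, steps=10_000, t_start=10.0, t_end=0.01):
    """Maximize score while cooling geometrically from t_start to t_end."""
    current = best = start
    for step in range(steps):
        frac = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** frac
        candidate = propose(current, rng)
        if accept(score(candidate) - score(current), temperature, rng):
            current = candidate
            if score(current) > score(best):
                best = current
    return best

# Hypothetical toy problem: find the integer that maximizes -(x - 7)^2.
def toy_score(x):
    return -(x - 7) ** 2

def toy_propose(x, rng):
    return x + rng.choice((-1, 1))

found = anneal(toy_score, 0, toy_propose, random.Random(1))
```

Unlike plain hill climbing, the occasional acceptance of a worse candidate lets the search leave a local optimum without a full restart. In a cipher solver the state would be a key and `propose` would change one symbol's assignment.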