
Re: Homophonic substitution

Posted: Sat Aug 08, 2015 7:46 am
by smokie treats
You are getting me interested in the bigrams. After thinking about what daikon said, maybe Zodiac didn't just plop down random wildcards.

Here is a scenario that can be tested (or maybe already has been):

1. Zodiac created a cipher key like the one for the 408, with cycles.

2. He encoded his message.

3. Then he visually checked for repeating bigrams and circled them.

4. Then, with his little menu of wildcards, he worked his way down the message from top to bottom, replacing one of the two symbols in each repeated bigram by marking over one of the existing symbols with a wildcard.

5. On a separate piece of paper, he kept a list of the two newly created bigrams, a sequence of three numbers (e.g. 6 19 36). Each time he added a wildcard, he checked above on the list to make sure that he wasn't repeating any bigrams that he had just created in the message above. If so, then he just switched to the next wildcard on his menu. EDIT: The primary wildcard was +. When adding a + created a new repeating bigram, then he switched to the B or the F. In the beginning of the masking process, using a new wildcard was easy because he had not used that number before. The longer the list for each wildcard, the more difficult it became.

6. Then he made a second draft, with the wildcards, which is the one that we are so familiar with.

Something like that. Maybe he did know what a repeating bigram was. He knew about frequency analysis.
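Steps 3 to 5 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the message is a flat list of numbers, the wildcards are tried in a fixed menu order, and the second symbol of each repeated bigram is the one overwritten (the scenario above does not say which of the two gets replaced). The function name `mask_repeats` is made up for this sketch.

```python
def mask_repeats(cipher, wildcards):
    """Mask repeated bigrams, working through the message from start to end.

    Assumption: the second symbol of a repeated bigram is overwritten with the
    first wildcard on the menu that does not recreate an existing bigram.
    """
    msg = list(cipher)
    for i in range(1, len(msg)):
        # Bigrams that lie entirely before position i.
        seen = set(zip(msg[:i - 1], msg[1:i]))
        if (msg[i - 1], msg[i]) not in seen:
            continue  # not a repeat, nothing to mask
        # Every bigram in the message that does not touch position i.
        others = {
            (msg[j], msg[j + 1])
            for j in range(len(msg) - 1)
            if j != i - 1 and j != i
        }
        for w in wildcards:
            # The two bigrams that writing wildcard w at position i would create.
            created = [(msg[i - 1], w)]
            if i + 1 < len(msg):
                created.append((w, msg[i + 1]))
            if not any(b in others for b in created):
                msg[i] = w
                break
        # If no wildcard fits, the repeat is left in place.
    return msg


masked = mask_repeats([1, 2, 5, 1, 2, 6, 1, 2, 6], wildcards=[90, 91])
print(masked)
```

With a menu such as `[37, 49, 51, 59]`, the first wildcard ends up being used most, and the later ones only come into play when the first would recreate a bigram that is already in the message.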

EDIT: I tried it and came up with a message that is more cyclic than the 340, but has far fewer repeating bigrams than the 340 (only 20). There are four wildcards, with counts 21, 9, 8 and 5 for a total of 43 symbols. It won't solve. If anyone is interested, I highly encourage them to try the process for themselves to see how Zodiac would have done such a thing. I used my computer to identify the repeating bigrams so that I didn't have to do it visually. If you want to see the cipher key, message grid, and statistics, let me know. I don't want to post it without permission because it's not in the suite.

Also note that in doranchak's list of repeating bigrams, there are a lot of +'s, B's and F's. That's because Zodiac wasn't perfect when tracking his masking efforts. Those are mistakes, or just the product of masking bigrams that happened to sit next to each other, or new repeats that he wasn't aware he was creating.

See: http://zodiackillerciphers.com/wiki/ind ... ength:_2_2

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 12:24 pm
by daikon
smokie treats wrote:5. On a separate piece of paper, he kept a list of the two newly created bigrams, a sequence of three numbers (e.g. 6 19 36). Each time he added a wildcard, he checked above on the list to make sure that he wasn't repeating any bigrams that he had just created in the message above. If so, then he just switched to the next wildcard on his menu. EDIT: The primary wildcard was +. When adding a + created a new repeating bigram, then he switched to the B or the F.


Hmm, this will definitely work! You came up with a way where using wildcards will destroy bigram repeats, instead of creating new ones. I forgot that you can cycle wildcards, just like homophones. I think if Z was indeed using wildcards, this is much more likely the way he did it, as it would result in the lower number of bigram repeats we are seeing in Z340. And you could probably even check whether the symbols that interrupt homophone cycles (i.e. the suspected wildcards) form a cycle of their own.

It still leaves the issue of ambiguities in the decoded text due to too many wildcards, but Z could have been crazy enough not to care about that.

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 12:39 pm
by Jarlve
smokie treats wrote:If you want to see the cipher key, message grid, and statistics, let me know. I don't want to post it without permission because it's not in the suite.

That's not a problem at all. Please share the cipher and I'll add it to the main post.

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 1:15 pm
by smokie treats
O.k.

Here is the key:

A 1 26 42 52
B 2
C 3 27
D 4 28
E 5 29 43 53 60 63
F 6
G 7 30
H 8 31 44 54
I 9 32 45 55 61
J 10
K 11
L 12 33 46
M 13
N 14 34 47 56
O 15 35 48 57 62
P 16 36
Q
R 17
S 18 38 50 58
T 19 39
U 20 40
V 21
W 22 41
X 23
Y 24
Z 25
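
For anyone who wants to reproduce the encoding step, here is a minimal Python sketch. It assumes strict round-robin cycling through each letter's homophones, with spaces and punctuation dropped; the `encode` function and the lowercase `KEY` dictionary are names made up for this sketch. Q is left out because it has no symbols in the key. Under that assumption the output matches the first two rows of the message below.

```python
from itertools import cycle

# The key above, one list of homophones per letter (Q has none).
KEY = {
    "a": [1, 26, 42, 52], "b": [2], "c": [3, 27], "d": [4, 28],
    "e": [5, 29, 43, 53, 60, 63], "f": [6], "g": [7, 30],
    "h": [8, 31, 44, 54], "i": [9, 32, 45, 55, 61], "j": [10],
    "k": [11], "l": [12, 33, 46], "m": [13], "n": [14, 34, 47, 56],
    "o": [15, 35, 48, 57, 62], "p": [16, 36], "r": [17],
    "s": [18, 38, 50, 58], "t": [19, 39], "u": [20, 40], "v": [21],
    "w": [22, 41], "x": [23], "y": [24], "z": [25],
}


def encode(plaintext, key):
    """Encode with perfect cycles: each letter steps through its homophones in order."""
    cycles = {letter: cycle(symbols) for letter, symbols in key.items()}
    # Characters without an entry in the key (spaces, punctuation) are skipped.
    return [next(cycles[ch]) for ch in plaintext.lower() if ch in cycles]


rows = encode("purple haze all in my brain lately things", KEY)
print(rows[:17])
print(rows[17:])
```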

Here is the original message, before adding wildcards to mask the bigram repeats:

16 20 17 36 12 5 8 1 25 29 26 33 46 9 14 13 24
2 17 42 32 34 12 52 19 43 33 24 39 31 45 47 7 18
4 15 56 19 38 53 60 13 39 44 63 50 1 13 5 26 3
19 55 14 6 40 34 47 24 2 20 39 61 28 35 56 19 11
14 48 22 41 54 24 58 27 40 18 29 13 43 22 8 9 46
53 32 11 45 38 50 39 31 60 58 11 24 16 20 17 36 12
63 44 42 25 5 52 33 46 1 17 57 40 34 4 28 62 47
19 11 56 15 41 55 6 61 13 3 35 13 9 14 30 20 16
48 17 4 57 22 34 26 13 32 54 42 36 16 24 62 17 45
47 13 55 18 29 17 24 41 8 52 39 43 21 53 17 61 19
9 38 39 31 1 19 7 32 17 12 36 40 39 26 50 16 60
33 46 15 56 13 63 44 5 12 36 13 29 54 43 33 16 13
53 35 8 14 48 57 31 24 60 42 44 36 20 17 16 46 63
54 52 25 5 1 12 33 45 34 13 24 29 24 43 58 28 62
47 19 11 56 15 22 55 6 61 39 18 4 26 24 35 17 14
9 30 8 19 24 48 40 21 53 7 57 39 13 60 2 46 62
41 32 34 30 2 12 15 22 45 47 7 13 24 13 55 56 28
61 38 9 19 39 35 13 48 17 17 57 41 62 17 10 20 50
19 39 31 63 5 14 4 15 6 19 32 13 29 44 43 33 36
13 53 24 60 42 54 16 40 17 36 46 63 8 52 25 5 23

Then I performed the masking process, using 37, 49, 51 and 59 as wildcards to get the final message (the worksheet with 43 or 44 rows of three numbers each, most of which have the wildcard in the middle, is handwritten on a piece of paper):

16 20 17 36 12 5 8 1 25 29 26 33 46 9 14 13 24
2 17 42 32 49 12 52 19 43 33 24 39 31 45 47 7 18
37 15 56 19 38 53 60 13 39 44 63 50 1 13 5 26 3
19 55 14 6 40 34 47 24 2 20 39 61 28 35 56 59 11
14 37 22 41 54 24 58 27 40 18 29 13 43 22 8 9 46
53 32 11 45 38 50 51 31 60 58 11 24 37 49 17 37 12
63 44 42 25 5 52 49 46 1 37 57 40 51 4 28 62 47
19 37 56 15 41 55 37 61 13 3 35 13 37 14 30 20 16
48 17 4 57 22 34 26 13 32 54 42 36 16 24 62 17 45
47 37 55 49 29 17 24 41 37 52 39 43 21 53 17 61 19
9 38 39 51 1 19 7 32 17 12 36 40 39 26 50 16 60
33 49 49 56 13 63 44 5 51 36 13 29 54 43 51 16 13
53 35 8 14 48 57 31 24 59 42 44 36 20 49 16 46 63
54 52 25 37 1 12 33 45 34 13 37 29 24 43 58 28 37
47 49 11 56 59 22 55 6 61 39 18 4 26 24 35 17 14
9 30 8 19 24 48 40 21 37 7 57 39 13 60 2 46 62
41 32 34 30 2 12 15 22 45 51 7 13 37 13 55 56 28
61 38 9 19 49 35 49 48 51 17 57 41 62 59 10 20 50
19 39 31 63 5 14 4 15 6 19 32 13 37 44 43 33 36
13 53 37 60 42 54 16 40 37 36 46 51 8 52 37 5 23

Here is the solution:

p u r p l e h a z e a l l i n m y
b r a i n l a t e l y t h i n g s
d o n t s e e m t h e s a m e a c
t i n f u n n y b u t I d o n t k
n o w w h y s c u s e m e w h i l
e I k i s s t h e s k y p u r p l
e h a z e a l l a r o u n d d o n
t k n o w i f I m c o m i n g u p
o r d o w n a m I h a p p y o r i
n m i s e r y w h a t e v e r i t
i s t h a t g i r l p u t a s p e
l l o n m e h e l p m e h e l p m
e o h n o o h y e a h p u r p l e
h a z e a l l i n m y e y e s d o
n t k n o w i f i t s d a y o r n
i g h t y o u v e g o t m e b l o
w i n g b l o w i n g m y m i n d
i s i t t o m o r r o w o r j u s
t t h e e n d o f t i m e h e l p
m e y e a h p u r p l e h a z e x

Here are the cycle stats. It actually has more, and longer, consecutive alternating two-symbol cycles than the 340. But there was no random symbol selection; it started out with perfect cycles. Compare to column B.
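
One simple way to measure a two-symbol cycle, sketched in Python. This is an assumption about what such a statistic could look like, not necessarily the measure used for the attached stats: keep only the occurrences of the two symbols, then find the longest stretch in which they strictly alternate. The name `longest_alternation` is made up for this sketch.

```python
def longest_alternation(seq, a, b):
    """Length of the longest strictly alternating run of symbols a and b in seq."""
    # Keep only the occurrences of the two symbols, in message order.
    sub = [s for s in seq if s in (a, b)]
    if not sub:
        return 0
    best = run = 1
    for prev, cur in zip(sub, sub[1:]):
        # The run continues while the symbol changes, and restarts on a repeat.
        run = run + 1 if cur != prev else 1
        best = max(best, run)
    return best


sample = [16, 20, 17, 36, 12, 5, 16, 36, 36]
print(longest_alternation(sample, 16, 36))
```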

PH.cycle.stats.png


Here are the repeating bigrams before the masking process:

PH.orig.bigraphs.png


Here are the repeating bigrams after the masking process:

PHbigrams.png


Note that I do have some repeats with wildcards 37 and 59. That happened because I wasn't trying to be a perfectionist. I didn't do that on purpose.
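
To see which repeats involve a wildcard, something like the following Python sketch can be used. The function names are made up for this sketch, and the sample list is a short invented sequence, not the full message.

```python
from collections import Counter


def repeated_bigrams(seq):
    """Map each bigram that occurs more than once to its number of occurrences."""
    counts = Counter(zip(seq, seq[1:]))
    return {bg: n for bg, n in counts.items() if n > 1}


def wildcard_repeats(seq, wildcards):
    """Repeated bigrams in which at least one of the two symbols is a wildcard."""
    wild = set(wildcards)
    return {bg: n for bg, n in repeated_bigrams(seq).items() if wild & set(bg)}


# Invented sample: the bigram (13, 37) repeats and contains wildcard 37.
sample = [13, 37, 14, 1, 2, 9, 1, 2, 13, 37, 29]
print(repeated_bigrams(sample))
print(wildcard_repeats(sample, [37, 49, 51, 59]))
```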

Check it out. Maybe you guys can solve it. ZKD got stalled up at about 30k for me, but you guys are a lot better at solving these things than I am. If you are interested, try the process to get a feel for what it would be like. It took me about an hour.

Smokie

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 1:28 pm
by smokie treats
Here is part of my masking worksheet:

PH.worksheet.png


It's tiny but does the job. I'm not an IT guy.

Smokie

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 2:04 pm
by daikon
smokie treats wrote:Check it out. Maybe you guys can solve it. ZKD got stalled up at about 30k for me, but you guys are a lot better at solving these things than I am. If you are interested, try the process to get a feel for what it would be like. It took me about an hour.


Yes, I can confirm that I couldn't solve the wildcards version of this cipher. The original cipher (no wildcards) gets solved pretty easily, even though it is song lyrics, so it's somewhat different from a normal English text, and it also has pretty rare words (haze) and a couple of contractions (actin' and 'scuse). But I couldn't even see any word fragments in the wildcards version. You definitely did a really good job making it unsolvable. I also compared different stats of your cipher with Z340, and it is pretty close. You might've overdone the bigram repeats reduction a bit, as you only have half as many as Z340, but since Z didn't use a computer, he could've simply missed many bigram repeats, or just got tired and thought it was enough.

All in all, I think this is a very plausible method of encryption that results in a cipher with very similar stats to Z340, and that can't be solved (just like Z340). Now we need to find a way to crack it. Did you use perfect cycles when assigning wildcards? Or was it done mostly randomly? I'm just wondering whether we can detect the wildcard cycling somehow.

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 2:23 pm
by daikon
Here's the plaintext with the wildcards/blanks. It is still mostly readable:
PURPLEHAZEALLINMYBRAI_LATELYTHINGS_ONTSEEMTHESAMEACTINFUNNYBUTDONT_NO_WHYSCUSEMEWHILEKISSTH_SKY
PU__L_HAZEALL_RO_ND_ONTKN_WIFM_OMING_PORDOWNAMHAPPYORINMISE_Y_HATE_ERITISTHATG_RLPUTASPELLONM_
_ELPME_ELPME_HNOOHYEAHP_RPLE_AZEALL_NMYEYE_DONTK_O_IF_TSDAYORNIGHTYOUVEGOT_EBLOWINGBLOWING
MY_IN_ISITTOMO_R_W_RJUS_THEENDOFTIMEHEL_MEYEAH_URPLE_AZ_X
Wildcards do make the intended meaning of some sections nearly impossible to discern, but Z might not have cared about that...
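
A blanked-out plaintext of this kind can be produced with a short Python sketch like the one below. It assumes the key posted earlier in the thread and treats any symbol not in the key as a wildcard; `decode_with_blanks` and `KEY` are names made up for this sketch. The example decodes only the second row of the wildcard version of the message.

```python
# The key from earlier in the thread (Q has no symbols).
KEY = {
    "a": [1, 26, 42, 52], "b": [2], "c": [3, 27], "d": [4, 28],
    "e": [5, 29, 43, 53, 60, 63], "f": [6], "g": [7, 30],
    "h": [8, 31, 44, 54], "i": [9, 32, 45, 55, 61], "j": [10],
    "k": [11], "l": [12, 33, 46], "m": [13], "n": [14, 34, 47, 56],
    "o": [15, 35, 48, 57, 62], "p": [16, 36], "r": [17],
    "s": [18, 38, 50, 58], "t": [19, 39], "u": [20, 40], "v": [21],
    "w": [22, 41], "x": [23], "y": [24], "z": [25],
}


def decode_with_blanks(cipher, key, blank="_"):
    """Decode known symbols; anything not in the key (a wildcard) becomes a blank."""
    inverse = {sym: letter for letter, syms in key.items() for sym in syms}
    return "".join(inverse.get(sym, blank) for sym in cipher).upper()


# Second row of the wildcard version; 49 is one of the wildcards.
row2 = [2, 17, 42, 32, 49, 12, 52, 19, 43, 33, 24, 39, 31, 45, 47, 7, 18]
print(decode_with_blanks(row2, KEY))
```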

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 2:25 pm
by Jarlve
Interesting cipher smokie. I have included it with the main post.

There are about 50% fewer bigrams (period 1) in the horizontal direction than in the other directions (vertical, diagonal). For higher periods (2, 3, 4, etc.) there are more bigrams in the horizontal direction. This does not correlate well with the 340 as a whole, but it does correlate highly with the last 10 rows of the 340!
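
For readers who want to try this kind of comparison, here is a rough Python sketch. The counting convention (each extra occurrence of a bigram counts as one repeat) and the way the vertical direction is read (column by column on a 17-wide grid) are assumptions made for this sketch and may differ from the figures quoted above; the function names are made up too.

```python
from collections import Counter


def repeat_count(seq, period=1):
    """Count repeats of period-p bigrams, i.e. pairs (seq[i], seq[i + period])."""
    # Flat reading: pairs that wrap from one row into the next are included.
    counts = Counter(zip(seq, seq[period:]))
    return sum(n - 1 for n in counts.values() if n > 1)


def columnwise(seq, width):
    """Re-read a row-major grid of the given width column by column."""
    return [seq[i] for col in range(width) for i in range(col, len(seq), width)]


grid = [1, 2, 3,
        1, 2, 3]
print(repeat_count(grid, period=1))                 # horizontal, period 1
print(repeat_count(grid, period=2))                 # horizontal, period 2
print(repeat_count(columnwise(grid, 3), period=1))  # vertical, period 1
```

For the message in this thread the grid width would be 17.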

It's also interesting to see how much the application of wildcards reduces the patterns found in ZKDecrypto. It is like going from the 408 to the 340.

I can't even solve your "unwildcarded" cipher with AZdecrypt. It probably suffers from the same problem as daikon3 (word entropy, the same words repeating over and over again). This troubles me a bit.

Question: I see high-count symbols in the normal version. Are these 1:1 substitutes? If so, how many are there and what are their counts? (This is information I'd like to add to the main post.)

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 2:28 pm
by Jarlve
daikon wrote:The original cipher (no wildcards) gets solved pretty easily, even though it is song lyrics, so it's somewhat different from a normal English text, and it also has pretty rare words (haze) and a couple of contractions (actin' and 'scuse).

With ZKDecrypto or your own solver? I can't seem to solve it with AZdecrypt096.

Re: Homophonic substitution

Posted: Sat Aug 08, 2015 2:38 pm
by smokie treats
I started with symbol 37 and kept working with that for a while until masking a bigram with 37 was no longer easy. Then I switched to 49, then to 51, then to 59. The first several uses of each wildcard symbol are really easy. Most of it is pretty easy. I need to take a closer look at doranchak's bigram list, but it does include several +'s, B's and F's. I am wondering if this is due to some kind of flaw or laziness in the masking process, causing some of the wildcards to be next to each other in the message. Thanks for checking this out for me.