Z340 Kasiski Examination

Re: Z340 Kasiski Examination

Postby smokie treats » Thu Jun 09, 2016 6:54 am

I was playing around and made a heatmap showing the involvement by position (left). For example, at position 1 there is only one period, out of all the periods tested, at which there is a repeat (marked with the blue box, right). Symbol 19, the +, scores higher because for any particular position the spreadsheet is just counting all of the period x unigram repeats involving symbol 19.

coincidence.counting.12.png

If I shuffle the message, symbol 19 always scores high.

coincidence.counting.13.png

My spreadsheet slides the x position through the message for x = 1 to 170. You can see that this causes the second half of the message to score lower than the first half, because I did not "wrap" to the top of the message when sliding the x in the bottom half of the message. Is that what I am supposed to do? I flipped and mirrored the 340 and ran it through the same spreadsheet. EDIT: I found another spike at 78, this time counting more of the bottom half of the message. So I am wondering: should I always test both the regular message and the flipped and mirrored one, and add the counts together for better period detection? If the bottom half and top half are consistent, does this mean anything?

coincidence.counting.14.png
smokie treats
 
Posts: 1620
Joined: Thu Feb 19, 2015 1:34 pm
Location: Lawrence, Kansas

Re: Z340 Kasiski Examination

Postby doranchak » Thu Jun 09, 2016 9:57 am

Smokie, thanks for posting your findings - I will give them a closer look soon.

Here's a visualization of the "shift 78" Kasiski examination peak done by untransposing at period 78 and highlighting the doubles that appear:
Screen Shot 2016-06-09 at 10.41.03 AM.png


I only counted 18 doubles there. BartW, your code counts 19, I think because it thinks there's an extra double "+" somewhere. Is 18 the correct count or did I miss something?

Here's the same thing with corresponding pivot positions highlighted:
Screen Shot 2016-06-09 at 10.56.01 AM.png
doranchak
 
Posts: 2360
Joined: Thu Mar 28, 2013 5:26 am

Re: Z340 Kasiski Examination

Postby BartW » Thu Jun 09, 2016 6:27 pm

Sorry for the quietness. I have been tied up with work, and I really need some time to understand everyone's maths and analysis.

doranchak wrote:I only counted 18 doubles there. BartW, your code counts 19, I think because it thinks there's an extra double "+" somewhere. Is 18 the correct count or did I miss something?

David, there is a wraparound pair which I think you are missing (Pair 19 @, 281,19).
EDIT: I start counting at location 0, so that is the 282nd char and the 20th char.
note that my code doesn't short the 19 wrap around char due to a bug/laziness :)
Code:
Shift = 078
................d
.................
........G....#...
............+.R..
............+....
p........d.....D.
.......2.c.......
.G....#......+...
.....+.R.F.....4.
.....+....p......
....5..|D........
2.c......T.......
......+..........
..F.....4.....t+.
..............5..
|................
..T......+.......
.................
.......t.........
.................
Pair 1 @, 16,94
Pair 2 @, 42,120
Pair 3 @, 47,125
Pair 4 @, 63,141
Pair 5 @, 65,143
Pair 6 @, 80,158
Pair 7 @, 85,163
Pair 8 @, 100,178
Pair 9 @, 109,187
Pair 10 @, 111,189
Pair 11 @, 132,210
Pair 12 @, 145,223
Pair 13 @, 151,229
Pair 14 @, 158,236
Pair 15 @, 174,252
Pair 16 @, 177,255
Pair 17 @, 196,274
Pair 18 @, 235,313
Pair 19 @, 281,19

http://codepad.org/YUzrcWgq#output
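
For anyone who wants to reproduce a pair list like the one above, here is a minimal Python sketch of the count, with and without wraparound. It is not the code behind the codepad link; the function name `shift_repeats` and the `wrap` flag are made up for illustration, and the toy string only stands in for the 340.

```python
def shift_repeats(symbols, shift, wrap=False):
    """Return (i, j) pairs where symbols[i] == symbols[j] and j is `shift` positions after i.

    Positions are 0-based, like the pair list above. With wrap=True the
    comparison continues past the end of the message back to the start.
    """
    n = len(symbols)
    # Without wraparound, stop once i + shift would run off the end.
    last = n if wrap else n - shift
    pairs = []
    for i in range(last):
        j = (i + shift) % n
        if symbols[i] == symbols[j]:
            pairs.append((i, j))
    return pairs


toy = "AxyAzwqBxv"  # hypothetical 10-symbol message
print(shift_repeats(toy, 3))             # no wraparound
print(shift_repeats(toy, 3, wrap=True))  # wraparound adds the pair (8, 1)
```

On the 340 with shift 78, a wraparound pair starting at position 281 lands on position 19 because (281 + 78) mod 340 = 19, which matches Pair 19 in the list.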
Last edited by BartW on Fri Jun 10, 2016 3:27 am, edited 2 times in total.
BartW
 
Posts: 54
Joined: Thu May 12, 2016 7:59 pm

Re: Z340 Kasiski Examination

Postby smokie treats » Thu Jun 09, 2016 7:33 pm

smokie treats wrote:My spreadsheet slides the x position through the message for x = 1 to 170. You can see that this causes the second half of the message to score lower than the first half, because I did not "wrap" to the top of the message when sliding the x in the bottom half of the message. Is that what I am supposed to do? I flipped and mirrored the 340, using the same spreadsheet, EDIT: I found another spike at 78, but counting more of the bottom half of the message. So I am wondering should I always test the regular message, and the flipped mirrored, and add the counts together for better period detection? If the bottom half and top half are consistent, does this mean anything?


Never mind. I moved through positions 1 through 170, and made x = 1 to 170. The heatmap only shows the lowest position counts, not the highest position counts. I could make one that does both, but it doesn't really matter. The high count symbols show up with higher period x unigram counts, no matter what you do. I don't think that the heatmap tells us much.

There are 15 symbols involved, and I wanted to find out if there is any relationship between the period 78 unigram repeats and the cycles. So I made a little table. On the top and left are the symbols, and in the boxes the count of consecutive alternations.

coincidence.counting.15.png

Symbols 3 and 36 have 9 consecutive alternations, and a few of the symbol positions are shared with the period 78 unigram repeats. But I don't see anything particularly interesting here.

coincidence.counting.16.png


Doranchak, I think that Bart found this one, marked in bold outline. It is wraparound. I found it too, but I honestly don't know if wraparound is appropriate because the number of positions would have to be a multiple of the period I think.

coincidence.counting.17.png
smokie treats
 
Posts: 1620
Joined: Thu Feb 19, 2015 1:34 pm
Location: Lawrence, Kansas

Re: Z340 Kasiski Examination

Postby BartW » Fri Jun 10, 2016 3:23 am

Hi Smokie... Silly question time...
What is your definition of a "shuffle"? You and David mention it often, but I am assuming the following:
the PT is made of random chars, which are then randomly enciphered with a 63-symbol homophonic key. Is this correct?

In the following
score = ln ( 1 / ( ( ( symbol count / 340 ) * ( symbol count / 340 ) ) ^ number of repeats ) )
I assume Symbol count = 63?
I know natural log etc but what is the relevance to the equation?
Natural log (squared (chance) to the power of instances)

Regards
Bart
BartW
 
Posts: 54
Joined: Thu May 12, 2016 7:59 pm

Re: Z340 Kasiski Examination

Postby smokie treats » Fri Jun 10, 2016 6:58 am

BartW wrote:Hi Smokie... Silly question time...
What is your definition of a "shuffle" you and David mention it often but i am assume the following
PT is based on Random chars which are then keyed with 63 symbol homophonic randomly is this correct?


We just randomly re-organize the 340, or scramble it. I re-draft the message into a grid of 340 columns x 1 row, generate a random number for each ciphertext symbol, sort by the random number, and then re-draft again into a 17 x 20 grid.
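
The spreadsheet procedure described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the names `shuffle_by_random_keys` and `to_grid` are hypothetical, and the placeholder symbols are integers rather than the real ciphertext.

```python
import random


def shuffle_by_random_keys(symbols, rng=None):
    """Shuffle by attaching a random number to each symbol and sorting on it."""
    rng = rng or random.Random()
    keyed = [(rng.random(), s) for s in symbols]
    keyed.sort(key=lambda pair: pair[0])  # sort on the random key only
    return [s for _, s in keyed]


def to_grid(symbols, width=17):
    """Re-draft a flat list of symbols into rows of `width` columns."""
    return [symbols[i:i + width] for i in range(0, len(symbols), width)]


message = list(range(340))  # placeholder for the 340 ciphertext symbols
shuffled = shuffle_by_random_keys(message, random.Random(1))
grid = to_grid(shuffled)    # 20 rows of 17 symbols
```

The shuffle keeps every symbol count the same and only destroys the ordering, which is why it works as a baseline for position-dependent patterns.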

I think that we use a shuffle test before anything else because it is very quick and easy to do, and we want to determine whether we should take a closer look at any particular phenomenon. The combination Vigenere homophonic message that I made for you had spikes at increments of 12. How many times would you have to shuffle that message to get such spikes at any increment? Probably a lot, so if you didn't actually know what kind of a message it was you would think that the spikes may be evidence of the cipher and need to be further investigated. Doranchak shuffled the 340 one million times and there were only 2,782 spikes ( 0.28% ) at 18 or higher, so we are taking a closer look. Personally I think that the shuffle test has only so much value. But I think that it is a tool of economy. People have tried so many different theories on the 340 and none of them have so far worked. I think that doranchak is using the shuffle test to start with because he wants to use his time as efficiently as possible.
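
A shuffle test of this kind can be sketched as a small Monte Carlo loop. This is a rough sketch, not doranchak's actual program: it assumes a single fixed shift and no wraparound, it uses `random.shuffle` instead of the sort-by-random-number method, and the names `count_repeats` and `shuffle_test` are invented here.

```python
import random


def count_repeats(symbols, shift):
    """Count positions i where the symbol `shift` places later is the same (no wraparound)."""
    return sum(1 for i in range(len(symbols) - shift)
               if symbols[i] == symbols[i + shift])


def shuffle_test(symbols, shift, threshold, trials=1000, seed=0):
    """Fraction of random shuffles whose repeat count reaches `threshold`."""
    rng = random.Random(seed)
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if count_repeats(pool, shift) >= threshold:
            hits += 1
    return hits / trials


# The figure quoted above: 2,782 hits in one million shuffles.
print(f"{2782 / 1_000_000:.2%}")  # 0.28%
```

With the real ciphertext as `symbols`, something like `shuffle_test(symbols, 78, 18, trials=1_000_000)` would give a comparable estimate, though the exact number depends on whether wraparound pairs are counted.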

BartW wrote:In the following
score = ln ( 1 / ( ( ( symbol count / 340 ) * ( symbol count / 340 ) ) ^ number of repeats ) )
I assume Symbol count = 63?
I know natural log etc but what is the relevance to the equation?
Natural log (squared (chance) to the power of instances)


Symbol count is the count of a particular symbol. There are 24 of the + symbol (my symbol 19), and four repeats: four positions where a + symbol occurs 78 positions away from another + symbol. So the score is 21.25. But I shuffled the 340 for a while and found that with the + symbol it is very easy to get four or more repeats. One possible explanation for the spike is that it is created by the + symbol repeats alone. Without them, you wouldn't have a spike. Maybe homophonic encoding just caused some of the + symbols to align themselves at intervals of 78 positions, and that is what you detected. I just shuffled the 340 six times and got four repeats for the + symbol at x = 10. The spike there is 17, but that is easier to achieve than 18.
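
As a quick check of the formula, here is the score computed in Python for the + symbol. The function name `repeat_score` is hypothetical; the formula is the one quoted above, read with the symbol count squared and then raised to the number of repeats.

```python
import math


def repeat_score(symbol_count, repeats, length=340):
    """score = ln(1 / ((symbol_count / length)^2)^repeats)"""
    chance = (symbol_count / length) ** 2  # chance that both ends of a pair are this symbol
    return math.log(1 / chance ** repeats)


print(round(repeat_score(24, 4), 2))
```

Taken literally with 24 symbols and four repeats, this comes out at about 21.2, close to the figure quoted above; the small difference may come from rounding in the spreadsheet. Rarer symbols and more repeats both push the score up, so a common symbol like + needs many repeats before it stands out.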

I am going to keep working on this for a while. I am thinking of some ideas to explain the period 78 unigram repeats and the period 19 bigram repeats as if they are both produced by the same combination cipher. I score the period 19 bigram repeats similarly, but you cannot shuffle the message and get a distribution of similar scores like you can with the period 78 unigram repeats.

Question for you: what about spikes at near but not perfect increments, say 20, 41, 60, 81 and 100? Can they still be considered clues that a message is Vigenere and has a key length of 20, even if the increments are not perfect?
smokie treats
 
Posts: 1620
Joined: Thu Feb 19, 2015 1:34 pm
Location: Lawrence, Kansas

Re: Z340 Kasiski Examination

Postby doranchak » Fri Jun 10, 2016 7:34 am

I would add to smokie's comments that the shuffle test is best at identifying which phenomena really are happening purely by chance (for example, the Jazzerman patterns). It's such an easy test to implement, so it's a bit of "low hanging fruit". Where things get difficult, however, is making conclusions when the test suggests phenomena are rare. For example, you could say that the string "HER>pl^VPk" is rare because it seldom or never comes up in shuffles. But the number of similarly rare strings is astronomical. So the fact that we saw one rare example doesn't mean much. Here you have a case of a specific very rare pattern representing a very common category. So when we say, "the pivots are very rare", could they possibly belong to a very common category of similarly interesting patterns that appear in shuffles but we just aren't looking for them?

Because of that difficulty, I've moved away from making overly conclusive statements about things we observe in the Z340. To me it is better to collect all the interesting observations and compare them to examples from known encipherment schemes. And the observations themselves help guide the search for potential encipherment schemes (hopefully).
doranchak
 
Posts: 2360
Joined: Thu Mar 28, 2013 5:26 am

Re: Z340 Kasiski Examination

Postby doranchak » Fri Jun 10, 2016 8:04 am

BartW wrote:David there is a wrap over which i think you are missing (Pair 19 @, 281,19)


Ah - thanks for pointing it out. Here is another attempt to visualize where they appear.
Screen Shot 2016-06-10 at 9.01.15 AM.png

To get from one symbol to its repeat, we can use the rule "move left 7 positions, then down 5 positions".
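
The rule follows from the arithmetic: at width 17, five rows down is 85 positions, and 85 - 7 = 78. A small Python sketch of the conversion, with a made-up function name `grid_offset`:

```python
def grid_offset(shift, width):
    """Express a linear shift as (rows down, columns right) on a grid of `width` columns.

    A negative column value means "move left". The smaller sideways move is preferred.
    """
    rows, cols = divmod(shift, width)
    if cols > width // 2:
        # Going one extra row down and stepping left is the shorter sideways move.
        rows += 1
        cols -= width
    return rows, cols


print(grid_offset(78, 17))  # (5, -7): down 5, left 7
print(grid_offset(78, 26))  # (3, 0): straight down 3 rows
```

For symbols in the first 7 columns the "left 7" step runs off the row, so there the same shift shows up as down 4 and right 10. At width 26 the shift is exactly 3 rows, since 78 = 3 x 26.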
doranchak
 
Posts: 2360
Joined: Thu Mar 28, 2013 5:26 am

Re: Z340 Kasiski Examination

Postby doranchak » Fri Jun 10, 2016 8:41 am

Things line up better when you cast the cipher at width 26 (similar to what smokie did with his spreadsheets):
Screen Shot 2016-06-10 at 9.40.20 AM.png
doranchak
 
Posts: 2360
Joined: Thu Mar 28, 2013 5:26 am

Re: Z340 Kasiski Examination

Postby doranchak » Fri Jun 10, 2016 10:43 am

While I'm at it, here's what the repeating bigrams look like at period 19:
Screen Shot 2016-06-10 at 11.42.04 AM.png


Writing the cipher to 19 columns makes them a little easier to spot:
Screen Shot 2016-06-10 at 11.39.11 AM.png


The "voids" (regions lacking repeating bigrams) seem interesting to me. The seemingly diagonal void in the first image can be seen as a 2 column void in the second image.
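
For reference, one way to count repeating bigrams at a given period is sketched below in Python. The definition is an assumption on my part: a period-n bigram is taken to be the pair (symbol at i, symbol at i + n), and each extra occurrence of the same pair counts as one repeat. The name `periodic_bigram_repeats` is hypothetical, and the toy string is not the 340.

```python
from collections import Counter


def periodic_bigram_repeats(symbols, period):
    """Count repeated bigrams formed from symbols `period` positions apart."""
    bigrams = Counter(
        (symbols[i], symbols[i + period])
        for i in range(len(symbols) - period)
    )
    # Each occurrence beyond the first counts as one repeat.
    return sum(count - 1 for count in bigrams.values() if count > 1)


toy = "ABABAB"  # hypothetical toy message
print(periodic_bigram_repeats(toy, 1))  # ordinary adjacent bigrams
print(periodic_bigram_repeats(toy, 2))  # bigrams at period 2
```

Writing the cipher to a width equal to the period puts the two halves of each such bigram directly above one another, which is why the 19-column layout makes them easier to spot.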

And here's how the pivots appear when the cipher is written to 19 columns:
z340-period19-bigram-repeats-width19-with-pivots.png
doranchak
 
Posts: 2360
Joined: Thu Mar 28, 2013 5:26 am
