Z340 Kasiski Examination

Re: Z340 Kasiski Examination

Postby BartW » Wed Jun 01, 2016 3:24 am

Smokie, looks like you've been up to some good work.

smokie treats wrote:I made 100 messages randomly selected from the plaintext library found here:
http://zodiackillersite.com/viewtopic.php?f=81&t=2435

They are not Vigenere messages. Just homophonic substitution. I made the keys so that they would diffuse the plaintext inefficiently, so as to increase coincidence counting values. In other words, I made keys that look like this with a slightly higher count of ciphertext mapping to low frequency plaintext, and slightly lower count of ciphertext mapping to high frequency plaintext.


OK, walk me through this...
You generated 100 random messages.
You keyed them homophonically with a 63-symbol space.
But can you please define "diffuse the plaintext inefficiently"? I don't understand this.

smokie treats wrote:Then I randomized my homophonic symbol selection at 25% as with the message above to roughly approximate the 340 cycles.
I saved all of the results for x=1 to x=170 for the 100 messages, and tallied them.
Below is a column chart of the tallies, with y values written in where they did not show on the chart.
There were four messages with y values of 17 or above, and two with y values over 19 ( my y value for x=78 for the 340 is 19 ).
concidence.counting.3.png



OK, so just check me here.
You tallied up the shift-78 counts across all the messages.
The expected average count would be 340/63 ~ 5.4, which appears to be what you have, with a vaguely bell-shaped curve.
This would mean that the peak of the coincidence counting has moved position between messages.
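
As a rough check on the 340/63 ~ 5.4 figure, here is a minimal sketch of the expected coincidence count at a single shift. It assumes every position is an independent draw from the symbol frequencies, which is a simplification and not a property of any real cipher. The function name expected_coincidences and the counts[] layout are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Expected number of coincidences at one shift, ASSUMING each position is an
 * independent draw from the symbol frequencies in counts[].
 * This is length * sum(p_s^2), which simplifies to sum(c_s^2) / length. */
double expected_coincidences(const unsigned int *counts, size_t nsymbols)
{
    unsigned long length = 0;
    double sum_sq = 0.0;
    size_t s;

    assert(counts != NULL && nsymbols > 0);
    for (s = 0; s < nsymbols; s++) {
        length += counts[s];
        sum_sq += (double)counts[s] * (double)counts[s];
    }
    assert(length > 0);
    return sum_sq / (double)length;
}
```

With 63 symbols used almost equally often, the result sits close to 340/63. The more uneven the symbol counts, the larger it gets, which fits the idea that a less even key raises coincidence counts.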

Hmmm... I wonder if we have frequency mixing going on...
If you mix two or more signals F1 and F2 together, then depending on the operation you will get Fout = F1+F2, F1*F2 and every mix and combo imaginable.
I wonder if we are seeing Fout, with the letter frequency and symbol frequency beating together.
This is complicated further by the fact that neither is linear.

smokie treats wrote:So maybe coincidence counting detects something else besides just the period for a Vigenere cipher? Maybe there is roughly less than 5% chance of making a homophonic substitution message with a spike that you detected, and that is what we are looking at?


Yes, I think my previous comment about the mixing is likely.
I guess in this case we just keep the coincidence count at shift 78 as a known quirk which may come in handy in the future.

smokie treats wrote:Please stay on the message board.
We don't have enough programmer cryptanalysts working on the 340.
I am not one. I am just a person with a small laptop, Excel and a hobby.


You guys are doing pretty well compared to some groups, and you are pretty wicked with Excel then :)
I intend to stick around but, like most, have limited availability due to work and family life.
In my job I am a hardware engineer but do some programming for hardware bring-up, and I have only been doing crypto for a short while, so I still have plenty of learning to go.

smokie treats wrote:I will use my coincidence counting spreadsheet to look at transcription from all four corners and two directions from each corner. Then I will show the unigram repeats at period 78 compared to other similar period stats.

Are you aware of the "prime phobia" phenomenon? You would probably be interested in this thread:
http://zodiackillersite.com/viewtopic.php?f=81&t=2841


Yes, I remember seeing it in Doranchak's presentation. I am not sure what to make of it or the other quirks he discusses.
Regards
Bart
BartW
 
Posts: 54
Joined: Thu May 12, 2016 7:59 pm

Re: Z340 Kasiski Examination

Postby BartW » Wed Jun 01, 2016 4:27 am

Today at lunchtime, after looking at your graph, Smokie, I thought I would have a go at a quick and nasty frequency analysis of the coincidence counts.
It shows the main frequency spikes at 12, 24, etc., with a massive spike at 108 (9 x 12).
An interesting exercise that may be of use in future.

Codepad with output: http://codepad.org/LYLgscUL

Code:
#include <stdio.h>
#include <string.h>
unsigned int input[]={20,45,50,34,10,24,26,11,8,46,18,43,31,32,9,45,25,13,14,51,42,29,30,19,12,38,21,31,36,39,36,15,27,33,6,42,42,55,50,1,20,40,33,19,22,37,57,22,28,10,48,19,28,24,27,11,8,47,18,44,45,40,2,23,16,6,33,53,3,5,28,47,12,33,9,56,34,13,48,27,41,35,48,32,42,9,39,40,4,29,26,25,24,44,55,6,30,8,21,23,1,29,23,54,22,7,30,19,2,56,50,41,2,27,36,55,37,57,14,20,29,44,50,35,25,39,36,15,49,18,6,52,28,34,44,31,43,13,45,13,46,19,6,39,26,57,18,2,40,21,43,5,8,57,7,7,17,32,27,47,10,37,20,16,42,36,7,20,21,9,39,7,12,30,15,54,24,19,56,16,38,36,31,42,16,13,54,26,3,6,3,28,29,14,16,55,29,8,29,25,16,39,57,24,21,11,48,19,17,22,44,10,25,43,56,33,11,9,20,49,3,35,50,19,52,48,28,50,35,12,47,18,40,53,24,54,15,47,50,51,10,38,25,32,24,27,14,51,22,37,44,8,34,55,54,42,31,5,44,38,44,48,49,38,4,56,37,18,25,26,54,1,46,49,6,32,43,31,57,32,24,27,20,47,30,48,15,2,5,31,34,44,26,15,33,34,23,57,42,23,11,9,50,9,20,27,32,16,17,18,3,51,44,31,28,44,42,28,7,16,35,4,55,33,12,28,32,6,36,13,7,19,26,23,7,1,25,12,17,14};
//************************************************************************************************************************
int main (void)
{
   unsigned int KEoutput[340];   // KEoutput[k] = coincidences at cyclic shift k; only 0..length/2 are filled
   unsigned int index,count,length,offset,segments;
   double mean;

   // strlen() does not apply here: input[] holds unsigned ints, not a NUL-terminated string.
   length = 340;

   // Pass 1: slide the message against itself (wrapping around) and count matching symbols.
   for (offset = 0 ; offset <= (length/2) ; offset++)
   {
      count = 0;
      for (index = 0 ; index < length ; index++)
      {
         if (input[index] == input[(index+offset)%length])
            count++;
      }
      KEoutput[offset] = count;
   }

   printf("INDX,KE,Count,segments,count/Segments,Variance2\n");

   // Pass 2: for each offset, sum the counts at offset, 2*offset, 3*offset, ... up to length/2.
   for (offset = 1 ; offset <= (length/2) ; offset++)
   {
      count = 0;
      segments = 0;
      for (index = offset ; index <= (length/2) ; index = index + offset)
      {
         segments++;      // should equal (length/2)/offset, but count them just to be sure
         count = count + KEoutput[index];
      }
      mean = (double)count / (double)segments;
      // Last column: squared deviation of the mean from the expected 340/63 ~ 5.4
      printf("%04u,%04u,%04u,%04u,%.2f,%.2f\n",offset,KEoutput[offset],count,segments,mean,(5.4-mean)*(5.4-mean));
   }
   return 0;
}
//************************************************************************************************************************

Re: Z340 Kasiski Examination

Postby smokie treats » Wed Jun 01, 2016 5:06 am

BartW wrote:OK, walk me through this...
You generated 100 random messages.
You keyed them homophonically with a 63-symbol space.
But can you please define "diffuse the plaintext inefficiently"? I don't understand this.


If I wanted to make a homophonic key that diffused the plaintext efficiently, then I would map more ciphertext to plaintext like E, and fewer ciphertext to plaintext like B. I have to put the +/- 63 symbols somewhere. But I made keys that had a slightly more uniform allocation of ciphertext. Not quite as many ciphertext for E, and one or two more for B. See the post above, and you can tell the difference. I figured more efficient diffusion would defeat coincidence counting, so I diffused less efficiently to increase whatever spikes I might get.

BartW wrote:OK, so just check me here.
You tallied up the shift-78 counts across all the messages.
The expected average count would be 340/63 ~ 5.4, which appears to be what you have, with a vaguely bell-shaped curve.
This would mean that the peak of the coincidence counting has moved position between messages.


I tallied 100 values for each of x=1 to x=170, which gave me 17,000 values. Since I made random messages, spikes could occur anywhere on a coincidence counting chart. So I kept all of the 170 x values for each of 100 charts. There were only a small handful of values that were in the same ballpark as the x=78 value of 19 for the 340.

BartW wrote:Hmmm... I wonder if we have frequency mixing going on...
If you mix two or more signals F1 and F2 together, then depending on the operation you will get Fout = F1+F2, F1*F2 and every mix and combo imaginable.
I wonder if we are seeing Fout, with the letter frequency and symbol frequency beating together.
This is complicated further by the fact that neither is linear.


No idea what you are talking about, but that is o.k. for now.

BartW wrote:You guys are doing pretty well compared to some groups, and you are pretty wicked with Excel then :)
I intend to stick around but, like most, have limited availability due to work and family life.
In my job I am a hardware engineer but do some programming for hardware bring-up, and I have only been doing crypto for a short while, so I still have plenty of learning to go.


There are other groups? No pressure. I have a commute, a full time job, and other commitments. I work on it when I have both the willingness and ability.
smokie treats
 
Posts: 1620
Joined: Thu Feb 19, 2015 1:34 pm
Location: Lawrence, Kansas

Re: Z340 Kasiski Examination

Postby smokie treats » Wed Jun 01, 2016 6:58 am

Alright, on the left are the ciphertext involved with your x=78 spike, and on the right I decided to show the ciphertext that are members of period 39 bigram repeats. You can see the pivots there too, because they are period 39 bigram repeat ciphertext.

coincidence.counting.4.png

There are 38 ciphertext highlighted on the left, and (EDIT) 89 highlighted on the right. Counting by eye, I find that they have 20 in common. I have to get going, but I wonder: if you generated a set of 38 random numbers between 1 and 340, and another set of 98 random numbers between 1 and 340, how difficult would it be to find that the two sets share at least 20 of the numbers in common?

EDIT: I just did what I described in the paragraph above, except that my random number generation did not produce any repeated numbers within a set. I tried 100 shuffles, and there was one shuffle where 18 of the numbers in the set of 38 were also in the set of 98.
Last edited by smokie treats on Wed Jun 01, 2016 12:53 pm, edited 1 time in total.

Re: Z340 Kasiski Examination

Postby smokie treats » Wed Jun 01, 2016 12:52 pm

Hey doranchak, are you out there?

The following discussion is for the Z340.

In the post above: On the left, there are 38 highlighted cells which are symbols involved in Bart's slide analysis where x=78. On the right, there are 89 highlighted cells which are symbols involved in the period 39 bigram repeats, including the pivot symbols. The two sets of symbols share 20 positions in common.

I randomly selected 38 positions from 340 to create experiment set 1. Then I randomly selected 89 positions from 340 to create experiment set 2. Out of 10,000 trials, I only got two where there were as many as 20 positions shared by both sets.
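
The same question can also be answered without simulation, since the overlap of two independent random position sets follows a hypergeometric distribution. The sketch below is one way to compute the chance of sharing at least a given number of positions. The function name overlap_tail and its parameter names are made up here, and it assumes both sets are drawn uniformly and without repeats, as in the experiment described above.

```c
#include <assert.h>

/* P(overlap >= at_least) when `set1` positions and `set2` positions are each
 * drawn without repeats, independently, from `total` positions.
 * Hypergeometric distribution, built up term by term (no math library needed). */
double overlap_tail(unsigned int total, unsigned int set2,
                    unsigned int set1, unsigned int at_least)
{
    double p = 1.0, below = 0.0;
    unsigned int i, k;

    assert(set1 + set2 <= total);                  /* keeps every denominator positive */
    assert(at_least <= set1 && at_least <= set2);  /* overlap cannot exceed either set */

    /* P(overlap = 0): every one of the set1 draws misses set 2. */
    for (i = 0; i < set1; i++)
        p *= (double)(total - set2 - i) / (double)(total - i);

    /* Step from P(k) to P(k+1) using the ratio of successive terms. */
    for (k = 0; k < at_least; k++) {
        below += p;
        p *= (double)(set2 - k) * (double)(set1 - k)
           / ((double)(k + 1) * (double)(total - set2 - set1 + k + 1));
    }
    return 1.0 - below;
}
```

For the numbers in this post the call would be overlap_tail(340, 89, 38, 20). The average overlap under this model is 38 * 89 / 340, a little under 10, so 20 shared positions is about double what chance alone would give. The returned probability can be compared directly with the 10,000-trial result.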

I did this because I found it interesting that the spike in Bart's slide analysis is at x=78, while the bigram repeat "spike" and the pivots are at period 39, and 39 * 2 = 78.

What I didn't do, but just thought of, was to select only 19 random positions for set 1 and then add 78 to their positions. And select 45 random positions for set 2 and then add 39 to their positions. I wonder if that will make a difference.
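
A minimal sketch of that paired experiment, under stated assumptions: start positions are chosen so that the partner position stays inside the 340-symbol message, a start may coincide with another start's partner (so a set can end up slightly smaller than 38 or 90 positions), and a small repeatable generator stands in for a proper random number source. All names here (lcg_next, mark_pairs, paired_overlap_rate) are made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define LEN 340

/* Small linear congruential generator so runs are repeatable.
 * Hypothetical helper; any decent RNG would do. */
static unsigned long lcg_state = 12345UL;
static unsigned int lcg_next(unsigned int bound)
{
    lcg_state = (lcg_state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    return (unsigned int)((lcg_state >> 8) % bound);
}

/* Mark `pairs` distinct random start positions p and their partners p + shift. */
static void mark_pairs(unsigned char *set, unsigned int pairs, unsigned int shift)
{
    unsigned char used[LEN];
    unsigned int placed = 0, p;

    assert(shift < LEN && pairs <= LEN - shift);
    memset(set, 0, LEN);
    memset(used, 0, LEN);
    while (placed < pairs) {
        p = lcg_next(LEN - shift);      /* keeps p + shift inside the message */
        if (used[p])
            continue;                   /* start already taken, draw again */
        used[p] = 1;
        set[p] = 1;
        set[p + shift] = 1;
        placed++;
    }
}

/* Fraction of trials in which the two sets share at least `at_least` positions. */
double paired_overlap_rate(unsigned int trials, unsigned int at_least)
{
    unsigned char a[LEN], b[LEN];
    unsigned int t, i, shared, hits = 0;

    assert(trials > 0);
    for (t = 0; t < trials; t++) {
        mark_pairs(a, 19, 78);   /* set 1: 19 starts plus their +78 partners */
        mark_pairs(b, 45, 39);   /* set 2: 45 starts plus their +39 partners */
        shared = 0;
        for (i = 0; i < LEN; i++)
            shared += (unsigned int)(a[i] && b[i]);
        if (shared >= at_least)
            hits++;
    }
    return (double)hits / (double)trials;
}
```

Calling paired_overlap_rate(10000, 20) would give a rate to set beside the 2-in-10,000 figure from the unpaired experiment. Because 78 is twice 39, the two sets are tied together by their construction, so the result need not match the unpaired one.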

Re: Z340 Kasiski Examination

Postby Mr lowe » Wed Jun 01, 2016 3:39 pm

That's interesting, smokie. And BartW, keep at it.
Mr lowe
 
Posts: 1156
Joined: Fri Aug 15, 2014 4:07 am

Re: Z340 Kasiski Examination

Postby smokie treats » Wed Jun 01, 2016 8:02 pm

Here, I re-drafted the spreadsheets into 39 columns so that the period 78 unigram repeats and period 39 bigram repeats would line up into columns and make it easy to see. The pivots don't look like pivots anymore.

My spreadsheet had some issues and I fixed them, but from what I can find at this point, there are 35 positions involved with the period 78 unigram repeats, and 89 positions involved with the period 39 bigram repeats.

Of the 35 positions involved with the period 78 unigram repeats, 19 are also involved with the period 39 bigram repeats. All of the above assumes, of course, that he transcribed the message from left to right, top to bottom. The bigram repeats cover 89 / 340 = 26% of the message positions, so you would think that roughly 26% of the period 78 unigram repeat positions would fall on period 39 bigram repeat positions. But that is not the case: 19 / 35 = 54% of them do.

Why? Does it mean something, or does it mean nothing?

coincidence.counting.5.png

Re: Z340 Kasiski Examination

Postby BartW » Thu Jun 02, 2016 4:45 am

Smokie.
On the left side of "coincidence.counting.4.png" it is interesting to see the diagonal-ish banding of the values.
To me this seems rather non-random, or is that just me? Also, look at the hit values:
http://codepad.org/lEOlfjGU#output
Code:
dG#+R+pD2c+F4+5|Tt+

There are 5 '+' symbols here. The Z340 contains 24 '+' symbols, so there is a 24/340 ~ 7.06% chance of any one character being a '+'.
So in 19 characters we would expect only about 1.34 of them by chance. My statistics fall apart about here, on the conditional probability and combinations
needed to work out the chances of getting 5/19. GRRRR!!!!
Anyway, to me at the moment it seems to be an anomaly; however, there is still a chance that this could be just pure luck.
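
For the 5/19 question, a first rough estimate is a binomial tail. The sketch below assumes each of the 19 hit symbols is an independent draw that shows a '+' with probability 24/340. That independence is an assumption made only to get a number, and the function name binomial_tail is made up. A symbol that is common in the message is also more likely to coincide with itself, so this simple model probably understates how often '+' should turn up among the hits.

```c
#include <assert.h>

/* P(X >= at_least) for X ~ Binomial(n, p): the chance of seeing at least
 * `at_least` copies of a symbol in n draws, ASSUMING each draw independently
 * shows that symbol with probability p. */
double binomial_tail(unsigned int n, double p, unsigned int at_least)
{
    double q = 1.0 - p, term = 1.0, below = 0.0;
    unsigned int i, k;

    assert(p > 0.0 && p < 1.0 && at_least <= n);

    for (i = 0; i < n; i++)      /* term = P(X = 0) = q^n */
        term *= q;

    for (k = 0; k < at_least; k++) {
        below += term;
        /* P(X = k+1) = P(X = k) * (n - k) / (k + 1) * p / q */
        term *= (double)(n - k) / (double)(k + 1) * p / q;
    }
    return 1.0 - below;
}
```

The call for this case would be binomial_tail(19, 24.0 / 340.0, 5). The expected number of '+' symbols under the same model is 19 * 24 / 340, about 1.34.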
Regards
Bart

Re: Z340 Kasiski Examination

Postby smokie treats » Thu Jun 02, 2016 5:37 am

BartW wrote:Smokie.
On the left side of "coincidence.counting.4.png" it is interesting to see the diagonal-ish banding of the values.
To me this seems rather non-random, or is that just me?


I highlighted the hit positions and the hit positions + 78. In a 17 x 20 grid, they look sort of like they are on diagonal lines. But the diagonals are offset by a couple of spaces, and that's just the pattern of x and x+78. See the post above, where I re-drafted the message into 39 columns so that the x and x+78 positions line up vertically and are easier to see.

BartW wrote:There are 5 '+' symbols here. The Z340 contains 24 '+' symbols, so there is a 24/340 ~ 7.06% chance of any one character being a '+'.
So in 19 characters we would expect only about 1.34 of them by chance. My statistics fall apart about here, on the conditional probability and combinations
needed to work out the chances of getting 5/19. GRRRR!!!! Anyway, to me at the moment it seems to be an anomaly; however, there is still a chance that this could be just pure luck.


Yes, my symbol 19 is the +, and there does seem to be a disproportionate count of them in your period 78 unigram repeat / x=78 slide spike. So is it an anomaly? What is the probability of that happening? How many times would you have to randomly shuffle the message to get a spike like the one that you found, and then discover that 5 of the 24 + symbols land on the spike's x or x+offset positions? How many semi-cyclic homophonic substitution messages would you have to make, with plaintext randomly taken from some large corpus, and with 63 symbol keys of varying diffusion efficiency, to get a single slide analysis spike like the one that you found, and then discover that 5 of the 24 + symbols land on the spike's x or x+offset positions? Oh yeah, and then also discover that only one of the 24 + symbols lands on a prime numbered position.
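
The first of those questions, the shuffle test, could be sketched roughly as follows. This is only an illustration under assumptions: coincidences are counted with wrap-around, the same way as in Bart's program, the shuffle is a plain Fisher-Yates, and the small repeatable generator and the names next_below, coincidences and shuffle_rate are all made up. It looks at a single shift only and does not check where the + symbols land.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical repeatable RNG for the sketch (not cryptographic). */
static unsigned long shuffle_seed = 2016UL;
static unsigned int next_below(unsigned int bound)
{
    shuffle_seed = (shuffle_seed * 1103515245UL + 12345UL) & 0x7fffffffUL;
    return (unsigned int)((shuffle_seed >> 8) % bound);
}

/* Coincidences at one cyclic shift (positions wrap around the end). */
unsigned int coincidences(const unsigned int *msg, size_t len, size_t shift)
{
    unsigned int count = 0;
    size_t i;

    for (i = 0; i < len; i++)
        if (msg[i] == msg[(i + shift) % len])
            count++;
    return count;
}

/* Shuffle msg in place `trials` times (Fisher-Yates) and return the fraction
 * of shuffles whose count at `shift` reaches `threshold`. */
double shuffle_rate(unsigned int *msg, size_t len, size_t shift,
                    unsigned int threshold, unsigned int trials)
{
    unsigned int t, tmp, reached = 0;
    size_t i, j;

    assert(msg != NULL && len > 1 && trials > 0);
    for (t = 0; t < trials; t++) {
        for (i = len - 1; i > 0; i--) {
            j = next_below((unsigned int)(i + 1));   /* j in 0..i */
            tmp = msg[i]; msg[i] = msg[j]; msg[j] = tmp;
        }
        if (coincidences(msg, len, shift) >= threshold)
            reached++;
    }
    return (double)reached / (double)trials;
}
```

For the 340, the call would be along the lines of shuffle_rate(copy_of_input, 340, 78, 19, 10000) on a copy of the symbol array. To ask about a spike at any shift instead of at 78 specifically, the inner test would take the largest count over all shifts from 1 to 170 within each shuffle, which is a stricter question.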

My mind has been racing all night to explain the observations made in the last couple of days. I have to do some quick research, then I will be back with a very general interpretation that can be refined later if necessary.

EDIT: Are you familiar with the period 19 bigram repeat statistics?

Re: Z340 Kasiski Examination

Postby BartW » Thu Jun 02, 2016 3:18 pm

Hi Smokie
I am just under the hammer at work for a while, but as soon as I can I want to do an IoC column search and see if any anomalies exist at 78 or factors thereof. I think this may help clarify our questions.
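
One possible shape for such a column search is sketched below. It writes the message row by row into a given number of columns, computes the index of coincidence of each column, and averages them. The name column_ioc and the MAX_SYMBOL limit are assumptions made for this sketch, with symbols taken to be small integers as in the input[] array posted earlier.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYMBOL 64   /* assumption: symbol numbers stay below 64, as in input[] */

/* Average index of coincidence over the `period` columns obtained by writing
 * the message row by row into `period` columns.
 * Column IoC = sum over symbols of c*(c-1), divided by n*(n-1).
 * Columns with fewer than two symbols are skipped. */
double column_ioc(const unsigned int *msg, size_t len, size_t period)
{
    double total = 0.0;
    size_t col, i, s, used = 0;

    assert(msg != NULL && period > 0 && period <= len);
    for (col = 0; col < period; col++) {
        unsigned int counts[MAX_SYMBOL] = {0};
        unsigned long n = 0, pairs = 0;

        for (i = col; i < len; i += period) {
            assert(msg[i] < MAX_SYMBOL);
            counts[msg[i]]++;
            n++;
        }
        if (n < 2)
            continue;
        for (s = 0; s < MAX_SYMBOL; s++)
            if (counts[s] > 1)
                pairs += (unsigned long)counts[s] * (counts[s] - 1);
        total += (double)pairs / (double)(n * (n - 1));
        used++;
    }
    return used ? total / (double)used : 0.0;
}
```

Running it for every period from 1 to 170 and looking for values that stand out at 78, 39, 26, 13 and so on would be the search. Note that at period 78 each column holds only 4 or 5 symbols (340 = 78 * 4 + 28), so the per-column values are very noisy and a shuffled baseline would be needed for comparison.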
Regards
Bart
