Re: A diagonal shift?

Posted: Thu Mar 12, 2015 2:23 pm
by AK Wilks
AK Wilks wrote: Questions:

1. You mentioned the analysis by FBI CRR (Code) Unit Chief Olson. He thought the 340 might be divided into two parts with the second part starting IIRC at line 11. You said in your study IIRC lines 1-3 & 11-13 stood out as different. Could that be consistent with those lines being the start of a new message and a new coding scheme?

2. You mentioned that lines 6-15 show the least directional bias. Would you think that a word-search-type message, with possible diagonal and/or vertical words, would be more likely to occur in lines 6-15?

Or roughly:

1-10 Mostly or all horizontal

11-16 Possibly some or mostly diagonal

14-20 Possibly some or mostly vertical?


Jarlve wrote:AK Wilks, thank you for your interest.

It is very hard to answer such questions but I will try to do so and explain my reasoning behind it.

1) No. In the 340 you have 9 rows (1, 2, 3, 7, 11, 12, 13, 15, 20) that have no repeats. Rows 1, 2, 3, 11, 12 and 13 seem consistent with what is expected for rows that have no repeats. In a 63-symbol cipher it is quite normal to have a large number of rows without repeats, and these rows could appear anywhere. The rather small observed difference, I think, came simply from comparing rows with no repeats to rows with repeats. A straightforward two-part message, 170 characters each, with for instance 2 different encodings is out of the question here. I have done testing, and such schemes upset the positional and bigram data of the cipher to a very large degree. There is, however, a "significant enough" difference between the positional data of the 340 and other CHS ciphers, including the 408, which indicates that something else is going on: likely information travelling up and down somehow when it should have travelled horizontally, and/or symbols not being part of the encoding.

When symbols are not part of the encoding, a few things happen. They disturb the homophonic cycles and cause a shift of information against what is normally expected. With a fixed column count of 17 they also cause a vertical stretch of information, which can explain the difference in the positional data through inflation of irregularities.

It is possible that a 2-part message/encoding was "masked" by mixing it more evenly between the rows, for instance message 1 in rows 1 to 5 and 11 to 15 and message 2 in rows 6 to 10 and 16 to 20. But that would still have resulted in a very significant reduction in bigram counts, which is not present in the 340. That is, in general, another very strong objection to a 2-part message.
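The row check mentioned under 1) can be sketched in a few lines of Python. This is only an illustration, not Jarlve's own program: the function name `rows_without_repeats` is hypothetical, and it assumes the cipher is supplied as a list of 17-symbol strings, one per row.

```python
def rows_without_repeats(rows):
    """Return the 1-based numbers of the rows in which every symbol is distinct."""
    # A row has no repeats when its set of symbols is as large as the row itself.
    return [number for number, row in enumerate(rows, start=1)
            if len(set(row)) == len(row)]
```

Run on a transcription of the 340 this should list the rows named above, provided the transcription matches the one Jarlve used.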

2) Can't say yet. The next step for me is to make a word search and encode it in a few different ways and see how well the bigram counts correlate with what is actually there. Also the bigram test that I did was not well suited to the typical word search, because I wasn't testing for that specifically. And the relevance of a bigram test on a 170 character string can also be questioned. I need to remain skeptical, but I also need to do something in order to make progress.

At the moment the word search looks like a promising direction for which I will do more work.


Thanks.

Yes, I agree that the lack of directional bias shown in Prof Knight's work and in your work at least opens the reasonable possibility that parts of the message could be in a word search format, with vertical and/or diagonal words and phrases. Worth exploring anyway. And I assume you agree with me, doranchak, glurk and most of the others here that IF there are any vertical or diagonal words, we are looking for mostly correct spelling, read as is and continuous, with absolutely no wide anagrams and preferably no anagrams at all. As I mentioned, parts of the Raw Graysmith as reconstructed by Ed, Claston and others show, in that 6-15 midrange, correctly spelled, non-anagrammed diagonal words like BOMBS and LIST, and towards the bottom, in that 14-20 range, correctly spelled, non-anagrammed vertical words like DUEL, BARS, LEASH, TAKE & LOSE.

Re: A diagonal shift?

Posted: Thu Mar 12, 2015 5:54 pm
by Jarlve
AK Wilks,

I wasn't much aware of Graysmith's solution or the reconstructed one. It is an interesting coincidence with my current work but I'm really not so sure yet. For instance the words you mention are short and could fit anywhere but it may have some merit. I would consider no anagramming at all because that would probably have significantly lowered the bigram count for which I don't see evidence. An alternative to word search is that the cipher somehow is divided in sectors with different directions.

In short, I'm currently not gonna bet my money on it being a word search. And if it is, it will most likely be a very tightly packed one, not just words here and there.

Re: A diagonal shift?

Posted: Thu Mar 12, 2015 6:55 pm
by AK Wilks
Jarlve wrote:AK Wilks,

I wasn't much aware of Graysmith's solution or the reconstructed one. It is an interesting coincidence with my current work but I'm really not so sure yet. For instance the words you mention are short and could fit anywhere but it may have some merit. I would consider no anagramming at all because that would probably have significantly lowered the bigram count for which I don't see evidence. An alternative to word search is that the cipher somehow is divided in sectors with different directions.

In short, I'm currently not gonna bet my money on it being a word search. And if it is, it will most likely be a very tightly packed one, not just words here and there.


I am in agreement with most of what you think there. I agree no anagrams. I tend to agree it would be tightly packed. And I think the Raw Graysmith is pretty tightly packed. See below. Why is it not more tightly packed and more coherent? Good question. I think the Raw Graysmith (RG) is NOT a complete correct solution. I think at best there are parts that seem correct, and those parts fill in other parts: the opening and the 4th line SEE A NAME mostly fill in the 8th line THESE FOOLSHALL SEE. And other parts create the diagonals and the verticals, all unintended by and even unknown to Mr. Graysmith. If we ever correctly solved the 340 completely, then I think we would see even more tightly packed word puzzle diagonal and vertical messages. But is there ultimately a coherent message or is it just a game? I don't know. And I think even if we go further than the RG and correctly complete it, that is just a first-stage solution that in parts still needs some other step. Caesar shifts? Or some way to integrate the horizontal, vertical and diagonal parts into a coherent whole? Or did Z just create a fool's game for the police, teasing us with words and phrases?

Image

As you go about your work, if you use several different word search models, I offer that one above as one model to experiment with - I do not offer it as the complete correct solution. But I do think parts may have merit - that parts may actually be correct.

If you or anyone else wishes to further specifically discuss the pros (many IMO) and cons (also many) of the RG we can do it here: viewtopic.php?f=81&t=260

But you can certainly see why Prof Knight's study and your work, both showing a lack of directional bias in the 340 and at least raising the possibility of diagonal and vertical words, interested me with regard to the RG and the crime- and Zodiac-relevant words that occur there diagonally and vertically.

Re: A diagonal shift?

Posted: Sat Mar 14, 2015 12:16 pm
by Jarlve
Thank you AK Wilks,

Probably tomorrow I will do some work on the following word search, generated with http://puzzlemaker.discoveryeducation.c ... _wordcross, to see how well my bigram analysis actually correlates with what is there.

The word search consists of 46 words totalling 379 letters. So most of the words are quite lengthy. The webtool fills in empty spaces with the "+" symbol.

Plaintext:

Code:
h+dnsemit++f++gts
sernoitanimilenhr
wgaeuirns+fn+eire
aandfotoetlgmccig
mgniqlrabwee+inla
pnndmuegcbsr+ruln
eisieaacriepppoie
dwqatrertefraibne
rou+fcirtodiipegt
ola++tencernreec+
wlrs+velgsrtuosre
noesdgirls+s++l+v
dflacigolohcyspge
imcm++ocsicnarf+r
noitarapesflonely
grnsroirpepermint
tnoireredrumkcish
yirrepoleved+c++i
pnhhtronattention
egccenturiesronpg

CHS:

Code:
0157<?FHM23P41RN=
>@U8ZIO]9JGK`A:0V
cS^BdLW;<2Q73CHXD
_]86P[M\ENaTFefIR
GS9JhbY^ic?@4K:`_
j;75FdATgi=U1Vda8
BL>HC]^eWIDkjkZJE
6ch_OX?YM@QU]Ki9A
V[d2PfLWN\5HIjBRO
Zb^34MC:gDX;YE?e1
c`U<2l@aS=VNd[>WA
7\B<6TJXb=3>41`2l
5Qa_fKRZb[0gm<kSC
LGeF34\f=Hg8]YP1U
9ZIO^V_jD>Q`[:Eam
TW;<X\JYk?j@UGK7M
N8ZLVAWB6XdFneH=0
mIYUCk[bDlE52f34J
j900OV\:]MN?;OKZ7
@RgeA8MdWLB>X[9kS

RHS:

Code:
016;<AFHM43Q14TM<
=EW8\LM^;KFKbE:0V
cT_AdKV7>2Q;3ELWE
^^95PZO\BOaRFegLS
GT;KhaV]ic?B1J9b_
j::6Fd@Rfi>X4Ud`8
?H<LE^_gXHBjkjZH?
5ch_NYBUM?QU^Ki;?
X\d1PfIYO\6LLk?TO
Zb_42NC8gCU;V@Ce2
c`V>2lB`T>WMdZ>UB
7ZA=5TLYa<4>21a4l
6P`]fHS\b\0em>kR@
JGgG21Zf<If;]YP3U
;\LN]V]kC<QbZ:Dam
RV7<W[IWj@jAYFH8O
O8[IUEYB6YdFneJ>0
mKXU?jZ`Al?63g31K
k;00OX[:^MMC;NH[7
@Tee@:NdYKC=X\9kT

MHS: CHS with about 1/4 RHS:

Code:
0159>?FJM23P41RN<
=CU:\JO]7JGHa@80V
cS^AdIW9>2Q93EJXB
_]:6PZM[CNbTFefKR
FS7Jh`Y^icDE1K;a_
j785Fd?Sfi=U4Vdb9
@L>HA]^gWKBkjk[JC
6ch_OWDXMAQY]Ii:E
Y\d1QeJUNZ6LKj@RO
[`^21M?;f@V;WABg3
caX>4lC`T<YNd[<YD
9\E=5RLUb>1=23`1l
5Qa_fHSZb[0em<jS?
HGgF41\f=Ig8]UP1X
7ZJO^V_k@>Pa[8A`m
SW9<X\IYj?kAVGL:M
N;[KUBVC6VdFneL=0
mHWX@j[aDlE62f34K
k700OXZ8_MN?9MI\:
@SgeA;OdYKB>U[7jS

Re: A diagonal shift?

Posted: Sun Mar 15, 2015 10:32 am
by Jarlve
The plaintext actual direction distribution:

Horizontals: 32.1%
Verticals: 32.7%
Diagonals NE-SW: 3.4%
Diagonals NW-SE: 31.6%

Bigram test for the word search ciphers above:

CHS:
Horizontals: 30.2%
Verticals: 22.6%
Diagonals NE-SW: 26.2%
Diagonals NW-SE: 20.7%

RHS:
Horizontals: 25.5%
Verticals: 23.3%
Diagonals NE-SW: 26.1%
Diagonals NW-SE: 24.9%

MHS:
Horizontals: 23.7%
Verticals: 23.8%
Diagonals NE-SW: 28.8%
Diagonals NW-SE: 23.5%

It seems that my bigram analysis cannot be trusted to derive correct direction information. Though since diagonal NE-SW is high in all three, there probably exists a carryover from a specific mix of directions (assuming horizontal and vertical). It may be possible to adjust for this, or to come up with a likely distribution after a study. It is interesting that the bigram distribution of the CHS example matches that of the 340 closely. The bigram counts in my examples are lower than in the 340, but this is because my algorithm uses the flattest distribution of symbols possible in the encoding. I need to come up with a more Zodiac-like encoding.

When comparing bigram counts (of all directions) between word search, random plaintext and horizontal plaintext ciphers it is clear that the random plaintext cipher has a much lower count than the others. Word search is lower than horizontal but not highly so. Judging from this I say that the bigram counts of the 340 are much more consistent with word search than with having a random plaintext.
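For anyone who wants to try a per-direction comparison like this, here is a rough Python sketch. It rests on assumptions that are not stated in the posts: "bigram count" is taken to mean the number of bigram occurrences beyond the first, each row, column or diagonal is read as a separate line without wrapping, and the percentages are each direction's share of the total. The function names are hypothetical, and Jarlve's actual test may well differ.

```python
from collections import Counter

def lines_in_direction(rows, direction):
    """Return the strings read along one direction of a rectangular grid."""
    height, width = len(rows), len(rows[0])
    if direction == "horizontal":
        return list(rows)
    if direction == "vertical":
        return ["".join(rows[r][c] for r in range(height)) for c in range(width)]
    lines = {}
    for r in range(height):
        for c in range(width):
            # Cells on one NW-SE diagonal share c - r; on one NE-SW diagonal, c + r.
            key = c - r if direction == "nw_se" else c + r
            lines.setdefault(key, []).append(rows[r][c])
    return ["".join(cells) for cells in lines.values()]

def repeated_bigrams(rows, direction):
    """Count bigram occurrences beyond the first over all lines of a direction."""
    counts = Counter()
    for line in lines_in_direction(rows, direction):
        counts.update(line[i:i + 2] for i in range(len(line) - 1))
    return sum(n - 1 for n in counts.values() if n > 1)

def direction_shares(rows):
    """Express each direction's repeated-bigram count as a percentage of the total."""
    directions = ("horizontal", "vertical", "ne_sw", "nw_se")
    totals = {d: repeated_bigrams(rows, d) for d in directions}
    grand = sum(totals.values())
    return {d: (100.0 * totals[d] / grand if grand else 0.0) for d in directions}
```

Calling `direction_shares` on the 20 rows of one of the test ciphers gives four percentages in the same layout as the lists above, though the numbers will only match if the counting rules happen to agree.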

Summarizing, the word search exhibits some of the characteristics of the 340 I was looking for: unclear direction with a medium-high bigram count (columnar transposition, for instance, also has unclear direction but typically has low bigram counts), a similar symbol spread, and some of the same strange statistical responses to transposition. It is not perfect but it currently is the best-fitting model I have. If the 340 is a word search, I don't think it will be similar to Graysmith's, but who knows? And as said in other posts, some of the symbols may be filler for empty spaces in the word search. Also note that if one error was made in the encoding, for example leaving out a letter as in "experence", the whole word search will have shifted from that position onward.

Next time I will test a few quadrant ciphers, where each quadrant has another direction.

Re: A diagonal shift?

Posted: Wed Mar 25, 2015 4:55 am
by Jarlve
I may have found something interesting that could overthrow the word search idea: a difference of almost 15% in bigram counts between input and output (doing and undoing) of full grid directional transpositions, most notably vertical and diagonal NE-SW. The thing here is that bigram counts are higher for undoing the transposition. I will refer to the difference as either positive or negative.

The 408 also has a 15% difference but it is positive, which I think is normal because of information carryover to other directions. The ray_n cipher used in this thread exhibits a 7% positive difference and the 408 redone almost 32% positive. Even the word search is positive, but only by a few percent.

It seems to be somewhat consistent for the test ciphers to be positive. I will try to recreate the "effect". The base difference for the plaintext used (340 characters of the 408, found in an earlier post) is 99.4%, which is slightly negative, so to say. After encoding with a new, less perfect algorithm, hopefully closer to Zodiac's encoding, it becomes 111.6%. So again positive.

Let's apply a full grid diagonal NE-SW transposition:

Code:
iikiocifriaeirtil
leleethokgrnelgis
klpbicmndraglgrit
igeeusalomnanhtew
nlsmihifeafitigfp
puottwesdohtenfth
asinghutltscaoswr
snuntasaeonhsetei
ufincommmetkbabdi
eliemioerrcehlaew
lebensheeohtlrhdm
mthaltptrtsiateew
stslextrliwpllmii
suimeeurtinllosgy
oksgboiieiaicetme
oennygfindkevoube
viegaodrnebanoesw
lvnhtioavlllymuul
eitrnbealsleaaolr
tiaeechiyivncyity

The difference for this plaintext is 92%. Aha! That is a negative.
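The transposed grid above can be reproduced by writing the plaintext along the NE-SW diagonals of a 20 x 17 grid, each diagonal filled from its lower-left end to its upper-right end. That reading is inferred from the grid itself (the top-left corner spells "ilikekilling..." when read that way). A minimal Python sketch, with hypothetical function names:

```python
def diagonal_cells(height, width):
    """Yield (row, col) along NE-SW diagonals, each walked from its SW end to its NE end."""
    for d in range(height + width - 1):
        for r in range(min(d, height - 1), -1, -1):
            c = d - r
            if c < width:
                yield r, c

def apply_diagonal(text, height=20, width=17):
    """Write text along the diagonals and return the grid as a list of row strings."""
    grid = [[""] * width for _ in range(height)]
    for ch, (r, c) in zip(text, diagonal_cells(height, width)):
        grid[r][c] = ch
    return ["".join(row) for row in grid]

def undo_diagonal(rows):
    """Read a grid back along the same diagonals, recovering the original text."""
    height, width = len(rows), len(rows[0])
    return "".join(rows[r][c] for r, c in diagonal_cells(height, width))
```

`apply_diagonal` expects exactly height x width characters; shorter input leaves the remaining cells empty. `undo_diagonal` is the "undoing" step whose bigram counts are compared against the "doing" step in the post.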

Encoding CHS:

Code:
g(ihIOUj?Z';Q1m"A
nLf=;9&NiJ*cL)Y>p
iA6<g-VE@/b5nJ?(a
hY=;C%.fIW7'TGRL2
c)D+UeZ4=bjQm"546
6_N9aK;:@IBRLEjm&
.X>7JGC9AapO'N%21
DT_cRb:.=IEeX;mLg
C4(7-NVW+=9i\'<@h
;nULVZI=*/O;BfbLK
)=\;Tp&L=NGaA?e@W
+RB.nm691a%Q'R;L2
Dm:f=39*)"K6AnV>g
X_(W;LC/ahcf)IpY0
Ni%5<IUZ=Qb"-;R+L
N=E70Jj>T@i;SI_\L
Sg=Y.N@?c;<'EILD2
AS7&m(NbSnf)0VC_A
=h91T\;.n:fL'bI)*
aU.=;OGZ0QSc-0"R0

81%. So the encoding actually seems to accentuate the bigram difference.

What does it mean? What could it mean? I'm not sure. Maybe it is a fluke, maybe an indication of a vertical/diagonal transposition scheme, or a word search with the majority of the words in these directions. Furthermore, I want to add that I'm no longer so sure of my previous assumption that the transposition was done after encoding, because a full grid plaintext transposition plus poor encoding can have a significant impact on the non-repeat numbers as well.

Re: A diagonal shift?

Posted: Thu Mar 26, 2015 2:33 pm
by doranchak
Jarlve, this is fascinating work and I'm eager to hear more. I'm trying to catch up on what you've done so far.

Can you give more details about how you do the calculation of non-repeats? How are you generating the substrings to score from the cipher text? I'm not following how that works.

Thanks!

Re: A diagonal shift?

Posted: Thu Mar 26, 2015 6:06 pm
by Jarlve
Hey doranchak, thanks for your interest.

About the non-repeats,

Consider every symbol as a starting point, and then count the length of the unique non-repeating string that follows. When this has been done for the entire cipher, multiply the count of each length by the length, to give weight to longer non-repeating strings. For the 340 in the horizontal direction the score you then get is 4462. I also have an alternative "IoC" calculation for this, since frequencies are involved. Graphing these frequencies is interesting.

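Code:
'cipher(1 to 340) is assumed to hold the symbol number at each position
'seen() is an assumed helper array with one element per distinct symbol
for i=1 to 340 'each symbol as a starting point
	erase seen 'forget the symbols of the previous string
	counter=0
	for j=i to 340 'count the length of the unique non-repeating string that follows
		if seen(cipher(j))=1 then exit for 'until repeat is found
		seen(cipher(j))=1
		counter+=1
	next j
	if counter>max_length then max_length=counter
	nr_frequencies(counter)+=1 'how many strings have this length
next i
for i=1 to max_length
	nr_score+=nr_frequencies(i)*i 'weight each length by the length itself
next i
print nr_score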

The 340 peaks at a unique string length of 17 with a count of 26 and then drops rather sharply. I find it strange; the peak is quite high and the curve is not so smooth. At first I thought the "+" symbols were somehow involved, because 340 / 24 is close to 17, but after removing the "+" symbols it still peaks at 17.
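The same measurement, including the frequency table behind the peak, can be sketched in Python. This is an illustrative version under the reading that each non-repeating string includes its starting symbol; the function names are hypothetical and it is not Jarlve's program.

```python
from collections import Counter

def nonrepeat_lengths(symbols):
    """For each start position, the length of the run of distinct symbols that follows."""
    lengths = []
    for i in range(len(symbols)):
        seen = set()
        for s in symbols[i:]:
            if s in seen:
                break  # the run ends at the first repeated symbol
            seen.add(s)
        lengths.append(len(seen))
    return lengths

def nonrepeat_score(symbols):
    """Return (score, histogram): histogram maps run length to how often it occurs."""
    histogram = Counter(nonrepeat_lengths(symbols))
    score = sum(length * count for length, count in histogram.items())
    return score, histogram
```

`histogram.most_common(1)` then gives the peak length and its count, and `nonrepeat_lengths` gives the per-symbol values of the kind shown in the image below.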

What follows is an image that has the length of the unique string that follows for each symbol of the 340.

Image
https://www.dropbox.com/s/gk8bhh3htwy7g ... 2.png?dl=0

Re: A diagonal shift?

Posted: Fri Mar 27, 2015 8:02 am
by doranchak
Oh, ok. So if I understand the measurement correctly, it is a way to test randomness. It is an interesting measurement and seems inexpensive to compute. In the past, I explored some more expensive measurements, such as detecting rare patterns and estimating their probabilities. I was curious if any routes or transpositions of the cipher text produce increased appearances of improbable patterns.

Examples of improbable patterns include: Long sequences of homophone cycles, and large numbers of repeated n-grams and other repeating fragments. The candidate homophone cycle "l*M", for instance, appears like this in the 340: [l*M] [l*M] [l*M] lM [l*M] [l*M] [l*M]. Based on that sequence and the frequency of its constituent symbols it's possible to estimate the probability of it occurring by chance. If the pattern was instead "l+M", the probability would be higher since "+" appears so often.
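One rough way to put a number on such a cycle is a shuffle test, sketched below in Python. It is a simplification rather than the estimate described above: it keeps only the symbols of the candidate cycle, measures the longest run of perfect repeats in the given order, and then asks how often a random reordering of those same symbols does at least as well. The function names, trial count and seed are all hypothetical choices.

```python
import random

def cycle_subsequence(cipher, symbols):
    """Keep only the cipher symbols that belong to the candidate cycle."""
    return "".join(ch for ch in cipher if ch in symbols)

def longest_cycle_run(subseq, cycle):
    """Longest number of back-to-back copies of cycle found anywhere in subseq."""
    best = 0
    for start in range(len(subseq)):
        run, pos = 0, start
        while subseq.startswith(cycle, pos):
            run += 1
            pos += len(cycle)
        best = max(best, run)
    return best

def chance_estimate(subseq, cycle, trials=10000, seed=0):
    """Fraction of random reorderings whose longest run matches or beats the observed one."""
    rng = random.Random(seed)
    observed = longest_cycle_run(subseq, cycle)
    pool = list(subseq)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if longest_cycle_run("".join(pool), cycle) >= observed:
            hits += 1
    return hits / trials
```

For the sequence quoted above, `longest_cycle_run("l*Ml*Ml*MlMl*Ml*Ml*M", "l*M")` is 3. A frequent symbol such as "+" would enter through a longer shuffled pool, which is what makes a pattern like "l+M" easier to hit by chance.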

Other low probability repeated pairs of patterns in the 340 include "J??p7", "5?4?.", and "O?*?C" (where "?" are wildcards). I've been wanting to explore more candidate transpositions/routes that might produce more such patterns, perhaps indicating more structured underlying plaintext (assuming the transformation was performed after applying the symbol substitutions).
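Checking a candidate route for wildcard patterns like these is a short job in Python. The sketch below uses a hypothetical function name, treats "?" as matching any single symbol, and assumes the cipher is held as one character per symbol. It returns every position where the pattern matches, overlaps included.

```python
import re

def wildcard_positions(cipher, pattern, wildcard="?"):
    """Return the start index of every match of pattern in cipher."""
    # Escape literal symbols so that characters such as * or . are not read as regex syntax.
    regex = "".join("." if ch == wildcard else re.escape(ch) for ch in pattern)
    # The lookahead lets overlapping matches be reported as well.
    return [m.start() for m in re.finditer("(?=" + regex + ")", cipher, flags=re.DOTALL)]
```

A pattern counts as repeated when the returned list has two or more entries. If the transcription uses "?" as a real symbol, pass a different `wildcard` character.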

A problem with measurements of randomness is that they don't distinguish between interesting and uninteresting symbol repetitions. Uninteresting repetitions are the kinds that are very easily obtained by chance. This is why I tend to focus on discovery of improbable patterns, because they might suggest underlying message structure.

I'm going to study more of your posts to try to gain some ideas about which transformations might be worth exploring. Thanks for all of your efforts!

Re: A diagonal shift?

Posted: Fri Mar 27, 2015 8:39 am
by morf13
Every time I check in on one of these cipher threads, I leave with my head spinning :lol: Good Luck with your research Guys