Re: Cycle types

PostPosted: Sat Dec 02, 2017 7:52 pm
by doranchak
Jarlve wrote:I wrote a new cycle test this morning: use the cycle ngrams of one cipher to score the cycle ngrams of another cipher. This method has many advantages: it is quick and a lot of information is captured. Furthermore, it eliminates the need to write complicated cycle detection routines, and its use could be more universal (outside of cycle detection).


How does this work? I don't follow how the scores are computed relative to other scores.

Re: Cycle types

PostPosted: Sun Dec 03, 2017 3:26 am
by Jarlve
doranchak wrote:
Jarlve wrote:I wrote a new cycle test this morning: use the cycle ngrams of one cipher to score the cycle ngrams of another cipher. This method has many advantages: it is quick and a lot of information is captured. Furthermore, it eliminates the need to write complicated cycle detection routines, and its use could be more universal (outside of cycle detection).


How does this work? I don't follow how the scores are computed relative to other scores.

Build cycle ngram frequencies from one cipher and score the other cipher with it. I also tried a chi-squared test to measure the difference between the cycle ngram frequencies of both ciphers. My conclusion is that these tests may not work well or may have problems. For now I am going back to my original approach: dedicated cycle detection for each individual cycle type.
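
For readers following along, here is a minimal sketch of what a chi-squared style comparison of two cycle ngram frequency tables could look like. It is not the AZdecrypt routine: the function name `chi_squared_distance`, the symmetric form of the statistic, and the use of relative frequencies are all assumptions made for illustration. The inputs are plain count tables keyed by ngram pattern.

```python
from collections import Counter


def chi_squared_distance(counts_a, counts_b):
    """Symmetric chi-squared distance between two ngram count tables.

    Assumption: counts are turned into relative frequencies first so that
    ciphers of different length can be compared. 0 means identical tables.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both frequency tables must be non-empty")
    distance = 0.0
    for key in set(counts_a) | set(counts_b):
        pa = counts_a.get(key, 0) / total_a
        pb = counts_b.get(key, 0) / total_b
        if pa + pb > 0:  # skip keys stored with a zero count in both tables
            distance += (pa - pb) ** 2 / (pa + pb)
    return distance


# Hypothetical toy tables: one cipher with only ABAB cycles, one without any.
table_a = Counter({"ABAB": 5})
table_b = Counter({"AABB": 3, "ABBA": 2})
```

With this form the distance ranges from 0 (same frequencies) to 2 (no ngram in common), so a lower number means the two ciphers look more alike.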

Re: Cycle types

PostPosted: Sun Dec 03, 2017 5:59 am
by Jarlve
Jarlve wrote:16. The alternating length cycle, which alternates between shorter and longer substitution cycles: 12 - 1 - 12 - 1 - 12 - 1, 123 - 12 - 123 - 12 - 123. (Jarlve)

I wrote a routine that looks for perfect examples of this and compares the result against 1000 randomizations. These numbers are hard to interpret: various cycle types seem to correlate with alternating length cycles, though the shortened cycles ciphers by smokie seem to be a good match. No perfect alternating length cycles occur in the full-length 408.
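
A rough sketch of the "perfect examples versus randomizations" idea, for the 2-symbol case only. This is not the AZdecrypt code. It assumes that a perfect 2-symbol alternating length cycle is a symbol-pair sequence that repeats with period 3 (such as MZMMZMMZMMZ), that the statistic is the number of such symbol pairs, and that a randomization is a shuffle of the cipher. The function names and the `min_len` cutoff are hypothetical.

```python
import random
import statistics
from itertools import combinations


def is_alternating_2symbol(seq, min_len=6):
    """True if seq uses two symbols and repeats with period 3,
    e.g. MZMMZMMZMMZ (12 - 1 - 12 - 1 ...)."""
    if len(seq) < min_len or len(set(seq[:3])) != 2:
        return False
    return all(seq[i] == seq[i - 3] for i in range(3, len(seq)))


def count_perfect(cipher, min_len=6):
    """Count symbol pairs whose interleaved sequence is a perfect example."""
    hits = 0
    for a, b in combinations(sorted(set(cipher)), 2):
        seq = [s for s in cipher if s == a or s == b]
        if is_alternating_2symbol(seq, min_len):
            hits += 1
    return hits


def sigma_vs_randomizations(cipher, trials=1000, seed=0):
    """Standard deviations between the observed count and shuffled ciphers."""
    rng = random.Random(seed)
    observed = count_perfect(cipher)
    scores = []
    for _ in range(trials):
        shuffled = list(cipher)
        rng.shuffle(shuffled)  # keeps symbol counts, destroys the order
        scores.append(count_perfect(shuffled))
    mean = statistics.mean(scores)
    spread = statistics.pstdev(scores)
    return (observed - mean) / spread if spread > 0 else 0.0
```

Pure Python is slow for this on a full 340-character cipher, so a smaller `trials` value is sensible when experimenting.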

Alternating length 2-symbol cycle sigmas:

340: 4.59
408: -0.76
408_1-340: -0.91
408_69-408: 3.33
jarlve_26percentrandomhomophones1: 0.48
jarlve_anticycles1: -1.36
jarlve_palindromiccycles1: 4.01
jarlve_perfectcycles1: 2.97
jarlve_randomshiftcycles1: -0.39
moonrock_regionalcycles1: -0.77
moonrock_regionalcycles2: 0.03
smokie_palindromiccycles1: 12.27
smokie_palindromiccycles2: 6.49
smokie_shortenedcycles1: 3.85
smokie_shortenedcycles2: 4.27

Alternating length 3-symbol cycle sigmas:

340: 5.91
408: -0.05
408_1-340: -0.18
408_69-408: -0.26
jarlve_26percentrandomhomophones: 0.85
jarlve_anticycles1: -0.35
jarlve_palindromiccycles1: 2.34
jarlve_perfectcycles1: 8.47
jarlve_randomshiftcycles1: 0.78
moonrock_regionalcycles1: -0.56
moonrock_regionalcycles2: 0.15
smokie_palindromiccycles1: 2.98
smokie_palindromiccycles2: 13.98
smokie_shortenedcycles1: 5.18
smokie_shortenedcycles2: 6.17

Code: Select all
AZdecrypt cycle types stats for: 340.txt
--------------------------------------------------------

2-symbol cycles:
--------------------------------------------------------
Alternating length cycles: 4.59 sigma
--------------------------------------------------------
MZMMZMMZMMZ: 11
VV;VV;VV;: 9
LL/LL/LL/: 9
LL;LL;LL;: 9
GG_GG_GG_: 9
LL7LL7LL7: 9
GG/GG/GG/: 9
^^/^^/^^/: 9
GG;GG;GG;: 9
VV/VV/VV/: 9

3-symbol cycles:
--------------------------------------------------------
Alternating length cycles: 5.91 sigma
--------------------------------------------------------
^*^*&^*^*&^*^*: 14
LL/;LL/;LL/;: 12
LL7;LL7;LL7;: 12
GG7;GG7;GG7;: 12
GG_;GG_;GG_;: 12
GG/;GG/;GG/;: 12
VV/;VV/;VV/;: 12
GG7&GG7&GG7: 11
LL7&LL7&LL7: 11
LL7qLL7qLL7: 11

Runtime: 28.96

Re: Cycle types

PostPosted: Sun Dec 03, 2017 6:03 am
by smokie treats
Taking things slowly this weekend. I want to upgrade my encoder for different patterns, and I am still thinking of ways to do it.

Here is a chart of the 340, all L=2 isomorphic patterns sorted from left to right. You have to zoom way in and then scroll left and right to see the details. There are interesting little clusters of spikes, and they are interesting because they are spikes of patterns and continuations of patterns.

Brown arrows: Four little spikes AABAA; AABAAB; AABAABAA; AABAABAAB

Orange arrows: Three little spikes ABAAAB; ABAAABA; ABAAABAA

Red arrows: Three spikes, one little and two big ABAAB; ABAABA; ABAABAA

Blue arrows: Two medium and two big spikes ABAB; ABABA; ABABAB; ABABABA

Purple arrow: One big spike ABABAA. It is not a cluster of spikes of continuations of patterns, just one spike.
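
For anyone who wants to reproduce counts like the ones in the chart, here is a minimal sketch of counting L=2 isomorphic patterns. It assumes a pattern is obtained by taking the sequence of two symbols in reading order and relabelling them A and B by first appearance. The names `isomorph` and `pattern_counts` and the window length range are hypothetical, and this is not the tool used for the chart.

```python
from collections import Counter
from itertools import combinations


def isomorph(window):
    """Relabel symbols by first appearance: MZMMZ -> ABAAB."""
    names = {}
    out = []
    for s in window:
        if s not in names:
            names[s] = chr(ord("A") + len(names))
        out.append(names[s])
    return "".join(out)


def pattern_counts(cipher, min_len=4, max_len=9):
    """Count isomorphic patterns over every pair of symbols in the cipher."""
    counts = Counter()
    for pair in combinations(sorted(set(cipher)), 2):
        seq = [s for s in cipher if s in pair]
        for n in range(min_len, max_len + 1):
            for i in range(len(seq) - n + 1):
                counts[isomorph(seq[i:i + n])] += 1
    return counts


# Toy example: the only symbol pair gives the sequence MZMMZMMZ.
toy = pattern_counts("MZMMZMMZ", min_len=5, max_len=5)
```

Sorting the resulting counter by pattern gives the left-to-right ordering, and a pattern together with its continuations (ABAAB, ABAABA, ABAABAA) would then show up as neighbouring entries.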

https://drive.google.com/drive/folders/ ... -KASKWMX9c

Re: Cycle types

PostPosted: Sun Dec 03, 2017 12:31 pm
by Jarlve
doranchak wrote:But I would first recommend reading the paper, because it goes into detail about the speed efficiencies of their algorithm:

http://www.oranchak.com/king-homophonic-ciphers.pdf

Here is the cipher that is contained in the paper:

Code: Select all
1  2  3  4  5  6  7  8  9  10 11 12 5  13 14 15 16 17 18 19
4  5  20 21 22 2  3  8  1  10 23 24 9  4  5  25 8  7  3  21
24 26 1  22 4  27 24 2  28 23 24 29 5  6  7  11 26 30 16 13
28 27 19 11 15 31 16 20 23 29 22 28 6  2  11 6  18 30 14 9
12 29 16 32 30 25 13 8  26 17 10 28 12 12 15 12 29 6  19 27
18 12 23 7  26 24 9  33 4  22 33 2  5  30 27 29 7  11 23 24
20 26 10 32 32 34 13 16 19 8  15 6  4  18 20 27 9  12 13 35
3  10 32 28 22 31 31 15 19 8  18 9  11 21 20 21 13 12 30 16
10 21 4  15 19 23 6  29 14 30 32 31 20 8  26 18 5  27 29 28
21 31 24 9  23 24 13 4  26 24 10 27 11 19 23 30 2  16 7  8

Re: Cycle types

PostPosted: Sun Dec 03, 2017 12:48 pm
by Jarlve
Jarlve wrote:
doranchak wrote:
Jarlve wrote:I wrote a new cycle test this morning: use the cycle ngrams of one cipher to score the cycle ngrams of another cipher. This method has many advantages: it is quick and a lot of information is captured. Furthermore, it eliminates the need to write complicated cycle detection routines, and its use could be more universal (outside of cycle detection).


How does this work? I don't follow how the scores are computed relative to other scores.

Build cycle ngram frequencies from one cipher and score the other cipher with it. I also tried a chi-squared test to measure the difference between the cycle ngram frequencies of both ciphers. My conclusion is that these tests may not work well or may have problems. For now I am going back to my original approach: dedicated cycle detection for each individual cycle type.

I decided not to give up on it yet and have improved the results by capturing more cycle ngram information and using the logarithm of the cycle ngram frequencies. It really seems to be working now, which made my day. The results are still fuzzy but can be worked with.

Here is a run of a uniquely randomized 340 versus the ciphers in the batch file. Note how the other randomized 340 ciphers are at the top (none of these randomized 340 ciphers share the same randomization, by the way):

Code: Select all
340.txt (scored with) 340_randomized2: 437.49
340.txt (scored with) 340_randomized3: 434.90
340.txt (scored with) 340_randomized1: 404.95
340.txt (scored with) moonrock_regionalcycles2: 375.97
340.txt (scored with) moonrock_regionalcycles1: 373.10
340.txt (scored with) jarlve_randomshiftcycles1: 346.92
340.txt (scored with) 340_reversed: 341.79
340.txt (scored with) 340: 325.64
340.txt (scored with) smokie_shortenedcycles1: 303.96
340.txt (scored with) jarlve_26percentrandomhomophones1: 291.81
340.txt (scored with) tonyb1_perfectcycles1: 290.78
340.txt (scored with) 408_69-408: 278.20
340.txt (scored with) smokie_palindromic1: 270.64
340.txt (scored with) 408_1-340: 263.31
340.txt (scored with) jarlve_palindromic1: 261.63
340.txt (scored with) 408: 261.32
340.txt (scored with) jarlve_perfectcycles1: 245.52
340.txt (scored with) smokie_palindromic2: 238.78
340.txt (scored with) smokie_shortenedcycles2: 236.45
340.txt (scored with) jarlve_anticycles1: 213.58
340.txt (scored with) rayn_perfectcycles1: 211.12

And here is the normal 340. The 408 substrings are now nearer to the top:

Code: Select all
340.txt (scored with) 340: 951.98
340.txt (scored with) smokie_shortenedcycles1: 596.95
340.txt (scored with) 408_1-340: 566.98
340.txt (scored with) 408_69-408: 551.54
340.txt (scored with) jarlve_26percentrandomhomophones1: 550.21
340.txt (scored with) rayn_perfectcycles1: 549.15
340.txt (scored with) jarlve_palindromic1: 542.22
340.txt (scored with) jarlve_randomshiftcycles1: 537.68
340.txt (scored with) smokie_shortenedcycles2: 521.50
340.txt (scored with) jarlve_perfectcycles1: 514.75
340.txt (scored with) moonrock_regionalcycles2: 507.62
340.txt (scored with) 408: 507.22
340.txt (scored with) 340_reversed: 490.91
340.txt (scored with) smokie_palindromic2: 489.08
340.txt (scored with) smokie_palindromic1: 473.01
340.txt (scored with) moonrock_regionalcycles1: 459.69
340.txt (scored with) tonyb1_perfectcycles1: 450.24
340.txt (scored with) 340_randomized1: 446.73
340.txt (scored with) 340_randomized3: 395.16
340.txt (scored with) 340_randomized2: 377.12
340.txt (scored with) jarlve_anticycles1: 86.06

And here is the full 408. Notice that the cipher itself is not the number 1 result. This happens sometimes and has to do with the ngram frequencies. Still, all the 408 and perfect cycles ciphers are at the top. It is really working:

Code: Select all
408.txt (scored with) 408_1-340: 903.36
408.txt (scored with) 408: 873.98
408.txt (scored with) 408_69-408: 573.92
408.txt (scored with) rayn_perfectcycles1: 501.29
408.txt (scored with) jarlve_perfectcycles1: 467.32
408.txt (scored with) tonyb1_perfectcycles1: 454.52
408.txt (scored with) smokie_shortenedcycles1: 414.29
408.txt (scored with) smokie_palindromic2: 407.16
408.txt (scored with) smokie_shortenedcycles2: 392.84
408.txt (scored with) jarlve_26percentrandomhomophones1: 384.52
408.txt (scored with) jarlve_randomshiftcycles1: 369.63
408.txt (scored with) jarlve_palindromic1: 361.26
408.txt (scored with) smokie_palindromic1: 333.51
408.txt (scored with) 340: 329.22
408.txt (scored with) 340_reversed: 326.03
408.txt (scored with) moonrock_regionalcycles1: 289.52
408.txt (scored with) moonrock_regionalcycles2: 279.60
408.txt (scored with) 340_randomized1: 273.43
408.txt (scored with) 340_randomized3: 230
408.txt (scored with) 340_randomized2: 214.01
408.txt (scored with) jarlve_anticycles1: 50.16

Re: Cycle types

PostPosted: Mon Dec 04, 2017 9:59 pm
by doranchak
Jarlve wrote:Here is the cipher that is contained in the paper:

Thanks!

Re: Cycle types

PostPosted: Mon Dec 04, 2017 10:00 pm
by doranchak
Jarlve wrote:Build cycle ngram frequencies from one cipher and score the other cipher with it.

How exactly do you score the other cipher relative to the frequencies of the first cipher's cycle ngrams? I'm probably just missing something obvious.

Re: Cycle types

PostPosted: Tue Dec 05, 2017 10:19 am
by Jarlve
doranchak wrote:
Jarlve wrote:Build cycle ngram frequencies from one cipher and score the other cipher with it.

How exactly do you score the other cipher relative to the frequencies of the first cipher's cycle ngrams? I'm probably just missing something obvious.

1. Get the cycle ngram frequencies of cipher A. For that, my routine goes through all 5-symbol cycles and collects 10-gram frequencies, adding 9, 8, 7, 6, 5, 4, 3, 2-gram frequencies at the end of the cycle or when the cycle is very short.

2. Normalization. Divide these frequencies by the total number of ngrams for that ngram length, and divide the shorter ngrams some more, since these will have higher frequency counts and should add to the score, not dominate it. After the division, take the logarithm.

3. Go through the cycles of cipher B and sum all the corresponding ngram logs of cipher A. Multiply the sum by some factor.
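
The three steps above can be sketched roughly as follows. This is a simplified illustration, not the AZdecrypt implementation. Assumptions: symbols are relabelled A, B, C by first appearance so that two ciphers with different alphabets can be compared, `k=2` is used instead of 5-symbol cycles to keep it fast, the extra division for shorter ngrams is `max_n - n + 1`, `log1p` with a `scale` constant keeps the weights positive, and ngrams unseen in cipher A contribute nothing. All names and constants are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def relabel(window):
    """Relabel symbols by first appearance: MZMMZ -> ABAAB."""
    names = {}
    return "".join(names.setdefault(s, chr(ord("A") + len(names))) for s in window)


def cycle_ngrams(cipher, k, max_n):
    """Yield (n, pattern) for each k-symbol cycle: max_n-grams along the
    cycle, then shorter ngrams at the end or when the cycle is short."""
    for combo in combinations(sorted(set(cipher)), k):
        keep = set(combo)
        seq = [s for s in cipher if s in keep]
        for i in range(len(seq) - 1):
            n = min(max_n, len(seq) - i)
            yield n, relabel(seq[i:i + n])


def build_model(cipher_a, k=2, max_n=10, scale=1000.0):
    """Steps 1 and 2: count, normalize per ngram length, take a logarithm."""
    counts = Counter(cycle_ngrams(cipher_a, k, max_n))
    totals = Counter()
    for (n, _pattern), count in counts.items():
        totals[n] += count
    model = {}
    for (n, pattern), count in counts.items():
        freq = count / totals[n]
        freq /= max_n - n + 1  # shorter ngrams are divided some more
        model[(n, pattern)] = math.log1p(scale * freq)
    return model


def score(cipher_b, model, k=2, max_n=10, factor=1.0):
    """Step 3: sum cipher A's log weights over cipher B's cycle ngrams."""
    total = sum(model.get(key, 0.0) for key in cycle_ngrams(cipher_b, k, max_n))
    return factor * total
```

As a usage example, a model built from `"ABABABABABAB"` scores `"XYXYXYXYXYXY"` above zero and `"XXXXXXYYYYYY"` at zero, since the second cipher shares no cycle ngrams with the first.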

Currently this approach does not really normalize the symbol frequencies of the cipher. For instance, smokie_shortenedcycles1, with a raw IoC of 2666, is close to the top for a lot of ciphers. Step 2 is a band-aid fix for this kind of problem, but it is still an issue. Getting the sigma of each ngram would be a much better normalization, but it would take a lot of time.

Here is a new run that considers that each cycle ABCDE could just as well be BCDEA, CDEAB, DEABC and EABCD. I like how the 408 ciphers take second and third place here.

Code: Select all
340.txt (scored with) 340: 316.81
340.txt (scored with) 408_69-408: 279.58
340.txt (scored with) 408_1-340: 279.28
340.txt (scored with) smokie_shortenedcycles1: 269.27
340.txt (scored with) 408: 256.80
340.txt (scored with) rayn_perfectcycles1: 254.92
340.txt (scored with) jarlve_topbottomcycles2: 240.52
340.txt (scored with) jarlve_topbottomcycles1: 239.39
340.txt (scored with) 340_reversed: 235.11
340.txt (scored with) jarlve_26percentrandomhomophones1: 233.47
340.txt (scored with) jarlve_palindromic1: 232.63
340.txt (scored with) jarlve_perfectcycles1: 232.04
340.txt (scored with) smokie_palindromic2: 229.78
340.txt (scored with) tonyb1_perfectcycles1: 229.03
340.txt (scored with) smokie_shortenedcycles2: 228.92
340.txt (scored with) moonrock_regionalcycles2: 228.68
340.txt (scored with) jarlve_randomshiftcycles1: 217.30
340.txt (scored with) 340_randomized1: 212.48
340.txt (scored with) moonrock_regionalcycles1: 212.34
340.txt (scored with) smokie_palindromic1: 209.09
340.txt (scored with) 340_randomized3: 192.02
340.txt (scored with) 340_randomized2: 190.94
340.txt (scored with) largo_oddevencycles1: 179.61
340.txt (scored with) jarlve_anticycles1: 89.21
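
The rotation idea behind the run above (ABCDE could just as well be BCDEA, CDEAB, DEABC or EABCD) can be illustrated with a tiny sketch. How the rotations are actually folded into the score is not described in the post, so picking the smallest rotation as a shared key is only one possible way to treat them as the same cycle. Both function names are hypothetical.

```python
def rotations(cycle):
    """All rotations of a cycle, starting with the cycle itself."""
    return [cycle[i:] + cycle[:i] for i in range(len(cycle))]


def canonical_rotation(cycle):
    """Pick one representative so that rotated cycles share the same key."""
    return min(rotations(cycle))
```

For example, `rotations("ABCDE")` lists the five orderings named above, and `canonical_rotation("CDEAB")` gives `"ABCDE"`.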

Re: Cycle types

PostPosted: Fri Dec 08, 2017 10:18 am
by Jarlve
I completely ditched the idea of scoring one set of cycle ngrams with another.

It now works like this:

Get all cycle ngram frequencies of cipher A and B and calculate their sigmas versus randomizations of the respective cipher. Sum the absolute differences between the cycle ngram frequency sigmas of cipher A and B. This sum is the final number, and a lower sum denotes a better correlation between cipher A and B under this system. I also added the option to sum only the sigma differences that are above a certain value (say 1) to reduce noise.

You can find this functionality in AZdecrypt 1.091 under the file menu as "Batch ciphers (match symbol sequences)". On my old i7 it checks about 5 million symbol sequences (cycles) per second using 6 threads. It goes through all 3-symbol sequences using 6-gram frequencies, and it can take up to a minute per cipher to process.
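
The final summing step can be sketched like this, assuming the per-ngram sigmas of both ciphers have already been computed and stored in dictionaries keyed by ngram pattern. Treating an ngram that is missing from one cipher as sigma 0 is an assumption, and the name `sigma_difference` and the sample values are made up for illustration.

```python
def sigma_difference(sigmas_a, sigmas_b, min_diff=0.0):
    """Sum of absolute sigma differences; lower means a closer match.

    Only differences above min_diff are summed, to reduce noise.
    """
    total = 0.0
    for key in set(sigmas_a) | set(sigmas_b):
        diff = abs(sigmas_a.get(key, 0.0) - sigmas_b.get(key, 0.0))
        if diff > min_diff:
            total += diff
    return total


# Hypothetical sigma tables for two ciphers.
cipher_a = {"ABAB": 4.0, "AABB": -1.0, "ABBA": 0.5}
cipher_b = {"ABAB": 1.5, "AABB": -0.5, "ABBA": 0.5}
```

With these toy tables the plain sum is 3.0, and with `min_diff=1` only the ABAB difference of 2.5 is counted. A cipher compared with itself always gives 0, which matches the first line of each results list.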

AZdecrypt 1.091 executable: https://drive.google.com/open?id=1EtJ_W ... Xoh9XJ8CYI
And the batch file I have been using: https://drive.google.com/open?id=1Hl6yz ... kFsGRhG-Rn

The sigma option can be found under options, solver as "(Batch ciphers) Match symbol sequences, only use sigma over", which should probably read "only sum sigma difference over".

Here are some results:

340 versus batch file:

Code: Select all
1: 340 (versus) 340: 0
2: 340 (versus) 340_reversed: 632.16
3: 340 (versus) jarlve_topbottomcycles1: 742.42
4: 340 (versus) jarlve_topbottomcycles2: 902.65
5: 340 (versus) 408_69-408: 925.43
6: 340 (versus) jarlve_randomshiftcycles1: 963.44
7: 340 (versus) moonrock_regionalcycles2: 1011
8: 340 (versus) moonrock_regionalcycles1: 1176.88
9: 340 (versus) largo_oddevencycles1: 1202.21
10: 340 (versus) 340_randomized1: 1316.82
11: 340 (versus) jarlve_26percentrandomhomophones1: 1327.58
12: 340 (versus) tonyb1_perfectcycles1: 1402.42
13: 340 (versus) smokie_palindromic2: 1419.78
14: 340 (versus) smokie_palindromic1: 1424.45
15: 340 (versus) 340_randomized2: 1497.90
16: 340 (versus) 340_randomized3: 1504.96
17: 340 (versus) smokie_shortenedcycles1: 1591.68
18: 340 (versus) smokie_shortenedcycles2: 1596.30
19: 340 (versus) 408: 1656.97
20: 340 (versus) 408_1-340: 1719.17
21: 340 (versus) jarlve_palindromic1: 1798.17
22: 340 (versus) jarlve_perfectcycles1: 2406.96
23: 340 (versus) rayn_perfectcycles1: 2686
24: 340 (versus) jarlve_anticycles1: 3891.48

Randomized 340 versus batch file:

Code: Select all
1: 340_randomized4 (versus) 340_randomized3: 241.32
2: 340_randomized4 (versus) 340_randomized1: 394.06
3: 340_randomized4 (versus) 340_randomized2: 541.61
4: 340_randomized4 (versus) moonrock_regionalcycles2: 610.62
5: 340_randomized4 (versus) moonrock_regionalcycles1: 802.61
6: 340_randomized4 (versus) 340_reversed: 1565.53
7: 340_randomized4 (versus) largo_oddevencycles1: 1634.59
8: 340_randomized4 (versus) 340: 1700
9: 340_randomized4 (versus) jarlve_topbottomcycles1: 1915.01
10: 340_randomized4 (versus) jarlve_randomshiftcycles1: 2004.45
11: 340_randomized4 (versus) 408_69-408: 2339.06
12: 340_randomized4 (versus) smokie_palindromic1: 2359.25
13: 340_randomized4 (versus) tonyb1_perfectcycles1: 2560.36
14: 340_randomized4 (versus) jarlve_topbottomcycles2: 2590.79
15: 340_randomized4 (versus) jarlve_26percentrandomhomophones1: 2864.42
16: 340_randomized4 (versus) smokie_palindromic2: 2930.01
17: 340_randomized4 (versus) 408_1-340: 3145.80
18: 340_randomized4 (versus) 408: 3158.73
19: 340_randomized4 (versus) smokie_shortenedcycles2: 3181.26
20: 340_randomized4 (versus) smokie_shortenedcycles1: 3257.79
21: 340_randomized4 (versus) jarlve_palindromic1: 3276.70
22: 340_randomized4 (versus) jarlve_perfectcycles1: 3913.67
23: 340_randomized4 (versus) rayn_perfectcycles1: 4112.89
24: 340_randomized4 (versus) jarlve_anticycles1: 5030.45

408 versus batch file:

Code: Select all
1: 408 (versus) 408: 0
2: 408 (versus) 408_1-340: 57.69
3: 408 (versus) 408_69-408: 480.71
4: 408 (versus) tonyb1_perfectcycles1: 767.74
5: 408 (versus) smokie_shortenedcycles2: 1104.35
6: 408 (versus) rayn_perfectcycles1: 1117.10
7: 408 (versus) jarlve_topbottomcycles2: 1147.83
8: 408 (versus) jarlve_26percentrandomhomophones1: 1189.82
9: 408 (versus) jarlve_perfectcycles1: 1203.92
10: 408 (versus) smokie_shortenedcycles1: 1207.90
11: 408 (versus) smokie_palindromic2: 1309.57
12: 408 (versus) jarlve_palindromic1: 1497.30
13: 408 (versus) 340: 1636.33
14: 408 (versus) jarlve_topbottomcycles1: 1662.10
15: 408 (versus) jarlve_randomshiftcycles1: 1755.26
16: 408 (versus) 340_reversed: 1888
17: 408 (versus) largo_oddevencycles1: 1900.24
18: 408 (versus) smokie_palindromic1: 1905.17
19: 408 (versus) moonrock_regionalcycles1: 2593.94
20: 408 (versus) moonrock_regionalcycles2: 2701.20
21: 408 (versus) 340_randomized1: 2777.48
22: 408 (versus) 340_randomized2: 2882.50
23: 408 (versus) 340_randomized3: 2938.02
24: 408 (versus) jarlve_anticycles1: 4091.29