by smokie treats » Sat Jun 13, 2015 7:09 pm
Wildcard n-gram Experiment 2
Same message with perfect cycles, except that I changed the L values (the number of symbols assigned to each letter) a little bit.
Count Letter Symbols
5 A 1 2 3 4 5
1 B 6
2 C 7 8
3 D 9 10 11
7 E 12 13 14 15 16 17 18
1 F 19
1 G 20
4 H 21 22 23 24
4 I 25 26 27 28
0 J
1 K 29
3 L 30 31 32
2 M 33 34
4 N 35 36 37 38
5 O 39 40 41 42 43
1 P 44
0 Q
4 R 45 46 47 48
4 S 49 50 51 52
5 T 53 54 55 56 57
1 U 58
1 V 59
1 W 60
1 X 61
1 Y 62
0 Z 63
25 30 26 29 12 29 27 31 32 28 35 20 44 13 39 44 30
14 6 15 7 1 58 49 16 25 53 26 50 51 40 33 58 8
21 19 58 36 27 54 28 52 34 41 45 17 19 58 37 55 22
2 38 29 25 31 32 26 35 20 60 27 30 9 20 3 33 18
28 36 56 23 12 19 42 46 47 13 49 57 6 14 7 4 58
50 15 34 5 37 25 51 53 24 16 33 43 52 54 10 1 38
20 17 48 39 58 49 2 35 26 34 3 31 40 19 4 32 30
55 41 29 27 31 32 50 42 33 18 56 21 28 36 20 20 25
59 12 51 34 13 57 22 14 33 43 52 53 54 23 45 26 30
31 27 37 20 15 61 44 16 46 17 38 8 18 28 49 12 59
13 35 6 14 55 56 15 47 57 24 5 36 20 16 53 54 25
37 20 62 39 58 48 45 40 7 29 50 41 19 19 60 26 55
21 1 20 27 46 32 56 22 17 6 18 51 57 44 2 47 53
42 19 28 54 25 52 55 23 3 56 60 24 12 38 26 60 27
30 31 6 13 48 14 6 43 45 35 28 36 44 4 46 5 11
25 8 15 1 37 9 2 32 30 57 21 16 26 22 3 59 17
29 27 31 32 18 10 60 28 30 31 6 12 7 39 34 13 33
62 49 32 4 59 14 50 25 60 26 30 31 38 40 53 20 27
59 15 62 41 58 34 62 35 5 33 16 6 17 8 1 58 51
18 62 42 58 60 28 32 30 54 47 62 55 43 52 31 39 60
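As a quick sanity check on the key table and the grid above, here is a minimal Python sketch that turns the table into a symbol-to-letter lookup and decodes the first row. The names `KEY_TABLE`, `SYMBOL_TO_LETTER` and `decode` are made up for illustration; this is not the solver, just a direct table lookup.

```python
# Key table copied from the post: letter -> symbols assigned to it.
# J and Q have no symbols; 63 is listed under Z.
KEY_TABLE = {
    "A": [1, 2, 3, 4, 5], "B": [6], "C": [7, 8], "D": [9, 10, 11],
    "E": [12, 13, 14, 15, 16, 17, 18], "F": [19], "G": [20],
    "H": [21, 22, 23, 24], "I": [25, 26, 27, 28], "K": [29],
    "L": [30, 31, 32], "M": [33, 34], "N": [35, 36, 37, 38],
    "O": [39, 40, 41, 42, 43], "P": [44], "R": [45, 46, 47, 48],
    "S": [49, 50, 51, 52], "T": [53, 54, 55, 56, 57], "U": [58],
    "V": [59], "W": [60], "X": [61], "Y": [62], "Z": [63],
}

# Invert the table so each symbol points at its plaintext letter.
SYMBOL_TO_LETTER = {
    symbol: letter for letter, symbols in KEY_TABLE.items() for symbol in symbols
}


def decode(row: str) -> str:
    """Decode one space-separated row of symbol numbers with the known key."""
    return "".join(SYMBOL_TO_LETTER[int(token)] for token in row.split())


first_row = "25 30 26 29 12 29 27 31 32 28 35 20 44 13 39 44 30"
print(decode(first_row))  # ILIKEKILLINGPEOPL
```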
Perfect solve in 15 seconds with the wildcard n-gram files.
To make things easy, I made 63 the wildcard and letter Z. I substituted 63 at the positions where the q appears in the 340 (not that it really matters exactly where). L=1 for 63.
25 30 26 29 63 29 27 31 32 28 35 20 44 13 39 44 30
14 63 15 7 1 58 49 16 25 53 26 50 51 40 33 58 8
21 19 58 36 27 54 28 52 34 41 45 17 19 58 37 55 22
2 63 63 25 31 32 26 35 20 60 63 30 9 20 3 33 18
28 36 56 23 12 19 42 46 47 13 49 57 6 14 7 4 58
63 15 34 5 37 25 51 53 24 16 33 43 52 54 10 1 38
20 17 48 39 58 49 2 35 26 34 3 31 40 19 4 32 30
55 41 29 27 31 32 50 42 33 18 56 21 28 36 20 20 25
59 12 51 34 13 57 22 14 33 43 52 53 54 23 45 26 30
31 27 37 20 15 61 44 16 46 17 63 8 18 28 49 12 59
13 35 6 14 55 56 15 47 57 24 5 63 20 16 53 54 25
37 20 62 39 58 48 45 40 7 29 50 41 19 19 60 26 55
21 1 20 27 46 32 56 22 17 6 18 51 57 44 2 47 53
42 19 28 54 25 52 55 23 3 56 60 24 12 38 26 60 27
30 31 6 13 48 14 6 43 45 35 28 36 44 4 46 5 11
25 8 15 1 37 9 2 32 30 63 21 16 26 22 3 59 17
29 27 31 32 18 10 60 28 30 31 6 12 7 39 34 13 33
62 49 32 4 59 14 50 25 60 26 30 31 38 40 53 20 63
59 15 62 41 58 34 62 35 5 33 16 6 17 8 1 58 51
18 62 42 58 60 63 32 30 54 47 62 55 43 52 31 39 60
Solved in 4:45. Quite a bit longer, but it did solve with the Z's where they should be. For a while, the program thought that the Z's were S's, maybe because of expected frequency. I don't know.
Note that the two Z's on line 4 where the two q's are on the 340 are at the end of the word "than" and the beginning of the word "killing." So no two-wildcard n-grams were needed to solve.
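To see exactly which symbols were swapped out on that row, the two versions of row 4 can be compared position by position. This is only a sketch; `wildcard_substitutions` is a hypothetical helper name, and positions are counted from 0.

```python
def wildcard_substitutions(original_row, wildcard_row, wildcard=63):
    """Return (position, original_symbol) for every spot where the wildcard was put in."""
    original = [int(token) for token in original_row.split()]
    modified = [int(token) for token in wildcard_row.split()]
    if len(original) != len(modified):
        raise ValueError("rows must have the same number of symbols")
    return [
        (position, old)
        for position, (old, new) in enumerate(zip(original, modified))
        if new == wildcard and old != wildcard
    ]


# Row 4 of the first cipher and row 4 of the wildcard cipher, copied from the post.
ROW4_ORIGINAL = "2 38 29 25 31 32 26 35 20 60 27 30 9 20 3 33 18"
ROW4_WILDCARD = "2 63 63 25 31 32 26 35 20 60 63 30 9 20 3 33 18"

print(wildcard_substitutions(ROW4_ORIGINAL, ROW4_WILDCARD))
# [(1, 38), (2, 29), (10, 27)]
```

In the key table, 38 is an N, 29 is the K and 27 is an I, which matches "than", "killing" and "wild" on that row.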
ILIKZKILLINGPEOPL
EZECAUSEITISSOMUC
HFUNITISMOREFUNTH
AZZILLINGWZLDGAME
INTHEFORRESTBECAU
ZEMANISTHEMOSTDAN
GEROUSANIMALOFALL
TOKILLSOMETHINGGI
VESMETHEMOSTTHRIL
LINGEXPEREZCEISEV
ENBETTERTHAZGETTI
NGYOURROCKSOFFWIT
HAGIRLTHEBESTPART
OFITISTHATWHENIWI
LLBEREBORNINPARAD
ICEANDALLZHEIHAVE
KILLEDWILLBECOMEM
YSLAVESIWILLNOTGZ
VEYOUMYNAMEBECAUS
EYOUWZLLTRYTOSLOW
Symbol 63 (or Z) stands for E, B, N, K, I, S, N, N, T, I, and I, in that order.
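One way to double-check that list is to put the letters back into the solved grid in reading order and see whether the rows read cleanly. A small sketch, with `fill_wildcards` as a made-up helper name:

```python
# Solver output copied from the post; Z marks each wildcard position.
SOLVED_ROWS = [
    "ILIKZKILLINGPEOPL", "EZECAUSEITISSOMUC", "HFUNITISMOREFUNTH",
    "AZZILLINGWZLDGAME", "INTHEFORRESTBECAU", "ZEMANISTHEMOSTDAN",
    "GEROUSANIMALOFALL", "TOKILLSOMETHINGGI", "VESMETHEMOSTTHRIL",
    "LINGEXPEREZCEISEV", "ENBETTERTHAZGETTI", "NGYOURROCKSOFFWIT",
    "HAGIRLTHEBESTPART", "OFITISTHATWHENIWI", "LLBEREBORNINPARAD",
    "ICEANDALLZHEIHAVE", "KILLEDWILLBECOMEM", "YSLAVESIWILLNOTGZ",
    "VEYOUMYNAMEBECAUS", "EYOUWZLLTRYTOSLOW",
]
WILDCARD_LETTERS = "EBNKISNNTII"  # the order given in the post


def fill_wildcards(rows, letters, wildcard="Z"):
    """Replace each wildcard, in reading order, with the next letter from `letters`."""
    total = sum(row.count(wildcard) for row in rows)
    if total != len(letters):
        raise ValueError(f"{total} wildcards but {len(letters)} letters")
    remaining = iter(letters)
    return [
        "".join(next(remaining) if ch == wildcard else ch for ch in row)
        for row in rows
    ]


filled = fill_wildcards(SOLVED_ROWS, WILDCARD_LETTERS)
print(filled[0])  # ILIKEKILLINGPEOPL
print(filled[3])  # ANKILLINGWILDGAME
```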
So yeah, if we want to test whether one of the symbols is polyalphabetic, and no two occurrences of that symbol fall in the same n-gram, we can use the one-wildcard n-grams to solve. I think that is what I am finding. This assumes that we know the L counts; I am not sure how differences between the set L counts and the actual L counts affect things.
I will run another experiment before we decide whether to make two-wildcard n-grams.
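The "no two in the same n-gram" condition can be checked mechanically by sliding a window over the text and counting wildcards in each window. The sketch below assumes 5-grams, which is only a guess since the n-gram length is not stated here, and it looks at a single row rather than the whole message; `wildcard_window_counts` is a hypothetical name.

```python
from collections import Counter


def wildcard_window_counts(text, n, wildcard="Z"):
    """Count how many length-n windows contain 0, 1, 2, ... wildcards."""
    return Counter(
        text[start:start + n].count(wildcard)
        for start in range(len(text) - n + 1)
    )


row4 = "AZZILLINGWZLDGAME"  # row 4 of the solver output
counts = wildcard_window_counts(row4, 5)
print(dict(sorted(counts.items())))  # {0: 5, 1: 6, 2: 2}
```

Because the text has no spaces, the windows run across word boundaries, so the two adjacent Z's on this row do share a couple of windows under this way of counting. How the solver scores such windows is not something this sketch tries to model.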
Smokie