Largo wrote: This does not happen with all plaintexts though. If you take the first 340 letters of z408 and do the transposition described above, the p19/15 peaks are preserved after the homophonic substitution. Obviously this depends on the plaintext and the key.
To me it looks possible that the p19/15 peaks in z340 could be a false positive and the "real" period is a different one. AZDecrypt solves the cipher shown above within a minute (mode "Solve + transpose"). So a "false" period peak could not be the only problem with z340.
Has this been discussed before?
Largo: Some thoughts. The 340 has a spike at p 15/19, but the bigram repeat count at period 1 is low. Evidence of transposition. If you do not transpose the plaintext, it is difficult to match these statistics. You can start with a plaintext that has a lower than average p1 and higher than average p19, then manipulate the key so that the p 1 bigrams are diffused more and the p 19 bigrams are diffused less. See this:
viewtopic.php?f=81&t=2617&p=43811&hilit=smokie18e#p43811
If you do not transpose the plaintext, then it is very difficult to match the 340's p 15/19 stats. If you do transpose the plaintext, then it is much easier to match them, and you can get phantom spikes at other periods depending on the plaintext and the key. I have been thinking about a small project: make one of these messages and then dissect it.
If the 340 isn't a transposition at periods 15 or 19, then how do we know it is a transposition at all? If it were, then Jarlve's program would have solved it anyway.
Pretty much everything you just said. A handful of null symbols, or plaintext or symbols skipped through transcription errors, would cause misalignments in the untransposed message. I find it very plausible that someone could easily skip a few.
Check here for a list of possible transposition issues, and an attack plan that might be usable if there are disruptions or misalignments:
viewtopic.php?f=81&t=3196&start=330
That is why I am so excited about the independent row solver. I am hoping that the message can be solved with the "sliding areas" plan and the independent row solver.
EDIT: Also consider scoring the bigram repeats according to probability. Here is my formula:
LN ( 1 / ( ( ( COUNT OF A / 340 ) * ( COUNT OF B / 340 ) ) ^ 2 ) )
Let's say you have only four of symbol A and only six of symbol B, and there are three occurrences of AB at p 19. That is pretty good evidence. Make a list of all bigram repeats, score them, sort them by score, and then graph the distribution. It is far easier to match these stats at the correct period than at a phantom period or with a randomly shuffled message.
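For anyone who wants to try the scoring, here is a rough Python sketch of the steps above: collect the bigrams at a given period, keep the ones that repeat, score each with the formula, and sort by score. The function names are made up for illustration, and two things are assumptions on my part: a period-p bigram is taken as the pair of symbols at positions i and i + p with no wrap-around at the end of the message, and the message length replaces the hard-coded 340 so the same code works on test ciphers of other sizes.

```python
import math
from collections import Counter


def period_bigrams(symbols, period):
    """Count bigrams made of symbols that sit `period` positions apart.

    Assumption: no wrap-around, so only positions i and i + period that
    both fall inside the message are paired.
    """
    return Counter(
        (symbols[i], symbols[i + period])
        for i in range(len(symbols) - period)
    )


def score_bigram(count_a, count_b, length):
    """LN(1 / ((count_a / length) * (count_b / length)) ^ 2), as in the post."""
    p = (count_a / length) * (count_b / length)
    return math.log(1.0 / p ** 2)


def scored_repeats(symbols, period):
    """Return (score, bigram, occurrences) for each repeated bigram, best first."""
    freq = Counter(symbols)      # how often each symbol occurs in the message
    n = len(symbols)
    rows = [
        (score_bigram(freq[a], freq[b], n), (a, b), k)
        for (a, b), k in period_bigrams(symbols, period).items()
        if k >= 2                # a "repeat" means the bigram occurs at least twice
    ]
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows
```

With the numbers from the example, score_bigram(4, 6, 340) comes out at about 16.96. Calling scored_repeats(cipher, 19) on a list of symbols gives the sorted list whose score distribution can then be graphed and compared against other periods or a shuffled copy of the message.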