Shuffle experiments show that + will fall only on 0 or 1 prime positions in 3% of shuffles. In 0.7% of shuffles, + and B each fall on 0 or 1 prime positions. So, I can't easily dismiss the phenomenon as coincidence. More info here: http://www.zodiackillerciphers.com/?p=319
While this primephobia of the most frequent symbols could be a coincidence, it could also be a symptom of the cipher's construction methodology.
The purpose of this post is to present two encryption methods that substantially increase the odds of inducing such primephobia in the resulting ciphers.
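As a quick sanity check on the single-symbol shuffle figure quoted at the top, the chance of a symbol landing on 0 or 1 prime positions under a uniform shuffle can be computed exactly instead of simulated. This is only a sketch: the count of 24 occurrences for + is my assumption (it is not stated in this post), and the function names are made up for illustration.

```python
from math import comb

def primes_up_to(n):
    """Return the set of primes <= n using a simple sieve (n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return {i for i, flag in enumerate(sieve) if flag}

def prob_at_most_one_prime(symbol_count, length=340):
    """Probability that a symbol occurring symbol_count times lands on
    0 or 1 prime positions when all positions are shuffled uniformly."""
    n_primes = len(primes_up_to(length))
    non_primes = length - n_primes
    total = comb(length, symbol_count)
    zero_hits = comb(non_primes, symbol_count)
    one_hit = n_primes * comb(non_primes, symbol_count - 1) if symbol_count >= 1 else 0
    return (zero_hits + one_hit) / total

# ASSUMPTION: 24 occurrences of "+" (not given in this post).
print(len(primes_up_to(340)))       # how many positions in 1..340 are prime
print(prob_at_most_one_prime(24))   # compare with the ~3% shuffle figure
```

If the assumed count is right, the result should land in the same neighbourhood as the 3% obtained by shuffling. The joint figure for + and B is not covered here, since the two symbols compete for the same positions.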
Primes in columns
When listing a series of numbers in a table format, an interesting phenomenon can be observed. For example, let's list all numbers from 1 to 340 in a table of 6 columns. In this table, I have highlighted all the prime numbers in orange:
You'll notice that, if we exclude the very first line of numbers, all the prime numbers are positioned in columns 1 and 5. Columns 2, 3, 4 and 6, highlighted in green, are prime-safe (again, excluding the first line), meaning that no prime number can be found in these columns. This is because every number in columns 2, 4 and 6 is divisible by 2 and every number in column 3 is divisible by 3.
The appearance of these "prime-safe" columns is entirely dependent on the number of columns chosen to display the list of numbers. As a second example, here is the same list of numbers organised in a 7-column table:
You'll notice that, excluding the first line, only the 7th column is prime-safe, since all the numbers in that column are at least divisible by 7. All other columns potentially can host a prime number.
Here are a few examples of prime-safe columns according to the number of columns used to display the list of numbers:
# Columns    Prime-safe columns
---------------------------------------
5            5
6            2, 3, 4, 6
7            7
8            2, 4, 6, 8
9            3, 6, 9
10           2, 4, 5, 6, 8, 10
...
17           17
...
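The table above can be reproduced with a few lines of Python. This is a minimal sketch resting on one observation: column c of an n-column table holds c, c + n, c + 2n, and so on, so if c and n share a factor greater than 1, every number in that column is divisible by that factor. The function name is mine, not something from the original analysis.

```python
from math import gcd

def prime_safe_columns(n_columns):
    """Columns (1-indexed) whose entries below the first row are never prime.

    Column c holds c, c + n, c + 2n, ...; if c and n share a factor g > 1,
    every entry is divisible by g, and entries below the first row exceed g.
    """
    return [c for c in range(1, n_columns + 1) if gcd(c, n_columns) > 1]

for n in (5, 6, 7, 8, 9, 10, 17):
    print(n, prime_safe_columns(n))
```

A prime column count such as 7 or 17 leaves only its last column prime-safe, while a count with many small factors, such as 6 or 10, leaves most columns prime-safe.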
If a cipher construction method were to intrinsically exploit this prime-safe column phenomenon, it would increase the probability of yielding primephobic ciphers. In other words, if the construction method somehow funneled high-frequency symbols into prime-safe columns, it would greatly increase the yield of primephobic ciphers.
Recipe #1: Vigenère
The Vigenère cipher is a method of encrypting alphabetic text by using a series of different Caesar ciphers based on the letters of a keyword. It is a simple form of polyalphabetic substitution.
[...]
In a Caesar cipher, each letter of the alphabet is shifted along some number of places; for example, in a Caesar cipher of shift 3, A would become D, B would become E, Y would become B and so on. The Vigenère cipher consists of several Caesar ciphers in sequence with different shift values.
[...]
The alphabet used at each point depends on a repeating keyword. - Wikipedia
This repeating keyword makes the Vigenère encoding very cyclical. If the keyword is 5 characters long, there will be 5 different encoding alphabets, repeated over and over: 1,2,3,4,5,1,2,3,4,5,1,2,etc. Another way to look at this: for a 5-letter keyword, if the plaintext is displayed in a grid of 5 columns, every letter in a given column is encoded with the same alphabet. This cyclical quality of Vigenère is therefore very compatible with the prime-safe notion explained above.
For example, a random English plaintext of 340 characters would contain on average about 43 occurrences of the letter E and 30 of the letter T. If this plaintext is laid out in a grid of 6 columns, these letters are spread randomly across all columns. Now, let's say we encode this plaintext using Vigenère with the keyword "QDEEZE". When an E in the plaintext is encoded with an E in the keyword, an "I" is obtained. When a T in the plaintext is encoded with a D in the keyword, a "W" is obtained. Since the keyword is 6 characters long, and the keyword letters D and E sit in positions 2, 3, 4 and 6 (all prime-safe columns), this encoding process funnels a large share of the resulting I and W symbols into prime-safe columns.
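To make the "QDEEZE" example concrete, here is a minimal sketch of standard Vigenère encryption (A = shift 0, uppercase A-Z only), plus a helper that lists the keyword columns in which a given plaintext letter turns into a given cipher letter. The helper name columns_producing is hypothetical, and this is not necessarily the code used for the experiments.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere_encrypt(plaintext, keyword):
    """Encrypt uppercase A-Z text with the standard Vigenère rule (A = shift 0)."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ALPHABET.index(keyword[i % len(keyword)])
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)

def columns_producing(plain_letter, cipher_letter, keyword):
    """1-indexed keyword columns where plain_letter encrypts to cipher_letter."""
    return [i + 1 for i, k in enumerate(keyword)
            if vigenere_encrypt(plain_letter, k) == cipher_letter]

KEY = "QDEEZE"
print(vigenere_encrypt("EEEEEE", KEY))   # one E under each keyword letter
print(columns_producing("E", "I", KEY))  # columns where E becomes I
print(columns_producing("T", "W", KEY))  # columns where T becomes W
```

Note that I and W can still appear in columns 1 and 5 when less frequent plaintext letters meet the Q or Z of the keyword (S under Q gives I, for instance), which is presumably one reason the yield stays well below 100%.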
By generating random English plaintexts of 340 characters, Vigenère-encoding them with the "QDEEZE" keyword, and keeping only the resulting ciphers in which the symbols I and W total 36 occurrences (to mimic the frequency of + and B in the z340), we get a staggering 54% of ciphers whose primephobia on these symbols is equal to or higher than that of + and B in the z340. By comparison, only 0.7% of random shuffles of the z340 exhibit primephobia equal to or higher than the original z340.
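The measurement behind these percentages boils down to counting how often a symbol sits on a prime-numbered position. The sketch below shows one way to do that. The "at most 1 prime position per symbol" threshold is taken from the shuffle description at the top of the post and is my assumption about how "equal or higher" primephobia was scored; the function names are made up.

```python
def is_prime(n):
    """Trial-division primality test, adequate for positions up to 340."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def prime_hits(ciphertext, symbol):
    """Count occurrences of symbol at prime positions (positions start at 1)."""
    return sum(1 for pos, ch in enumerate(ciphertext, start=1)
               if ch == symbol and is_prime(pos))

def is_primephobic(ciphertext, symbols, max_hits=1):
    """True if every symbol lands on at most max_hits prime positions.

    ASSUMPTION: max_hits=1 mirrors the "0 or 1 prime positions" criterion.
    """
    return all(prime_hits(ciphertext, s) <= max_hits for s in symbols)

sample = "XIWXIXWXXXIX"   # toy text: I at positions 2, 5, 11; W at 3, 7
print(prime_hits(sample, "I"), prime_hits(sample, "W"))
print(is_primephobic(sample, "IW"))
```

Running is_primephobic over many encoded plaintexts and dividing the number of True results by the number of trials would give a yield comparable to the figures above, provided the plaintext source and the filter on 36 total occurrences match.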
The length of the keyword, its letters and their positions in the keyword will have a dramatic impact on the likelihood of producing a primephobic cipher.
Recipe #2: Progressive key polyalphabetic cipher
This method consists in switching encoding alphabets for each letter of the plaintext. If 5 alphabets are defined, the 1st plaintext letter is encoded using alphabet #1, the 2nd with alphabet #2, etc. The 6th letter is encoded with alphabet #1 and so forth... The number of defined alphabets will dictate how frequently the encoder cycles through these alphabets. This is the same principle as the number of characters in a Vigenère keyword.
For example, let's consider this partial encoding table consisting of 6 alphabets where only the +, B and X symbols are mapped:
This means that a + symbol would be decoded to either an E, A or D, depending on where that symbol is found (i.e. which alphabet is used), and a B symbol would correspond to the letters B, C or H. With such an encoding table, the + and B symbols would only fall in columns 2, 3, 4 and 6 (all prime-safe columns), so close to 100% of the resulting ciphers would be more primephobic than the z340.
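Here is a small sketch of how such a cycling scheme pins symbols to columns. The partial table in the code is a hypothetical stand-in that merely agrees with the description above (+ for E, A or D; B for B, C or H; alphabets 2, 3, 4 and 6 only). It is not the actual table from this post, and unmapped letters are simply shown as dots.

```python
# HYPOTHETICAL partial table, for illustration only. Alphabets are numbered
# 1..6 and only the entries that emit "+" or "B" are filled in.
PARTIAL_TABLE = {
    2: {"E": "+", "B": "B"},
    3: {"A": "+", "C": "B"},
    4: {"D": "+", "H": "B"},
    6: {"E": "+", "B": "B"},
}
N_ALPHABETS = 6

def encode(plaintext):
    """Cycle through the alphabets; unmapped letters become '.' placeholders."""
    out = []
    for i, ch in enumerate(plaintext):
        alphabet = i % N_ALPHABETS + 1
        out.append(PARTIAL_TABLE.get(alphabet, {}).get(ch, "."))
    return "".join(out)

def columns_used(ciphertext, symbol):
    """Sorted list of 1-indexed columns (of a 6-column grid) holding symbol."""
    return sorted({i % N_ALPHABETS + 1
                   for i, ch in enumerate(ciphertext) if ch == symbol})

cipher = encode("AEADAEXBCHXB" * 5)
print(cipher[:12])
print(columns_used(cipher, "+"), columns_used(cipher, "B"))
```

Because alphabets 1 and 5 never emit + or B in this stand-in table, the only prime positions either symbol can ever occupy are 2 and 3 in the first row, which fits the "close to 100%" estimate.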
Again, the number of alphabets and how the symbols are assigned to plaintext letters will greatly affect the primephobic cipher yield. But, as with Vigenère, the interesting conclusion is that both these encoding schemes have a demonstrable and significant impact on primephobia by concentrating symbols in prime-safe columns.
I think it is possible that the primephobia exhibited by the z340 is an indication that a similar cyclical approach (polyalphabetic or otherwise) was used, under conditions that favored concentrating the + and B symbols in prime-safe columns.
_pi
