
Re: AZdecrypt 1.16

PostPosted: Wed Oct 09, 2019 10:38 am
by Jarlve
Thanks guys, it means a lot.

doranchak wrote:I really need to upgrade my machines to have more RAM. :)

RAM in computers used to double every few years but now it has stagnated - just when we need it. :)

Mr lowe wrote:Hi Jarlve.. I have only had 10 minutes with it but it seems to be lightning fast. I am looking forward to the weekend and spending some time getting acquainted with it. I will run through some old scytales that stalled to see if I can get any better results.

Cool, try it with beijinghouse's v4 7-grams if you can! If not, his 6-grams will do fine too.

Re: AZdecrypt 1.16

PostPosted: Sat Oct 12, 2019 8:22 am
by beijinghouse
doranchak wrote:Awesome work! I really need to upgrade my machines to have more RAM. :)


Just to clarify, my new v4 8-gram file only needs 14.35GB of memory to load.

I've been developing it on a system with 64GB of RAM, but it's possible you could load and run it on a 16GB system if you close absolutely everything else. Might be useful for testing how powerful the v4 8-grams are.

I may create a lower-memory version of the 8-grams in the future that only needs 3.6GB of memory if people say they want it. But it's also possible I'll make a higher-performance 8-gram file that needs closer to 52GB.

My advice would be to get at least 64GB of memory if you want to be able to use the best ngram files in the near future.
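For a rough sense of where figures like these come from, here is a back-of-envelope sketch. It assumes the scorer stores one fixed-size entry per unique n-gram (the bytes-per-entry values are illustrative assumptions, not the actual AZdecrypt file format):

```python
# Back-of-envelope estimate of n-gram table memory, assuming one
# fixed-size entry per unique n-gram. Entry sizes are illustrative
# assumptions, not the real storage format.

def table_gb(unique_ngrams: int, bytes_per_entry: int) -> float:
    """Memory in GB (10^9 bytes) for a flat table of fixed-size entries."""
    return unique_ngrams * bytes_per_entry / 1e9

# ~3.63 billion unique 8-grams (the v4 figure quoted later in this
# thread) at an assumed 4 bytes per entry lands in the same ballpark
# as the 14.35 GB mentioned above:
print(round(table_gb(3_631_818_052, 4), 2))  # ≈ 14.53
```

Halving or quadrupling the assumed bytes per entry is one plausible way a "lower memory" or "higher performance" variant of the same table could differ by these kinds of factors.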

Re: AZdecrypt 1.16

PostPosted: Sun Oct 13, 2019 10:17 am
by beldenge
Out of curiosity, what is the goal of producing higher-order ngram models? Are you guys observing higher quality solves using those models? Or are there some other benefits to using them? I was theorizing that if you use ngram models of too high order, the hill climber has trouble converging. I haven't proven that by any means, but I did some experimentation with LSTM models using Keras. You can train models of any input size, so theoretically you could run the equivalent of a 340-gram model against the 340 with a very low memory footprint, although at that high of an order it's basically useless. I'll do a little more research on that but hoped to gain some insight from you guys if possible.

EDIT: Here's a link to the project where I implement the LSTM model. I hope to make this a sister project of Zenith, but it's not ready to be released yet. Just wanted to share in case anyone has python/keras experience. https://bitbucket.org/beldenge/zenith

Re: AZdecrypt 1.16

PostPosted: Sun Oct 13, 2019 12:43 pm
by Jarlve
beldenge wrote:Out of curiosity, what is the goal of producing higher-order ngram models?

To be able to solve more difficult ciphers/hypotheses/problems of higher multiplicity.

beldenge wrote:I was theorizing that if you use ngram models of too high order, the hill climber has trouble converging.

True, it is something that beijinghouse and I have also discussed. But there are ways to overcome the convergence problem: faster hardware, denser n-grams, solver algorithm improvements, and/or including lower n-gram sizes as zkdecrypto does.

In general, the hill that the solver tries to climb grows narrower and spikier as the n-gram size increases, because neighbouring n-gram variations become sparser. For example, "DISCUSSIO" will have a good score, but a single key change that turns every S into an L gives "DILCULLIO", which may have a value of 0. It becomes more and more an all-or-nothing situation.
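A toy illustration of that all-or-nothing effect (this is not AZdecrypt's scorer, and the counts are made up): a high-order score is a sum of log counts over a sparse table, so one wrong key letter can zero out every n-gram it touches.

```python
from math import log10

# Hypothetical corpus counts for the two 8-grams inside "DISCUSSIO".
counts_8 = {"DISCUSSI": 1500, "ISCUSSIO": 1400}

def score(text: str, table: dict, n: int) -> float:
    """Sum log10(count) over all n-grams; unseen n-grams contribute 0."""
    total = 0.0
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        if gram in table:
            total += log10(table[gram])
    return total

good = score("DISCUSSIO", counts_8, 8)  # both 8-grams found -> positive
bad = score("DILCULLIO", counts_8, 8)   # one key change (S->L) kills both
print(good > 0 and bad == 0.0)          # True
```

Blending in lower orders (zkdecrypto-style) restores some gradient here: a 3-gram table would still give "DILCULLIO" partial credit for chunks like "ULL" and "LIO", so the climber gets feedback even when every high-order n-gram misses.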

Re: AZdecrypt 1.16

PostPosted: Sun Oct 13, 2019 7:20 pm
by beldenge
Thanks Jarlve, that makes a lot of sense. And when you say "denser" n-grams, can you please elaborate? Is it simply that the model is built with a larger corpus of data, so the "good" n-grams have more samples?

Re: AZdecrypt 1.16

PostPosted: Sun Oct 13, 2019 8:11 pm
by doranchak
beldenge wrote:Out of curiosity, what is the goal of producing higher-order ngram models? Are you guys observing higher quality solves using those models? Or are there some other benefits to using them? I was theorizing that if you use ngram models of too high order, the hill climber has trouble converging. I haven't proven that by any means, but I did some experimentation with LSTM models using Keras. You can train models of any input size, so theoretically you could run the equivalent of a 340-gram model against the 340 with a very low memory footprint, although at that high of an order it's basically useless. I'll do a little more research on that but hoped to gain some insight from you guys if possible.

EDIT: Here's a link to the project where I implement the LSTM model. I hope to make this a sister project of Zenith, but it's not ready to be released yet. Just wanted to share in case anyone has python/keras experience. https://bitbucket.org/beldenge/zenith


Funny you should mention Keras - I've been using exactly that for my talk coming this week for the Symposium on Cryptologic History. I've been playing around with using Keras+Tensorflow, trying to train it to classify hundreds of thousands of homophonic ciphers where some kind of step is performed before homophonic encipherment. Results are very rudimentary at the moment (strong signs that Z340 is not gibberish, and possible indications of route or transposition applied before homophonic encipherment) but I think this approach has a lot of potential. Glad to hear you are using it too!

Re: AZdecrypt 1.16

PostPosted: Mon Oct 14, 2019 12:35 am
by Jarlve
beldenge wrote:Is it simply that the model is built with a larger corpus of data, so the "good" n-grams have more samples?

Yes. For example, I am using a 500 GB corpus at the moment and it is not enough for good 8-grams. beijinghouse is using much more data: his 8-grams have 3,631,818,052 unique n-gram items, where mine have only 991,781,102.

Re: AZdecrypt 1.16

PostPosted: Mon Oct 14, 2019 12:42 am
by Jarlve
doranchak wrote:
Funny you should mention Keras - I've been using exactly that for my talk coming this week for the Symposium on Cryptologic History. I've been playing around with using Keras+Tensorflow, trying to train it to classify hundreds of thousands of homophonic ciphers where some kind of step is performed before homophonic encipherment. Results are very rudimentary at the moment (strong signs that Z340 is not gibberish, and possible indications of route or transposition applied before homophonic encipherment) but I think this approach has a lot of potential. Glad to hear you are using it too!

Good luck with your talk doranchak! What you say is very interesting and chimes in with our work and thoughts over the years. Great stuff.

Re: AZdecrypt 1.16

PostPosted: Mon Oct 14, 2019 8:14 pm
by beldenge
doranchak wrote:
Funny you should mention Keras - I've been using exactly that for my talk coming this week for the Symposium on Cryptologic History. I've been playing around with using Keras+Tensorflow, trying to train it to classify hundreds of thousands of homophonic ciphers where some kind of step is performed before homophonic encipherment. Results are very rudimentary at the moment (strong signs that Z340 is not gibberish, and possible indications of route or transposition applied before homophonic encipherment) but I think this approach has a lot of potential. Glad to hear you are using it too!


Wishing you the best with your talk. That's exciting stuff! Do you happen to know if it will be shared online? I would be very interested to listen.

Re: AZdecrypt 1.16

PostPosted: Tue Oct 15, 2019 8:31 am
by doranchak
beldenge wrote:Wishing you the best with your talk. That's exciting stuff! Do you happen to know if it will be shared online? I would be very interested to listen.

Thanks. I plan to record the audio and will try to put together a video with the slides on YouTube.