Re: Not all homophonic substitutions can be auto-solved
I'm glad to be of some small help with my suggestions. :) Using 6-grams should improve the solves even further, and 7-grams are not that impractical. If you keep each 7-gram score to 1 byte, which should be plenty for a log score, you "only" need 8,031,810,176 bytes (26 to the power of 7) to keep the 7-gram stats in memory in a flat array optimized for speed, or about 7.5 GiB of RAM in total. Not entirely out of the realm of possibility for modern computers. How much improvement you get from 7-grams vs 6-grams remains to be seen, as it will depend greatly on the size of your corpus at that point.
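To make the memory arithmetic concrete, here is a minimal Python sketch. It assumes a 26-letter uppercase alphabet and one byte per score; the names `table_bytes` and `ngram_index` are made up for illustration and are not taken from any existing solver.

```python
ALPHABET_SIZE = 26  # assumption: plain A-Z text, no spaces or punctuation


def table_bytes(n, bytes_per_score=1):
    """Size in bytes of a flat table holding one score per possible n-gram."""
    return ALPHABET_SIZE ** n * bytes_per_score


def ngram_index(ngram):
    """Map an uppercase n-gram to its slot in the flat table (base-26 number)."""
    index = 0
    for ch in ngram:
        index = index * ALPHABET_SIZE + (ord(ch) - ord("A"))
    return index


# Print the table size for a few n-gram lengths, in bytes and in GiB.
for n in (5, 6, 7):
    size = table_bytes(n)
    print(f"{n}-grams: {size:,} bytes = {size / 2**30:.2f} GiB")
```

Looking up a score is then a single array access at `ngram_index(...)`, which is the reason a flat array is fast even though most of its slots will be empty for a small corpus.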
Jarlve wrote: I don't want to go too much into program details, but it's very similar to simulated annealing (it performs about the same, too), and it was something I came up with myself before I even learned of SA. In general my program has not much intelligence and relies on its speed.
You don't actually need "much intelligence" to solve many problems. If you think about it, the hill-climbing algorithm is very dumb to begin with. I always thought that, in the simplest terms, it sounds just like how a 5-year-old would approach solving a problem. :) You just nudge the solution a bit in a random direction, and if it gets better, you keep it; otherwise you go back to the old solution and nudge some more. What could be simpler or dumber than that? And yet it can solve a wide range of very complex optimization problems. It just takes a lot of nudging and a few clever improvements to speed up the whole process. So not much intelligence and a lot of speed is a *very* good thing.
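The "nudge and keep it if it got better" loop can be sketched in a few lines of Python. This is only a toy under stated assumptions, not Jarlve's program or a real cipher solver: the "problem" is to recover a hidden 26-letter key, the score simply counts matching positions, and the names `hill_climb`, `swap_two` and `demo` are invented for the example. In a real solver the score would be something like the n-gram log score of the decrypted text.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def hill_climb(start, score, nudge, iterations, rng):
    """Keep a random nudge only when it strictly improves the score."""
    best = start
    best_score = score(best)
    for _ in range(iterations):
        candidate = nudge(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        # otherwise: throw the candidate away and nudge the old solution again
    return best, best_score


def swap_two(key, rng):
    """The nudge: swap two randomly chosen letters of the key."""
    i, j = rng.sample(range(len(key)), 2)
    chars = list(key)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


def demo(seed=0, iterations=20000):
    """Toy run: climb from the plain alphabet towards a hidden shuffled key."""
    rng = random.Random(seed)
    hidden = "".join(rng.sample(ALPHABET, len(ALPHABET)))

    def score(key):
        # Toy score: how many positions already match the hidden key.
        return sum(a == b for a, b in zip(key, hidden))

    best, best_score = hill_climb(ALPHABET, score, swap_two, iterations, rng)
    return best, best_score, hidden


if __name__ == "__main__":
    best, best_score, hidden = demo()
    print(best, best_score, best == hidden)
```

This toy score has no dead ends, because some swap always fixes at least one more position. Real n-gram scores are bumpier, so a plain climb can get stuck, which is where restarts or simulated-annealing-style acceptance come in.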