John’s been programming on the web since Gopher was a legitimate competitor. He is an independent consultant who specializes in machine learning, natural language processing, and how those are applied to the web.
Sessions for this user
The problem: you're using a modern dynamic language not known for speed, and you've identified a bottleneck. Write it in C? Does that give you the shakes? There are other language options available...
Can you perform simple arithmetic? Do you know how to program well enough to open and read files? Then you can write a Bayesian classifier, one of the machine learning techniques for predicting categories, most famous for its use in spam filters. Let's demystify this impressively named but ultimately simple process.
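To give a flavor of how little arithmetic is involved, here is a minimal sketch of a naive Bayes text classifier in Python. It is not taken from the session itself: the `NaiveBayes` class, the whitespace `tokenize` helper, the add-one (Laplace) smoothing, and the tiny in-memory training set are all assumptions made for illustration. A real spam filter would read its training messages from files and use a better tokenizer.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude, assumed tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Hypothetical minimal classifier: counts words per category, then compares log scores."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> occurrences
        self.doc_counts = Counter()              # category -> number of training documents
        self.vocab = set()                       # every word seen during training

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            # Start from the log prior: how common this category is overall.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing out the score.
                count = self.word_counts[category][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Made-up training data, for illustration only.
classifier = NaiveBayes()
classifier.train("win cash prizes now", "spam")
classifier.train("cheap pills win big", "spam")
classifier.train("meeting notes attached", "ham")
classifier.train("lunch tomorrow at noon", "ham")

print(classifier.classify("win cash now"))
print(classifier.classify("lunch meeting tomorrow"))
```

Summing logarithms instead of multiplying raw probabilities is a common choice here, since the product of many small fractions can underflow to zero on long messages.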