I ran across this in one of my courses and I thought it might be interesting to the Nutch Project:

It's an article about using reinforcement learning to optimize a spider's performance when it crawls for specific content, so that the results can feed a domain-specific search later on. Their experiment looks for computer science research papers, and does it quite well.

There are a few similar papers, including a more recent one (I haven't looked at that one in depth). If you follow the links in the bibliography and the similar/related documents, you'll find many more interesting articles on this topic.

I think it would be great to implement this, especially for portal-like applications. Just a thought.

