HTML meta tags in index


Michael Piccuirro
I'm using Nutch to crawl my site.  I've successfully gone through the
tutorial and can search the index it creates.  Now I want to be able to
include the meta tags from those pages in the documents in the index.  I
would like the standard "description" and "keywords" tags, as well as a couple
of custom ones like "thumbnail", to be available in my search results page.

So I've been doing a lot of RTFM'ing and the closest thing I can find is the
plugin example which demonstrates how to get a "recommended" meta tag and
increase the boost.  So currently I'm prepared to write a plugin that reads
all the meta tags I need to use and add them to the index.
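
To make the idea concrete, here is a rough sketch of just the extraction step such a plugin would need. It is plain Java and deliberately uses no Nutch APIs; the class name MetaTagExtractor, the extract method, and the regex-based parsing are all assumptions made for illustration. A real plugin would more likely work from the parsed document that the crawler already provides.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: pulls selected <meta> tags out of raw HTML.
public class MetaTagExtractor {
    // Matches a whole <meta ...> tag, case-insensitively.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches attr="value" or attr='value' inside a tag.
    private static final Pattern ATTRIBUTE =
        Pattern.compile("([a-zA-Z-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns name -> content for every meta tag whose (lower-cased) name is in 'wanted'.
    // If a name occurs more than once, the first occurrence wins.
    public static Map<String, String> extract(String html, Set<String> wanted) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            String name = null;
            String content = null;
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                String value = attr.group(3) != null ? attr.group(3) : attr.group(4);
                if (key.equals("name")) {
                    name = value.toLowerCase(Locale.ROOT);
                } else if (key.equals("content")) {
                    content = value;
                }
            }
            if (name != null && content != null && wanted.contains(name)) {
                result.putIfAbsent(name, content);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<meta name=\"description\" content=\"A sample page\">"
            + "<meta name=\"keywords\" content=\"nutch, search\">"
            + "<meta content='/img/thumb.png' name='thumbnail'>"
            + "<meta name=\"robots\" content=\"index\">"
            + "</head></html>";
        Map<String, String> tags =
            extract(html, Set.of("description", "keywords", "thumbnail"));
        // Each entry would become one extra field on the indexed document.
        tags.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each entry in the returned map would then be added as a field on the indexed document, which is the part that depends on the plugin interfaces.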

My question is, am I on the right track by building the plugin?  Or is there
an easier out-of-the-box way to include the meta tag information?

Thanks a lot in advance for any help.