Regarding Lucene & Nutch

wku_kunal
Hello Everyone,
   
  I am using Lucene and Nutch in my project to search the content of webpages.
For a webpage or any other document, Lucene indexes all the words on the page and returns the matching documents at search time.
   
  Let's say I have two webpages, as shown below:
   
  Webpage1
----------------------------------------------------------------------
This is the course page of Computer Science Department
  Subject: Operating System I
Professor: Qi Li
  Details:
The course Operating System I deals with the basics of the operating system. The three main topics covered are process management, storage management and memory management, etc............................................
..................................................................
----------------------------------------------------------------------
   
  Webpage2
----------------------------------------------------------------------
This is the home page of Computer Science Department
  The Computer Science Department offers courses at undergraduate level and
graduate level. The core courses for the graduate students are Mathematical Foundations of Computer Science, Compilers, Advanced Database, Analysis of Algorithms and Operating Systems. etc............................
..................................................................
----------------------------------------------------------------------
   
  Now if I search for the phrase "operating system", the results show both webpages (webpage1 and webpage2), since the phrase "operating system" occurs in both of them.
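
  The behaviour described above can be reproduced outside Lucene with a small sketch. This is plain Python, not the Lucene or Nutch API: the names tokenize, build_index and phrase_search are made up for illustration, the page texts are shortened from the examples above, and the crude plural stripping is only an assumption standing in for whatever analysis the real indexer applies.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase, keep letter runs, and strip a trailing 's' from longer words
    (a crude, assumed stand-in for real stemming)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words]

def build_index(pages):
    """Map each token to {page_id: [positions]} over the whole page text."""
    index = defaultdict(lambda: defaultdict(list))
    for page_id, text in pages.items():
        for pos, tok in enumerate(tokenize(text)):
            index[tok][page_id].append(pos)
    return index

def phrase_search(index, phrase):
    """Return sorted ids of pages where the phrase tokens occur consecutively."""
    toks = tokenize(phrase)
    if not toks:
        return []
    hits = []
    for page_id, positions in index.get(toks[0], {}).items():
        for start in positions:
            if all(start + i in index.get(t, {}).get(page_id, [])
                   for i, t in enumerate(toks)):
                hits.append(page_id)
                break
    return sorted(hits)

# Shortened versions of the two example pages; the whole text is one blob.
pages = {
    "webpage1": "This is the course page of Computer Science Department. "
                "Subject: Operating System I. Professor: Qi Li. "
                "Details: The course operating system I deals with the "
                "basics of the operating system.",
    "webpage2": "This is the home page of Computer Science Department. "
                "The core courses for the graduate students are Compilers, "
                "Advanced Database, Analysis of Algorithms and Operating Systems.",
}

index = build_index(pages)
print(phrase_search(index, "operating system"))
```

  Because every word of a page goes into one undifferentiated index, the search cannot tell a subject line from ordinary body text, so both pages come back.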
   
  But my requirement is different. I want to search for "Operating System" only in the subject field, as it appears in webpage1, so that the result shows only webpage1. How can I achieve this?
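
  To make the requirement concrete, here is a rough sketch of the desired behaviour in plain Python rather than the Lucene API. It assumes the subject line has already been pulled out of each page into its own field, which is a hypothetical extraction step; the field names "subject" and "body" and the helper search_field are invented for this example.

```python
def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens appears as a consecutive run inside tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))

def search_field(docs, field, phrase):
    """Return ids of documents whose given field contains the phrase."""
    wanted = tokenize(phrase)
    if not wanted:
        return []
    return [doc_id for doc_id, fields in docs.items()
            if contains_phrase(tokenize(fields.get(field, "")), wanted)]

# Each page is stored as separate fields instead of one blob of text.
# webpage2 has no subject line, so it has no "subject" field at all.
docs = {
    "webpage1": {
        "subject": "Operating System I",
        "body": "The course operating system I deals with the basics "
                "of the operating system.",
    },
    "webpage2": {
        "body": "The core courses for the graduate students are Compilers, "
                "Advanced Database, Analysis of Algorithms and Operating Systems.",
    },
}

print(search_field(docs, "subject", "Operating System"))
```

  Restricting the match to the subject field leaves only webpage1, which is the result I am after.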
   
  Please help me in this regard.
  Thanks & Regards,
Kunal Gosar


       