[jira] Created: (NUTCH-480) Searching multiple indexes with a single nutch instance


Prajeeth Emanuel (Jira)
Searching multiple indexes with a single nutch instance
-------------------------------------------------------

                 Key: NUTCH-480
                 URL: https://issues.apache.org/jira/browse/NUTCH-480
             Project: Nutch
          Issue Type: Improvement
          Components: searcher, web gui
    Affects Versions: 0.8
         Environment: Linux and Windows
            Reporter: Ravi Chintakunta


Searching across multiple indexes with a single instance of Nutch would be a useful improvement. I needed this for my production site, where we list the available categories (indexes) as check boxes so that the user can select any combination of indexes to search. The results page also displays the number of hits in each index.

To do this:

- I modified web.xml to include the paths to various search indexes
- Modified Nutch.java to read all the indexes and create IndexReaders
- Modified IndexSearcher.java to handle multiple IndexReaders
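
As a rough illustration of the idea in the steps above (search only the indexes the user selected and report the hits per index), here is a minimal, dependency-free sketch. It is not the code from the patch and uses no Nutch or Lucene classes: the class name MultiIndexSearchSketch, the search method, and the modeling of each index as a named list of document strings are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: not Nutch or Lucene code. Each "index" is modeled as a
// named list of document strings so the example runs without dependencies.
public class MultiIndexSearchSketch {

    // For each selected index name, collect the documents that contain the
    // term (case-insensitive). Unknown index names are skipped.
    static Map<String, List<String>> search(Map<String, List<String>> indexes,
                                            List<String> selected,
                                            String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        Map<String, List<String>> hitsByIndex = new LinkedHashMap<>();
        for (String name : selected) {
            List<String> docs = indexes.get(name);
            if (docs == null) {
                continue; // index not configured: skip rather than fail
            }
            List<String> hits = new ArrayList<>();
            for (String doc : docs) {
                if (doc.toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(doc);
                }
            }
            hitsByIndex.put(name, hits);
        }
        return hitsByIndex;
    }

    public static void main(String[] args) {
        // Three made-up indexes; the user ticks only "news" and "docs".
        Map<String, List<String>> indexes = new LinkedHashMap<>();
        indexes.put("news", Arrays.asList("Nutch 0.8 released", "Weather report"));
        indexes.put("docs", Arrays.asList("Nutch tutorial", "Nutch plugin guide"));
        indexes.put("blogs", Arrays.asList("Cooking at home"));

        Map<String, List<String>> result =
                search(indexes, Arrays.asList("news", "docs"), "nutch");

        // Print the number of hits in each selected index.
        for (Map.Entry<String, List<String>> e : result.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " hit(s)");
        }
    }
}
```

Keeping the hits grouped by index name, rather than merging them into one list straight away, is what makes the per-index hit count on the results page cheap to produce. A real implementation would still need to merge and rank the groups by score.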

In the attached file you will find the patch to the Nutch 0.8 code base and also the newly added files:

- SearchServlet - a servlet that is the web interface for search. This is a simplified version of the JSP pages (without the i18n) and outputs the results in text, XML or JSON format.
- SearchConstants - an interface for messages and constants
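
To make the output-format idea concrete, here is a small sketch of rendering per-index hit counts as either text or JSON, chosen by a format parameter. It is not the SearchServlet from the patch: the class ResultFormatSketch, its render and jsonString methods, and the format names "text" and "json" are assumptions for illustration, and XML output is left out to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not the SearchServlet from the patch.
public class ResultFormatSketch {

    // Quote a string as a JSON string literal, escaping quotes, backslashes
    // and control characters below U+0020.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Render hit counts per index in the requested format ("text" or "json").
    static String render(Map<String, Integer> hitCounts, String format) {
        StringBuilder out = new StringBuilder();
        if ("json".equals(format)) {
            out.append('{');
            boolean first = true;
            for (Map.Entry<String, Integer> e : hitCounts.entrySet()) {
                if (!first) {
                    out.append(',');
                }
                out.append(jsonString(e.getKey())).append(':').append(e.getValue());
                first = false;
            }
            out.append('}');
        } else if ("text".equals(format)) {
            for (Map.Entry<String, Integer> e : hitCounts.entrySet()) {
                out.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
            }
        } else {
            throw new IllegalArgumentException("unsupported format: " + format);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("news", 1);
        counts.put("docs", 2);
        System.out.println(render(counts, "json"));
        System.out.print(render(counts, "text"));
    }
}
```

Rejecting unknown format names with an exception, instead of silently falling back to text, is a design choice of this sketch. A servlet might prefer to fall back to a default format.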

Please note that the patch also includes spell-check functionality, i.e. "Did you mean?" suggestions.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (NUTCH-480) Searching multiple indexes with a single nutch instance

Prajeeth Emanuel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Chintakunta updated NUTCH-480:
-----------------------------------

    Attachment: nutch.zip

A patch that lets a single instance of Nutch search multiple indexes.

