[jira] [Created] (NUTCH-2495) Use -deleteGone instead of clean job in crawler script while indexing


JIRA jira@apache.org
Moreno Feltscher created NUTCH-2495:
---------------------------------------

             Summary: Use -deleteGone instead of clean job in crawler script while indexing
                 Key: NUTCH-2495
                 URL: https://issues.apache.org/jira/browse/NUTCH-2495
             Project: Nutch
          Issue Type: Improvement
            Reporter: Moreno Feltscher
            Assignee: Moreno Feltscher


Instead of running {{bin/nutch clean}} after indexing the documents, run {{bin/nutch index}} with the {{-deleteGone}} flag. Rather than deleting only gone and duplicate documents, this also deletes redirects from the index.
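
A rough sketch of what the indexing step might look like after this change. The crawl directory and segment name are hypothetical placeholders, and the real crawler script may assemble the command differently. The sketch only prints the command (a dry run) and does not invoke Nutch.

```shell
#!/bin/sh
# Hypothetical values for illustration only; adjust to the actual crawl layout.
CRAWL_PATH="crawl"
SEGMENT="20180101000000"

# Build the index command with -deleteGone appended, so that no separate
# "bin/nutch clean" step is needed afterwards.
index_cmd() {
  echo "bin/nutch index $CRAWL_PATH/crawldb -linkdb $CRAWL_PATH/linkdb $CRAWL_PATH/segments/$SEGMENT -deleteGone"
}

# Dry run: print the command instead of executing it.
index_cmd
```

To actually run the step, the printed command would be executed from the Nutch runtime directory in place of the current index and clean calls.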



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)