IndexMerger and non-nutch Lucene indexes

Brian Whitman
We use Solr to inject non-Nutch crawled Lucene documents into the
main Nutch index. This works fine: we can search (not using the
Nutch searcher) both Nutch docs and the Solr-injected docs with one

However, I would like to use the IndexMerger for merging successive  
Nutch crawls. If one of the index directories we give bin/nutch merge  
has Solr-generated Lucene docs in it, we get:

2007-01-26 02:49:34,093 INFO  indexer.IndexMerger - merging indexes to: crawl/index
2007-01-26 02:49:34,094 INFO  indexer.IndexMerger - Adding crawl/
2007-01-26 02:49:34,102 INFO  indexer.IndexMerger - Adding crawl/
2007-01-26 02:49:34,106 FATAL indexer.IndexMerger - IndexMerger: crawl/index/_0.fnm not a directory
         at org.apache.nutch.indexer.FsDirectory.<init>
         at org.apache.nutch.indexer.IndexMerger.merge
         at org.apache.hadoop.util.ToolBase.doMain(
         at org.apache.nutch.indexer.IndexMerger.main

Any way around this?
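
The FATAL line above names a file, crawl/index/_0.fnm, where the merger expected a directory. As a rough diagnostic sketch, assuming (from that message alone, not from the Nutch source) that every entry directly under a path handed to the merge command has to be a directory, the Python snippet below lists the entries that are plain files. The function name non_directory_children and the sample entry name part-00000 are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def non_directory_children(index_parent):
    """Return the sorted names of entries under index_parent that are not directories."""
    return sorted(
        name
        for name in os.listdir(index_parent)
        if not os.path.isdir(os.path.join(index_parent, name))
    )


# Demo on a throwaway directory: one sub-directory (hypothetical name)
# and one plain file named like the one in the error message.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "part-00000"))
    open(os.path.join(root, "_0.fnm"), "w").close()
    offenders = non_directory_children(root)

print(offenders)
```

Running the function on each path given to the merge command would show which paths contain loose files rather than only sub-directories, which may help narrow down the argument that triggers the error.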