Problem with sorting index

Michael-49
When I try to use IndexSorter, I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: attempt to access a deleted document
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:282)
        at org.apache.lucene.index.FilterIndexReader.document(FilterIndexReader.java:104)
        at org.apache.nutch.indexer.IndexSorter$SortingReader.document(IndexSorter.java:170)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:186)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:579)
        at org.apache.nutch.indexer.IndexSorter.sort(IndexSorter.java:240)
        at org.apache.nutch.indexer.IndexSorter.main(IndexSorter.java:291)
       
Does anyone know how to fix this?
 

Michael


Re: Problem with sorting index

Doug Cutting
It sounds like you're sorting a segment index after dedup, rather than a
merged index.  It also looks like there's a bug in IndexSorter.  But you
should be able to work around it by merging your segment indexes after
deduping, so there are no deletions.
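
To illustrate why merging first helps, here is a minimal sketch in plain Java. It is a toy model, not the Lucene or Nutch API: the ToyIndex class and its merge and sortAll methods are hypothetical names invented for this example. It assumes the sorter reads every document ID without checking the deletion bitmap, which is what the stack trace suggests but which has not been confirmed against the IndexSorter source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

/** Toy model of an index: stored documents plus a deletion bitmap. Not Lucene's API. */
class ToyIndex {
    final List<String> docs;  // one stored document per doc ID
    final BitSet deleted;     // set bit = document marked deleted (e.g. by dedup)

    ToyIndex(List<String> docs, BitSet deleted) {
        this.docs = docs;
        this.deleted = deleted;
    }

    int maxDoc() {
        return docs.size();
    }

    /** Mirrors the error in the stack trace: reading a deleted document fails. */
    String document(int id) {
        if (deleted.get(id)) {
            throw new IllegalArgumentException("attempt to access a deleted document");
        }
        return docs.get(id);
    }

    /** Merge step: copies only live documents, so the result has no deletions. */
    static ToyIndex merge(List<ToyIndex> segments) {
        List<String> live = new ArrayList<>();
        for (ToyIndex seg : segments) {
            for (int id = 0; id < seg.maxDoc(); id++) {
                if (!seg.deleted.get(id)) {
                    live.add(seg.docs.get(id));
                }
            }
        }
        return new ToyIndex(live, new BitSet());
    }

    /** Assumed sorter behaviour: visits every doc ID without checking deletions. */
    static List<String> sortAll(ToyIndex index, Comparator<String> order) {
        List<String> out = new ArrayList<>();
        for (int id = 0; id < index.maxDoc(); id++) {
            out.add(index.document(id)); // throws if this ID is deleted
        }
        out.sort(order);
        return out;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1); // pretend dedup deleted doc 1
        ToyIndex segment = new ToyIndex(Arrays.asList("b", "dup", "a"), deleted);

        // Sorting the segment directly would throw; merging first removes the deletion.
        ToyIndex merged = merge(Arrays.asList(segment));
        System.out.println(sortAll(merged, Comparator.naturalOrder()));
    }
}
```

In this model, calling sortAll on the deduped segment fails on the deleted document, while calling it on the merged copy succeeds because the merge dropped the deleted entry.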

Please file a bug in Jira.

Doug


Re: Problem with sorting index

vis
Sorry, I am on holiday until the 8th of May.

Please contact the [hidden email] for urgent matters.

Kind regards, Herman.