[jira] The question about DocStoreOffset

Yali Hu

Hi, good morning.

I ran into a problem while creating an index with Lucene 3.2.

My index contains the term vector files, but they show as 0 bytes
when the index is opened in Luke.

According to the description of the Lucene index file formats,
the cause may be the value of DocStoreOffset in the segments file.
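
For reference, here is a minimal sketch of how I read the meaning of
DocStoreOffset in the Lucene 3.x file format description: a value of -1
means the segment has its own doc store files (stored fields and term
vectors), and any other value is the starting document inside the doc store
files shared with another segment. The class name, method names, and sample
values are hypothetical and are not part of the Lucene API.

```java
// Sketch only: interprets a DocStoreOffset value as described in the
// Lucene 3.x index file format documentation. Names here are hypothetical.
class DocStoreOffsetSketch {

    // -1 means the segment has its own stored-fields and term-vector files.
    static boolean hasOwnDocStore(int docStoreOffset) {
        return docStoreOffset == -1;
    }

    // Builds a human-readable summary for one segment entry.
    // docStoreSegment is only meaningful when the doc store is shared.
    static String describe(String segmentName, int docStoreOffset, String docStoreSegment) {
        if (hasOwnDocStore(docStoreOffset)) {
            return segmentName + ": own doc store files";
        }
        return segmentName + ": shares doc store of " + docStoreSegment
                + ", starting at document " + docStoreOffset;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        System.out.println(describe("_0", -1, null));
        System.out.println(describe("_1", 120, "_0"));
    }
}
```

If the value read from the segments file is not -1, the term vector data
for that segment would be expected in the files of the segment named by
DocStoreSegment, not in the segment's own files.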

In my index, the segments file is named segments_1.

The relevant part of the source code that creates the index is below.


IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_32, null);
iwc.setIndexDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
IndexWriter writer = new IndexWriter(dir, iwc);
LogMergePolicy lmp = new LogDocMergePolicy();

writer.addDocument(doc.getDocument(), analyzer);

writer.addIndexes(new Directory[] { form.getDirectory() });


My source is based on Hadoop-contrib 2951, which creates the index
using map/reduce.

However, that code is based on Lucene 2.3. I changed some deprecated
methods to make it work with Lucene 3.2.

The original source does not have this problem.
The term vector files are listed normally and the segments file name is

After my modification, the term vector files are created normally, but the
segments file seems to be incorrect, so the term vector files cannot be
listed in Luke.

Could anybody give me some advice about this?

Thanks in advance.

Best regards.

Yali Hu
