[jira] Created: (LUCENE-1328) FileNotFoundException in


[jira] Created: (LUCENE-1328) FileNotFoundException in

JIRA jira@apache.org
FileNotFoundException in
-------------------------

                 Key: LUCENE-1328
                 URL: https://issues.apache.org/jira/browse/LUCENE-1328
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 2.1
         Environment: OS:  Linux version 2.6.9-67.0.1.ELsmp ([hidden email]) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-9)) #1 SMP Fri Nov 30 11:57:43 EST 2007

We use Solr 1.2, and the Lucene version is 2.1. (I don't think this problem has anything to do with Solr.)
            Reporter: Yajun Liu


I have had this problem for a while. Here is how I use the Lucene index:

1) I don't use the compound file format.

2) A single process with a single thread updates the index (the index updater). The index is really small and the merge factor is 10.
After the index is updated, the same thread copies the index to a tmp directory and validates it there by:

IndexReader reader = IndexReader.open(tmp_directory);
reader.close();

then renames the tmp directory to snapshot_timestamp (a sketch of this validate-and-rename step appears after the list);

3) snapshot_timestamp is rsynced to the search nodes, which DO NOT update the index.

4) We automatically stop and start the index updater and the search nodes every midnight (don't ask me why).
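
For concreteness, here is a minimal sketch of the validate-and-rename step at the end of 2). It is not the reporter's actual code: the class and method names, the snapshotRoot location, and the use of File.renameTo are assumptions.

    import java.io.File;
    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;

    // Hypothetical sketch: validate a freshly copied index directory by opening
    // (and immediately closing) an IndexReader, then publish it by renaming the
    // tmp directory to snapshot_<timestamp>.
    public class SnapshotPublisher {
        public static File publish(File tmpIndexDir, File snapshotRoot) throws IOException {
            // Opening an IndexReader reads the segments file and the per-segment
            // files it references, so a missing file (e.g. _gw.fnm) fails here.
            IndexReader reader = IndexReader.open(tmpIndexDir.getPath());
            reader.close();

            // Publish by renaming the validated tmp directory.
            File snapshot = new File(snapshotRoot, "snapshot_" + System.currentTimeMillis());
            if (!tmpIndexDir.renameTo(snapshot)) {
                throw new IOException("rename failed: " + tmpIndexDir + " -> " + snapshot);
            }
            return snapshot;
        }
    }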


Here is what I observed:

1) Not always, but sometimes when the index updater is started during our automatic recycle, we get:

java.io.FileNotFoundException: /var/tmp/index/_gw.fnm (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:501)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:526)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:57)
        at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:176)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:157)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:130)
        at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:205)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:610)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:184)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:157)

Note that the missing file is different each time. When this happens, the program automatically tries to reopen the most recent THREE snapshots, and we get the same exception for each snapshot. Remember, each of the snapshots was validated before it was copied.

2) Similar things happened on the search nodes: the same index that opened fine during last night's recycle could not be opened due to the same exception. The search nodes do not update the index.

In my case, the index was "validated" earlier and became invalid at a later time. It seems to happen only when we restart the application. When the exception happens, the file really does not exist in the index directory.

--Yajun



Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Robert Engels
If your "automatic recycle" means a restart/reboot, the /tmp  
directory is probably being cleared by the OS and you might have a  
startup race condition.


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Michael McCandless-2
In reply to this post by JIRA jira@apache.org
Yajun Liu (JIRA) <[hidden email]> wrote:

Can you double check which underlying version of Lucene you are
using?  Those source file/line numbers don't line up to a stock 2.1
release as far as I can tell.

This part is odd:

> When this happen, the program automatically tried to reopen the most
> recent THREE snapshots and we got the same exception for each
> snapshot. Remember, each of the snapshot was validated before it was
> copied.

If I understand the steps correctly, once an index is created,
verified (by opening an IndexReader) and copied up to a snapshot, that
snapshot is never changed by Lucene (opened by an IndexWriter, or
IndexReader that does delete/setNorm)?  Yet somehow all 3 become
corrupt at the same time (that exception is hit when opening an
IndexReader)?

Also:

> 2) The similar things happened on the search node: the same index
> which was opened OK during last night nodes recycle could not be
> opened due to the same exception. The search node does not update
> index.

Again nothing makes changes to these copies either, yet they suddenly
start hitting that same exception?

Are you really sure, when you go to open the snapshots, that you're
actually specifying the right directory each time?  It seems
particularly odd for all 3 snapshots to hit the same exception at the
same time.

Another question: is it possible you are reopening your readers on an
index dir while rsync is still copying to that index dir?  That could easily
lead to exceptions like this....

Mike


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Yajun
In reply to this post by Robert Engels
My bad, we don't use /tmp explicitly. We use /var/tmp/snapshot_timestamp, which is not deleted by the OS on reboot.

--Yajun

Robert Engels wrote
If your "automatic recycle" means a restart/reboot, the /tmp  
directory is probably being cleared by the OS and you might have a  
startup race condition.


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Yajun
In reply to this post by Michael McCandless-2
My answers are inline.


Michael McCandless-2 wrote
Yajun Liu (JIRA) <jira@apache.org> wrote:

Can you double check which underlying version of Lucene you are
using?  Those source file/line numbers don't line up to a stock 2.1
release as far as I can tell.

YL>> The Lucene library comes with Solr. The jar file is lucene-core-2007-05-20_00-04-53.jar. I compared the source code of IndexReader; it is close to Lucene 2.2, not 2.1.

This part is odd:

> When this happen, the program automatically tried to reopen the most
> recent THREE snapshots and we got the same exception for each
> snapshot. Remember, each of the snapshot was validated before it was
> copied.

If I understand the steps correctly, once an index is created,
verified (by opening an IndexReader) and copied up to a snapshot, that
snapshot is never changed by Lucene (opened by an IndexWriter, or
IndexReader that does delete/setNorm)?  Yet somehow all 3 become
corrupt at the same time (that exception is hit when opening an
IndexReader)?

YL>> That's correct. Here is the pseudocode for taking a snapshot:
    create a tmp directory /var/tmp/snapshot_timestamp
    int retry = 0;
    while (retry++ < 10) {
         copy index to /var/tmp/snapshot_timestamp
         try {
             IndexReader reader = IndexReader.open(/var/tmp/snapshot_timestamp/index);
             reader.close();
             break;  // validated successfully
         } catch (IOException e) {
             delete /var/tmp/snapshot_timestamp/index
             continue;
         }
     }

Did you say IndexReader also does deletes? If that is the case, then the snapshot that is "validated" by the IndexReader could be "invalidated" by that deletion. That could explain why all the snapshots became corrupted.
Also:

> 2) The similar things happened on the search node: the same index
> which was opened OK during last night nodes recycle could not be
> opened due to the same exception. The search node does not update
> index.

Again nothing makes changes to these copies either, yet they suddenly
start hitting that same exception?

YL>> Correct. Nothing changes on the search node.

Are you really sure, when you go to open the snapshots, that you're
actually specifying the right directory each time?  It seems
particularly odd for all 3 snapshots to hit the same exception at the
same time.

YL>> Very sure.

Another question: is it possible you are reopening your readers on an
index dir while rsync is still copying to that index dir?  That could easily
lead to exceptions like this....

YL>> The index is rsynced to a /var/tmp/rsync_tmp directory. When rsync is done, it is moved to another directory, and then the search node opens an IndexReader on it.

Mike


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Michael McCandless-2

Yajun wrote:

>> YL>> The lucene library comes with Solr. The jar file is
>> lucene-core-2007-05-20_00-04-53.jar. I compared the source code of
>> IndexReader which is close to Lucene 2.2, not 2.1

Hmmm OK thanks.

>> Did you say IndexReader also does deletes? If that is the case, then the
>> snapshot that is "validated" by the IndexReader could be "invalidated" by
>> that deletion. That could explain why all the snapshots became corrupted.

IndexReader only does deletes if you use the deleteDocument or setNorm  
methods.  Are you using these?  If not, then I think there must be  
something outside of Lucene causing this because IndexReader won't  
make any changes to an index otherwise.

It's very strange that all of your snapshots suddenly become unusable  
at the same time.

Are your snapshots complete copies, i.e., there are no hard or soft
links in them?

Mike


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Yajun
I don't use deleteDocument or setNorm with IndexReader. I tried both hard links (cp -lr) and plain copies (cp -r) to create the snapshot; both have the same problem.

It seems that the segments file (segments_xxx) references a segment that no longer exists in the index directory.

I used to delete all the "invalidated" indexes. Now I'll keep them; hopefully they will give me some clue.

--Yajun


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Michael McCandless-2

One more question: are you sure that when you copy out your snapshot,  
the IndexWriter was closed?

If IndexWriter is open when the copy is done, it's possible to get a  
corrupted copy.

Mike
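
A minimal sketch of the close-before-copy ordering described above, assuming the application drives IndexWriter directly (in the reporter's setup the writer is managed by Solr, so this is illustrative only; the path, analyzer, and method name are made up):

    import java.io.IOException;
    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    // Hypothetical sketch: make sure the writer is closed before the snapshot
    // copy starts, so the segments file and all segment files are on disk and
    // no merge is rewriting them mid-copy.
    public class UpdateThenSnapshot {
        public static void update(String indexPath) throws IOException {
            IndexWriter writer = new IndexWriter(indexPath, new WhitespaceAnalyzer(), false);
            try {
                // ... addDocument / deleteDocuments calls go here ...
            } finally {
                writer.close();  // flushes buffered documents and releases the write lock
            }
            // Only after close() returns is it safe to copy indexPath to a snapshot.
        }
    }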


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Yajun
Mike,

I'm not sure whether the IndexWriter is closed. The index update goes through Solr; I'll debug it tonight.

Even if the IndexWriter is not closed, since I "validate" the snapshot, the snapshot should at least stay "validated". :-)

--Yajun


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Michael McCandless-2

Ahh, right, your validation would catch the IndexWriter-still-open case.

It seems like something external to Lucene is messing up your index.  
It's odd.

Mike


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Yajun
I'm adding tons of logging; hopefully it will give me some information.

--Yajun


Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Yonik Seeley-2
On Mon, Jul 7, 2008 at 5:03 PM, Yajun <[hidden email]> wrote:
>
> I'm adding tons of logging, hopefully it will give me some information.

Try capturing the directory contents before you take a snapshot...
something like
ls -l index > index/ls.txt

Then if a missing file turns up, you can compare with what the index
looked like.

-Yonik
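
A rough Java equivalent of that one-liner, in case the updater cannot easily shell out. The helper name is made up; writing ls.txt into the index directory mirrors the command above.

    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    // Hypothetical helper: record each file's length and name in the index
    // directory before taking a snapshot (roughly `ls -l index > index/ls.txt`).
    public class IndexListing {
        public static void capture(File indexDir) throws IOException {
            FileWriter out = new FileWriter(new File(indexDir, "ls.txt"));
            try {
                for (File f : indexDir.listFiles()) {
                    out.write(f.length() + "\t" + f.getName() + "\n");
                }
            } finally {
                out.close();
            }
        }
    }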


Bug In IndexWriter.addDocument?

Digy
In reply to this post by Yajun

Hi all,

 

I am a Lucene.Net user. Since I need fast indexing in my current project, I am trying Lucene 2.3.2, which I convert to .NET with IKVM (Lucene.Net is currently at v2.1), and I use the same Document and Field instances to gain some speed.

I use TokenStreams to set the value of fields.

My problem is that I get a NullPointerException in addDocument:

Exception in thread "main" java.lang.NullPointerException
        at org.apache.lucene.store.IndexOutput.writeString(IndexOutput.java:99)
        at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:127)
        at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1418)
        at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:1121)
        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2442)
        at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2424)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1464)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1442)
        at MainClass.Test(MainClass.java:39)
        at MainClass.main(MainClass.java:10)

To show the same bug in Java, I prepared a sample application (that was hard, since this is my second app in Java; the first one was a "Hello World" app).

Is something wrong with my application, or is it a bug in Lucene?

Thanks,

DIGY

SampleCode:

    public class MainClass
    {
        DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
        DummyTokenStream DummyTokenStream2 = new DummyTokenStream();

        // use the same document & field instances for indexing
        org.apache.lucene.document.Document Doc = new org.apache.lucene.document.Document();

        org.apache.lucene.document.Field Field1 = new org.apache.lucene.document.Field("Field1", "", org.apache.lucene.document.Field.Store.YES, org.apache.lucene.document.Field.Index.TOKENIZED);
        org.apache.lucene.document.Field Field2 = new org.apache.lucene.document.Field("Field2", "", org.apache.lucene.document.Field.Store.YES, org.apache.lucene.document.Field.Index.TOKENIZED);

        public MainClass()
        {
            Doc.add(Field1);
            Doc.add(Field2);
        }

        public void Index() throws
                           org.apache.lucene.index.CorruptIndexException,
                           org.apache.lucene.store.LockObtainFailedException,
                           java.io.IOException
        {
            System.out.println("Index Started");
            org.apache.lucene.index.IndexWriter wr = new org.apache.lucene.index.IndexWriter("testindex", new org.apache.lucene.analysis.WhitespaceAnalyzer(), true);

            for (int i = 0; i < 100; i++)
            {
                PrepDoc();
                wr.addDocument(Doc);
            }
            wr.close();
            System.out.println("Index Completed");
        }

        void PrepDoc()
        {
            DummyTokenStream1.SetText("test1"); // set a new text on the token stream
            Field1.setValue(DummyTokenStream1); // set the TokenStream as the field value

            DummyTokenStream2.SetText("test2"); // set a new text on the token stream
            Field2.setValue(DummyTokenStream2); // set the TokenStream as the field value
        }

        public static void main(String[] args) throws
                           org.apache.lucene.index.CorruptIndexException,
                           org.apache.lucene.store.LockObtainFailedException,
                           java.io.IOException
        {
            MainClass m = new MainClass();
            m.Index();
        }

        public class DummyTokenStream extends org.apache.lucene.analysis.TokenStream
        {
            String Text = "";
            boolean EndOfStream = false;
            org.apache.lucene.analysis.Token Token = new org.apache.lucene.analysis.Token();

            // return "Text" as the first token and null as the second
            public org.apache.lucene.analysis.Token next()
            {
                if (EndOfStream == false)
                {
                    EndOfStream = true;

                    Token.setTermText(Text);
                    Token.setStartOffset(0);
                    Token.setEndOffset(Text.length() - 1);
                    Token.setTermLength(Text.length());
                    return Token;
                }
                return null;
            }

            public void SetText(String Text)
            {
                EndOfStream = false;
                this.Text = Text;
            }
        }
    }


Re: Bug In IndexWriter.addDocument?

Ajay Lakhani
Dear Digy,
As of Lucene 2.3, there are new setValue(...) methods that allow you to change the value of a Field. However, there seems to be an issue with the org.apache.lucene.index.FieldsWriter.writeField(...) method, which stores the string value of the field; that value happens to be null when the field value is a TokenStream.

The org.apache.lucene.index.FieldsWriter.writeField(...) method needs to be changed to check whether the field data is an instance of String, Reader or TokenStream and then retrieve the respective value. I shall patch this soon.

Is there a particular reason you are using a TokenStream? I suggest you set the text value directly on the Field: Field1.setValue("xxx");

Moreover, it's best to create a single Document instance, add multiple Field instances to it, and then hold onto those Field instances and re-use them by changing their values for each added document. After the document is added, you directly change the Field values (idField.setValue(...), etc.) and then re-add your Document instance. You cannot re-use a single Field instance within a Document, and you should not change a Field's value until the Document containing that Field has been added to the index.
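
A short sketch of that reuse pattern with plain String values; the field names, document count, and class name are made up, and writer construction is omitted:

    import java.io.IOException;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    // Hypothetical sketch: one Document, two reused Field instances, values
    // changed via setValue(String) before each addDocument call.
    public class ReusedFieldsExample {
        public static void addDocs(IndexWriter writer) throws IOException {
            Document doc = new Document();
            Field idField = new Field("id", "", Field.Store.YES, Field.Index.UN_TOKENIZED);
            Field bodyField = new Field("body", "", Field.Store.YES, Field.Index.TOKENIZED);
            doc.add(idField);
            doc.add(bodyField);

            for (int i = 0; i < 100; i++) {
                idField.setValue(Integer.toString(i));   // change the values in place...
                bodyField.setValue("body text " + i);
                writer.addDocument(doc);                 // ...then re-add the same Document
            }
        }
    }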


Re: Bug In IndexWriter.addDocument?

Ajay Lakhani
Dear Digy,

To add to that: I am inclined to think this is not a bug.

A TokenStream value is usually not stored.
If you change your field's store attribute to org.apache.lucene.document.Field.Store.NO, there will be no issue.

Developers, any thoughts on this?

Cheers
Ajay


Re: Bug In IndexWriter.addDocument?

Ajay Lakhani
Dear Digy,
You cannot store the Field value when using a TokenStream, but you can store the term vector.
To do so, create the Field in this manner:

Field Field1 = new Field("Field1", DummyTokenStream1, TermVector.YES);

Below is the code that should work.

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.TermVector;
import org.apache.lucene.index.IndexWriter;

// DummyTokenStream is the TokenStream subclass from the original post.
public class Main2Class {
  Document Doc = new Document();

  DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
  Field Field1 = new Field("Field1", DummyTokenStream1, TermVector.YES);

  DummyTokenStream DummyTokenStream2 = new DummyTokenStream();
  Field Field2 = new Field("Field2", DummyTokenStream2, TermVector.YES);

  public void Index() throws Exception {
    Doc.add(Field1);
    Doc.add(Field2);

    IndexWriter wr = new IndexWriter("testindex", new WhitespaceAnalyzer(), true);

    for (int i = 0; i < 100; i++) {
      PrepDoc();
      wr.addDocument(Doc);
    }
    wr.close();
  }

  void PrepDoc() {
    DummyTokenStream1.SetText("test1");
    Field1.setValue(DummyTokenStream1);
    DummyTokenStream2.SetText("test2");
    Field2.setValue(DummyTokenStream2);
  }

  public static void main(String[] args) throws Exception {
    Main2Class m = new Main2Class();
    m.Index();
  }
}

Cheers
Ajay

2008/7/8 Ajay Lakhani <[hidden email]>:
Dear Digy,

To add on, I might think that this is not a glitch.

A TokenStream is usually not stored. 
If you change your field attribute to org.apache.lucene.document.Field.Store.NO then there will be no issue.

Developers, any thoughts on this!

Cheers
Ajay

2008/7/8 Ajay Lakhani <[hidden email]>:

Dear Digy,
As of Lucene 2.3, there are new setValue(...) methods that allow you to change the value of a Field. However, there seems to be an issue with the org.apache.lucene.index.FieldWriter.writeField(...) API that stores the string value for the field, which happens to be null in the case of a TokenStream.

The
org.apache.lucene.index.FieldWriter.writeField(...) API needs to be changed to verify whether the Field Data is an instance of String, Reader or a TokenStream and then retrieve the respective values. I shall patch this soon.

Is there a particular reason you are using a TokenStream ? I suggest you set the text value directly to the Field: Field1.setValue("xxx");

Moreover, it's best to create a single Document instance, then add multiple Field instances to it, but hold onto these Field instances and re-use them by changing their values for each added document. After the document is added, you then directly change the Field values (idField.setValue(...), etc), and then re-add your Document instance. You cannot re-use a single Field instance within a Document, and, you should not change a Field's value until the Document containing that Field has been added to the index.

2008/7/8 Digy <[hidden email]>:

Hi all,

 

I am a Lucene.Net user. Since I need a fast indexing in my current project I try to use Lucene 2.3.2 which I convert to .Net with IKVM(Since Lucene.Net is currently in v2.1) and I use the same instances of document and fields to gain some speed improvements.

 

I use TokenStreams to set the value of fields.

 

My problem is that I get NullPointerException in "addDocument".

 

Exception in thread "main" java.lang.NullPointerException

        at org.apache.lucene.store.IndexOutput.writeString(IndexOutput.java:99)

        at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:127)

        at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1418)

        at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:1121)

        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2442)

        at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2424)

        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1464)

        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1442)

        at MainClass.Test(MainClass.java:39)

        at MainClass.main(MainClass.java:10)

 

To show the same bug in Java I prepared a sample application (oh, that was hard since this is my second app. in java(first one was a "Hello World" app.))

 

Is something wrong with my application or is it a bug in Lucene?

 

Thanks,

DIGY

 

 

 

SampleCode:

    public class MainClass
    {
        DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
        DummyTokenStream DummyTokenStream2 = new DummyTokenStream();

        // use the same Document & Field instances for indexing
        org.apache.lucene.document.Document Doc = new org.apache.lucene.document.Document();

        org.apache.lucene.document.Field Field1 = new org.apache.lucene.document.Field("Field1", "", org.apache.lucene.document.Field.Store.YES, org.apache.lucene.document.Field.Index.TOKENIZED);
        org.apache.lucene.document.Field Field2 = new org.apache.lucene.document.Field("Field2", "", org.apache.lucene.document.Field.Store.YES, org.apache.lucene.document.Field.Index.TOKENIZED);

        public MainClass()
        {
            Doc.add(Field1);
            Doc.add(Field2);
        }

        public void Index() throws
                           org.apache.lucene.index.CorruptIndexException,
                           org.apache.lucene.store.LockObtainFailedException,
                           java.io.IOException
        {
            System.out.println("Index Started");
            org.apache.lucene.index.IndexWriter wr = new org.apache.lucene.index.IndexWriter("testindex", new org.apache.lucene.analysis.WhitespaceAnalyzer(), true);

            for (int i = 0; i < 100; i++)
            {
                PrepDoc();
                wr.addDocument(Doc);
            }
            wr.close();
            System.out.println("Index Completed");
        }

        void PrepDoc()
        {
            DummyTokenStream1.SetText("test1"); // set a new text on the token stream
            Field1.setValue(DummyTokenStream1); // set the TokenStream as the field value

            DummyTokenStream2.SetText("test2"); // set a new text on the token stream
            Field2.setValue(DummyTokenStream2); // set the TokenStream as the field value
        }

        public static void main(String[] args) throws
                    org.apache.lucene.index.CorruptIndexException,
                    org.apache.lucene.store.LockObtainFailedException,
                    java.io.IOException
        {
            MainClass m = new MainClass();
            m.Index();
        }

        public class DummyTokenStream extends org.apache.lucene.analysis.TokenStream
        {
            String Text = "";
            boolean EndOfStream = false;
            org.apache.lucene.analysis.Token Token = new org.apache.lucene.analysis.Token();

            // return "Text" as the first token and null as the second
            public org.apache.lucene.analysis.Token next()
            {
                if (EndOfStream == false)
                {
                    EndOfStream = true;

                    Token.setTermText(Text);
                    Token.setStartOffset(0);
                    Token.setEndOffset(Text.length() - 1);
                    Token.setTermLength(Text.length());
                    return Token;
                }
                return null;
            }

            public void SetText(String Text)
            {
                EndOfStream = false;
                this.Text = Text;
            }
        }
    }
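
For comparison, here is a minimal sketch of the same indexing loop written without the reuse pattern. It is only a sketch, not a statement about whether the behavior above is a bug: it assumes the Field(String name, TokenStream tokenStream) constructor that ships alongside Field.setValue(TokenStream), it builds a fresh Document with fresh tokenized-only (not stored) fields on each pass, and it is meant to replace the body of Index() in the sample so that DummyTokenStream resolves.

            // Sketch only: fresh Document and fresh TokenStream-backed fields per pass,
            // instead of calling Field.setValue(TokenStream) on a reused stored field.
            // Note these fields are tokenized but NOT stored, unlike the sample above.
            org.apache.lucene.index.IndexWriter wr = new org.apache.lucene.index.IndexWriter("testindex", new org.apache.lucene.analysis.WhitespaceAnalyzer(), true);

            for (int i = 0; i < 100; i++)
            {
                org.apache.lucene.document.Document doc = new org.apache.lucene.document.Document();

                DummyTokenStream ts1 = new DummyTokenStream();
                ts1.SetText("test1");
                doc.add(new org.apache.lucene.document.Field("Field1", ts1));

                DummyTokenStream ts2 = new DummyTokenStream();
                ts2.SetText("test2");
                doc.add(new org.apache.lucene.document.Field("Field2", ts2));

                wr.addDocument(doc);
            }
            wr.close();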

     

 




Reply | Threaded
Open this post in threaded view
|

Re: [jira] Created: (LUCENE-1328) FileNotFoundException in

Robert Engels
In reply to this post by Yajun
On most systems, /tmp is a link to /var/tmp, so it is cleared on reboot.

On Jul 7, 2008, at 12:44 PM, Yajun wrote:

>
> My bad, we don't use /tmp explicitly. We use /var/tmp/snapshot_timestamp,
> which is not deleted by the OS on reboot.
>
> --Yajun
>
>
> Robert Engels wrote:
>>
>> If your "automatic recycle" means a restart/reboot, the /tmp
>> directory is probably being cleared by the OS and you might have a
>> startup race condition.
>>
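
If the nightly recycle can race a reboot or an unfinished rsync, one defensive pattern at startup is to check that the snapshot at least has a segments file before opening it, and to retry or fall back rather than failing outright. A sketch only, with a hypothetical path; note that IndexReader.indexExists only looks for the segments file, so it catches an empty or missing directory but not a single deleted segment file:

        // Sketch: guard a startup open against a missing or partially-copied snapshot.
        // The path is hypothetical; substitute the real snapshot directory.
        String snapshotPath = "/var/tmp/snapshot_20080707";
        if (org.apache.lucene.index.IndexReader.indexExists(snapshotPath))
        {
            org.apache.lucene.index.IndexReader reader = org.apache.lucene.index.IndexReader.open(snapshotPath);
            try
            {
                System.out.println("Opened snapshot, " + reader.numDocs() + " docs");
            }
            finally
            {
                reader.close();
            }
        }
        else
        {
            System.out.println("Snapshot not ready (no segments file); retry or fall back");
        }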

Reply | Threaded
Open this post in threaded view
|

[jira] Resolved: (LUCENE-1328) FileNotFoundException in

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1328.
----------------------------------------

    Resolution: Invalid

I think this came down to not closing IndexSearchers...
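
For readers hitting the same symptom, a minimal sketch of the pattern that resolution points at: open a searcher, and always close it in a finally block so its underlying reader and file handles are released, instead of leaving searchers open across restarts and cleanups. The field name, term, and path below are hypothetical.

    // Sketch of the close-your-searchers pattern (names and path are hypothetical).
    org.apache.lucene.search.IndexSearcher searcher = new org.apache.lucene.search.IndexSearcher("/var/tmp/index");
    try
    {
        org.apache.lucene.search.Hits hits = searcher.search(
                new org.apache.lucene.search.TermQuery(
                        new org.apache.lucene.index.Term("Field1", "test1")));
        System.out.println("hits: " + hits.length());
    }
    finally
    {
        searcher.close(); // releases the underlying IndexReader and its open segment files
    }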


Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-1328) FileNotFoundException in

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613436#action_12613436 ]

Yajun Liu commented on LUCENE-1328:
-----------------------------------

OK, after adding lots of logging, I finally found the problem. We have a cron job that deletes old files. Sometimes over the weekend there are not many index updates (we are a B2B web site), so the files of old segments were deleted by the cron job. So it is not a bug.
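
For anyone debugging a similar setup, a small diagnostic sketch along those lines (path hypothetical): log what is actually on disk and attempt a validation open right before serving, so an external cleanup job that removed segment files shows up in the log next to the FileNotFoundException.

    // Diagnostic sketch: list the index directory and do a validation open just before use.
    // The path is hypothetical.
    java.io.File indexDir = new java.io.File("/var/tmp/index");
    String[] files = indexDir.list();
    if (files != null)
    {
        for (int i = 0; i < files.length; i++)
        {
            System.out.println("index file: " + files[i]);
        }
    }
    org.apache.lucene.index.IndexReader reader = org.apache.lucene.index.IndexReader.open(indexDir);
    System.out.println("validation open OK, " + reader.numDocs() + " docs");
    reader.close();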


Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-1328) FileNotFoundException in

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613439#action_12613439 ]

Michael McCandless commented on LUCENE-1328:
--------------------------------------------

OK, thanks for bringing closure here!
