[jira] Created: (LUCENE-555) Index Corruption

[jira] Created: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
Index Corruption
----------------

         Key: LUCENE-555
         URL: http://issues.apache.org/jira/browse/LUCENE-555
     Project: Lucene - Java
        Type: Bug

  Components: Index  
    Versions: 1.9    
 Environment: Linux FC4, Java 1.4.9
    Reporter: dan
    Priority: Critical


Index Corruption

>>>>>>>>> output

java.io.FileNotFoundException: ../_aki.fnm (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
        at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:425)
        at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:434)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:324)
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:56)
        at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:144)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:110)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:674)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:658)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:517)

>>>>>>>>> input

- I open an index, I read, I write, I optimize, and eventually the above happens. The index is unusable.
- This has happened to me somewhere between 20 and 30 times now - on indexes of different shapes and sizes.
- I don't know the reason. But, the following requirement applies regardless.

>>>>>>>>> requirement

- Like all modern database programs, there has to be a way to repair an index. Period.


--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12375996 ]

Erik Hatcher commented on LUCENE-555:
-------------------------------------

Could you share a test case that demonstrates this issue?

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376071 ]

dan commented on LUCENE-555:
----------------------------

Yes, I have a 1.6GB FSDirectory that you can try to optimize yourself. How would you like to receive the file?

BTW, it doesn't matter 'how' an index becomes corrupt. Hard drive failure, out-of-memory, core dump, VM crash, controller failure, thermal shutdown, process crash, etc., are all non sequitur. Lucene must be able to repair a corrupt index. Period.


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376073 ]

Yonik Seeley commented on LUCENE-555:
-------------------------------------

> Yes, I have a 1.6GB FSDirectory that you can try to optimize yourself.

So are the steps to reproduce as simple as:
  1) optimize this specific index
  2) try to open a new IndexReader on the optimized index?

Any failures in segment merging (OOM, out-of-disk, IO error, etc) will normally abort the whole process, preventing the new segments file from being written, and thus preventing the bad segments from being referenced.
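
As a rough illustration of the abort-before-publish behaviour described above, here is a minimal sketch in plain Java. It is not Lucene's code: the class `SegmentsCommitSketch`, its methods, and the `segments.new` side file are hypothetical names, and checking for a segment's `.fnm` file is only a stand-in for "the merge finished without error".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Lucene code: all class and method names are made up.
class SegmentsCommitSketch {

    // Stand-in for a merge writing one file of a segment (e.g. "_a.fnm").
    static void writeSegmentFile(Path indexDir, String fileName) {
        try {
            Files.write(indexDir.resolve(fileName), new byte[] {0});
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Publishes a new segment list only if every segment's .fnm file exists.
    // If any file is missing, the old "segments" file is left untouched.
    static boolean commit(Path indexDir, List<String> segmentNames) {
        for (String name : segmentNames) {
            if (!Files.exists(indexDir.resolve(name + ".fnm"))) {
                return false; // abort: nothing was published
            }
        }
        try {
            // Write the new list to a side file first ...
            Path pending = indexDir.resolve("segments.new");
            Files.write(pending,
                    String.join("\n", segmentNames).getBytes(StandardCharsets.UTF_8));
            // ... then swap it into place with one move. Whether this move is
            // atomic depends on the platform and file system.
            Files.move(pending, indexDir.resolve("segments"),
                    StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the segment names a newly opened reader would see.
    static List<String> readCommitted(Path indexDir) {
        try {
            return Files.readAllLines(indexDir.resolve("segments"), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index-sketch");
        writeSegmentFile(dir, "_a.fnm");
        System.out.println(commit(dir, Arrays.asList("_a")));         // all files present
        System.out.println(commit(dir, Arrays.asList("_a", "_aki"))); // _aki.fnm is missing
        System.out.println(readCommitted(dir));                       // list readers still see
    }
}
```

In this sketch the second commit is refused, so the published list never names a segment whose files are absent. A `FileNotFoundException` like the one reported would therefore suggest that files were removed or lost after the list was published, or that the publish step itself was interrupted.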


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376075 ]

robert engels commented on LUCENE-555:
--------------------------------------

I think Dan is way off base here.

If complete fault tolerance is needed, he should develop it (and contribute it!).

Many users and uses of Lucene do not require the complexity and performance degradation that a completely fault-tolerant system would entail.

Even database systems do not automatically recover a db if the hard-drive fails (including any mirrored drives, etc.). You usually need some backup solution in this case.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376078 ]

Otis Gospodnetic commented on LUCENE-555:
-----------------------------------------

I agree with Robert.
Also, Dan, 1.6GB is quite large.  I don't think anyone will be copying that and debugging it for you.
What Erik meant was - try to write a unit test that creates a small index that demonstrates the bug.  Then we can debug that and see what the problem is.  It is possible that your app did something funky with your 1.6GB index and that's why it's corrupt.


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376090 ]

dan commented on LUCENE-555:
----------------------------

Robert says: "complete fault tolerance" and "automatically recover". Robert, I used none of these terms. You did. Every database, open source or not, that has risen in its class, has a method, process, or other means of journaling through its records to restore it to a consistent, usable state. Some methods are better than others. But the central point is they all have them.

It doesn't have to be "automatic recover", and it doesn't have to be "completely fault tolerant". But, yes, it has to be recoverable. There may be some data loss in the process, but it has to be recoverable. I stand by my statement firmly: Recovery is a necessary and critical requirement.

If you don't want to hear it from me, then ask your business users. Are your business users willing to commit meaningful, mission critical data to a database that has no recovery mechanism? Have you done this? Please do.

Robert says: "Many users and uses of Lucene do not require the complexity, and performance degradation a complete fault tolerant system would require...." How you choose to RAID it, store it, mirror it, back it up, copy it, is an implementation choice, and is entirely non sequitur to the basic requirement of a software package performing a recovery process on its own file format. QED


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376094 ]

Daniel Naber commented on LUCENE-555:
-------------------------------------

Lucene is not a database. If you need a database, use a database.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376095 ]

Doug Cutting commented on LUCENE-555:
-------------------------------------

Dan,

Lucene has never had a recovery mechanism other than rebuilding the index.  This is usually practical, since Lucene is not meant to be used as a primary repository for data, but rather as an index, to help folks find things they already have.

So perhaps we should recategorize this as a feature request, rather than a bug, no?  Please note that it might be difficult to add, since Lucene, in order to keep indexes relatively small, keeps no journal of transactions.

Finally, Lucene is not a database.  It is a full-text indexing system.  The requirements are different.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376097 ]

robert engels commented on LUCENE-555:
--------------------------------------

First off, your statement "Lucene must be able to repair a corrupt index" would imply to most people "complete fault tolerance". Many (most?) would argue that having the ability to recover only "partial" data is not really worthwhile, BUT this will depend upon the situation.

Many users of Lucene can recreate the index if a corruption does occur, so, providing a level of fault tolerance above this is not required.

Our solution that sits on top of Lucene offers both - an added degree of fault tolerance during the index construction phase, and the ability to recreate the index if a catastrophic failure occurs.

Lucene works fine for us, and I suggest that you work on a contribution if you need this behavior.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376115 ]

Chuck Williams commented on LUCENE-555:
---------------------------------------

I'm surprised that an optimize led to a corrupt index.  Other than the non-atomic rename problem, there isn't anything else in Lucene that should lead to corruption.  Even a failure in rename is recoverable, since some correct version of the segments file is always available.

The one recovery issue I've encountered is the buffering of recently indexed documents in memory.  I have a selective journaling mechanism that saves just as many documents as might ever be buffered in RAM.  This mechanism supports several logging modes:  complete information, just stored fields with the ability to compute the others, and just keys with the ability to retrieve externally.  A mechanism like this could be extended to support unlimited journaling if you really want it.

If the optimize of an index leaves it corrupted, this is a bug that should be fixed.  If Lucene is robust in the sense that it doesn't corrupt the index, as it is designed to be now, I think this is sufficient.

Optional journaling facilities would be a nice add-on feature.  It is something that is not too hard for applications to create now.  The mechanism I use is bundled with a number of other useful services, including updating the index by modifying field values of selected documents, managing synchronization of delete, write and update, managing the periodic refreshing of the reader used for search, etc.  If I can get agreement from my company, then I'll contribute some or all of this.  Maybe it would be of help.
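
To make the "just keys" logging mode mentioned above a little more concrete, here is a minimal sketch of an application-side journal in plain Java. It is only a guess at the general shape of such a mechanism, not the implementation referred to in this comment. `KeyJournalSketch` and its methods are hypothetical names, and the sketch assumes keys contain no newlines and that the application knows when buffered documents have been flushed to the index.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not part of Lucene: a keys-only journal kept next to the index.
class KeyJournalSketch {
    private final Path journal;

    KeyJournalSketch(Path journal) {
        this.journal = journal;
    }

    // Call before handing a document to the writer's in-memory buffer.
    // A production journal would also force the bytes to disk (fsync).
    void recordAdd(String key) {
        try {
            Files.write(journal, (key + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Call once the buffered documents are known to be on disk in the index.
    void markFlushed() {
        try {
            Files.deleteIfExists(journal);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // After a crash: keys of documents that may never have reached the index
    // and should be fetched from the external source and indexed again.
    List<String> pendingKeys() {
        if (!Files.exists(journal)) {
            return Collections.emptyList();
        }
        try {
            return Files.readAllLines(journal, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("journal-sketch", ".log");
        KeyJournalSketch j = new KeyJournalSketch(file);
        j.recordAdd("doc-1");
        j.recordAdd("doc-2");
        System.out.println(j.pendingKeys()); // keys to re-index if a crash happened now
        j.markFlushed();
        System.out.println(j.pendingKeys()); // nothing pending after a flush
    }
}
```

Because the journal is cleared on every flush, it never grows beyond the number of documents that can sit in the RAM buffer, which matches the bounded journaling idea described above.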


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376306 ]

dan commented on LUCENE-555:
----------------------------

How about some engineering satire to spell it out for you nerds? [Doesn't apply to you Chuck]

public void businessRealityCheck()
{

boolean myopicEngineerStillDoesntGetIt = true;

while ( myopicEngineerStillDoesntGetIt)
{

case(1)
{
A small business running MySQL has a travelling salesman who trips and pulls the power cord - the database is corrupt. The cause had nothing to do with the software whatsoever. How does team MySQL respond? They say with enthusiasm "run this recovery program with these parameters..." And guess what? It just works! The database is recovered. MySQL moves to the top of the class.
}

case(2)
{
Same scenario. How does team Lucene respond? If you are Robert, you say "I think Dan is way off base." If you are Otis, you retort "I agree with Robert." And all others sing the chorus. LOL. You could get a gig at The Comedy Club with this material guys. It's hilarious.
}

if ( case(1) == case(2))
        myopicEngineerStillDoesntGetIt = true;

else
        break;
}

}

I'm off this express train to Pretend Town. Close this issue. Pretend that recovery isn't critical. And enjoy your train ride home.


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376308 ]

robert engels commented on LUCENE-555:
--------------------------------------

Dan, please let us know what company you work for so we can avoid that place like the plague.

You are obviously having a bad day, year, life...

If you took the time to actually READ the comments, you would realize that for MOST users of Lucene the performance overhead that would be required in EVERY CASE in order to allow index recovery IN THE RARE CASE is not worth it. For MOST users of Lucene the index can be regenerated if the index should become corrupt - similar to how MySQL "rebuilds" the database - just a different process.

MySQL cannot recover missing data if the data disk blocks become corrupt - after recovery those records will be gone. For MANY users this is unacceptable. MySQL can rebuild the indexes on the data, since it has the source data. With Lucene, in MOST cases, the data Lucene retains is insufficient to rebuild the index from scratch.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376309 ]

Doug Cutting commented on LUCENE-555:
-------------------------------------

Calling folks names probably won't help your agenda.  Running away probably won't help your agenda either.  What might help it is calm, polite, persistent engagement.  Lucene is changed primarily by folks who use Lucene.  Other users are telling you that they don't personally require this sort of recovery.  Perhaps that's even a self-defining characteristic: if they needed it then they wouldn't be using Lucene.

So, if you need this in Lucene, the best way to get it added is to help add it.  If you're not interested in helping, then this will probably have to wait until someone else who needs it comes along and is willing to make it happen.  So, if you want to help, examine the file formats document and make a proposal.  Then, if it looks like it might work, contribute a patch that implements it.  Perhaps you'll be able to identify some collaborators to help out.  Perhaps not.  That's the way Lucene changes.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376310 ]

Otis Gospodnetic commented on LUCENE-555:
-----------------------------------------

Hilarious.  Very constructive, Dan.  Hint: somebody already implemented transaction support for Lucene a while back.

[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376313 ]

Yonik Seeley commented on LUCENE-555:
-------------------------------------

Dan, I am interested in the source of the index corruption.
If you can share a test that reproduces this, it would be helpful.


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376319 ]

Andi Vajda commented on LUCENE-555:
-----------------------------------

There is an implementation of the Lucene index store that is backed by Berkeley DB. Take a look at the 'db' contrib area: http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/db/
Using this you can bracket index changes with transactions. Should the cord be pulled, you can use Berkeley DB's recovery mechanisms.

Driver about ACID requirements for Lucene (Re: [jira] Commented: (LUCENE-555))

Tatu Saloranta
In reply to this post by Sebastian Nagel (Jira)
--- "dan (JIRA)" <[hidden email]> wrote:

> while ( myopicEngineerStillDoesntGetIt)
> {
>
> case(1)
> {
> A small business running MySQL has a travelling
....
>
> case(2)
> {
> Same scenario. How does team Lucene respond? If you

Dan, do us all a favor and please figure out the
difference between a DATABASE and an INDEXING ENGINE,
and quit whining.
Lucene is the latter, MySQL the former: you cannot and
should not expect the same feature set from the two. If
you need ACID, use a database. If you need fast full
text search capability, use the latter. Although some
DBs bundle full text indexing packages, they are not
part of the DB engine.
If you are storing important (primary) data in a Lucene
index, you are just clueless.

There are many ways to implement recoverability of
Lucene indexes; including using BDB backend or having
versioned copies of index files (instead of modifying
existing index as is, make a copy, modify it, flip
when done). But it is just downright silly to demand
database features from a non-database: the core to
good products is focusing on core features.
Catastrophic failure recovery is not a core feature
for indexing engines.
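
For what it's worth, the "make a copy, modify it, flip when done" idea
can be sketched with nothing but java.nio.file. This is only an
illustration under stated assumptions, not anything Lucene ships: the
class VersionedIndexDir, the CURRENT pointer file and the version names
are all hypothetical, the index directory is assumed to be flat (no
subdirectories), and the flip relies on a filesystem where a rename
replaces the target atomically (typical on POSIX; the Java documentation
leaves replacement of an existing target implementation specific).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

/** Hypothetical copy-modify-flip helper (illustration only, not a Lucene class). */
final class VersionedIndexDir {
    private final Path root;     // holds one directory per version plus the pointer file
    private final Path pointer;  // small file naming the live version

    VersionedIndexDir(Path root) {
        this.root = root;
        this.pointer = root.resolve("CURRENT");
    }

    /** Name of the live version directory, or null if nothing was published yet. */
    String current() throws IOException {
        if (!Files.exists(pointer)) {
            return null;
        }
        return new String(Files.readAllBytes(pointer), StandardCharsets.UTF_8).trim();
    }

    /** Copy the live version into a fresh directory that the caller may modify freely. */
    Path beginUpdate(String newVersion) throws IOException {
        Path target = Files.createDirectories(root.resolve(newVersion));
        String live = current();
        if (live != null) {
            try (Stream<Path> files = Files.list(root.resolve(live))) {
                for (Path p : (Iterable<Path>) files::iterator) {
                    Files.copy(p, target.resolve(p.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        return target;
    }

    /** Flip: write the new pointer to a temp file, then rename it over the old pointer. */
    void publish(String newVersion) throws IOException {
        Path tmp = root.resolve("CURRENT.tmp");
        Files.write(tmp, newVersion.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, pointer, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Readers keep using whatever current() named when they opened the index;
a crash before publish() leaves the old version untouched, so the worst
case is a half-written directory that can simply be deleted.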

-+ Tatu +-


[jira] Commented: (LUCENE-555) Index Corruption

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376331 ]

Chuck Williams commented on LUCENE-555:
---------------------------------------

Dan,

I don't know if you are still watching this, but in addition to Doug's point about how Lucene changes, there is a second important consideration to keep in mind.

Lucene is a search library, not an enterprise search application.  If you are looking for the latter, you might want to check out something like SOLR.  The existence of SOLR demonstrates the difference.

There are many successful applications based on Lucene for a wide range of uses.  Many major products and web sites are based on Lucene.  As Lucene is a library, it supports a wide range of use cases.  The beauty of the library is that it is solid, robust, well exercised through use and open source review/contribution, and is well-designed for specialization into different applications.

Journaling and recovery are useful capabilities, but I hold to the position that the job of the library is to never corrupt the index.  Journaling and recovery should be optional add-ons for those applications that need them.  For my current application, for example, total index corruption can be resolved by reindexing from an external persistent repository.  My one requirement is to know if data is lost, and if it is a small amount of data, to be able to identify what was lost.  Hence, a limited journaling capability that focuses on recovery of data held in RAM and not yet committed to disk when a crash occurs.

What you found is a bug in optimize, a quite surprising one at that.  Please take the time to help track it down.


Re: [jira] Commented: (LUCENE-555) Index Corruption

Karel Tejnora
In reply to this post by Sebastian Nagel (Jira)
OK, then:
the indexer indexes each batch into a separate directory (a sequence of
directories, e.g. 1/ 2/ 3/ 4/) with create=true. [transaction log]
It then merges the newly created index into the 'for-search' index.
A backup is a copy of the 'for-search' index.
Roll-forward is then IndexWriter addIndexes(...) over the directories newer than the backup image.
Rolling back to a DATE is a roll-forward (from a backup) up to that date, etc.
The indexed data is kept as chunks of XML files.
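
A rough sketch of the roll-forward selection step, in plain Java so it
does not depend on any particular Lucene version. Everything named here
is hypothetical (the class RollForward, the method batchesToReplay, and
the convention that the backup records the sequence number of the last
batch it contains); the directories it returns would then be opened and
passed to IndexWriter addIndexes(...) as described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for picking batch directories to replay (illustration only). */
final class RollForward {

    /** Batch directories numbered in (backupSeq, targetSeq], in ascending numeric order. */
    static List<Path> batchesToReplay(Path logRoot, int backupSeq, int targetSeq)
            throws IOException {
        try (Stream<Path> entries = Files.list(logRoot)) {
            return entries
                    .filter(Files::isDirectory)
                    // Only purely numeric names such as 1/ 2/ 3/ count as batches.
                    .filter(p -> p.getFileName().toString().matches("\\d{1,9}"))
                    .filter(p -> seq(p) > backupSeq && seq(p) <= targetSeq)
                    .sorted(Comparator.comparingInt(RollForward::seq))
                    .collect(Collectors.toList());
        }
    }

    private static int seq(Path p) {
        return Integer.parseInt(p.getFileName().toString());
    }
}
```

Rolling back to a given point is then the same call against an older
backup, with targetSeq set to the last batch written before that point.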

There is also no problem with keeping the 'for-search' index in two
directories, 0/ and 1/, and using soft links named current and old.
When the following succeeds, swap those links:
IndexWriter path=old/ create=true
IndexWriter.addIndexes(new Directory[] { current, 5/ })

Doug has written somewhere about how Technorati achieved delta backups.

There are a lot of ways to achieve fail-over.

PS: MySQL :) try to work with innodb and move system time backward.

>>- Like all modern database programs, there has to be a way to repair an index. Period.
>>    
>>

