[lucy-user] Indexing error message


[lucy-user] Indexing error message

Edwin Crockford
Repeatedly get errors like this:

/Can't Seek '/home/ipacs/ipacs/index/webdisk/seg_qjqo/highlight.ix' past
EOF (8 > 0)/

Anybody have an idea what is causing this?

Thanks

Edwin

RE: [lucy-user] Indexing error message

Zebrowski, Zak
Hello Edwin,
Seems like the index is trying to read something beyond the end of the file.  My guess is that it's a hard drive error, or a full hard disk at the time the index was being created.
Good luck,
Zak

Reply | Threaded
Open this post in threaded view
|

Re: [lucy-user] Indexing error message

Marvin Humphrey
In reply to this post by Edwin Crockford
On Mon, Dec 8, 2014 at 12:05 PM, Edwin Crockford <[hidden email]> wrote:
> Repeatedly get errors like this:
>
> /Can't Seek '/home/ipacs/ipacs/index/webdisk/seg_qjqo/highlight.ix' past EOF
> (8 > 0)/
>
> Anybody have an idea what is causing this?

The `highlight.ix` virtual file is a sequence of 8-byte file pointers, each of
which points into a variable-size blob in the virtual file `highlight.dat`.
Lucy document numbers for each segment begin at 1, and the length of
`highlight.ix` should be `highest_doc_num * 8`.  If the file's length is 0,
that implies that there are no documents in that segment.

The next step when debugging this is to examine the contents of cfmeta.json
for the specific segment.  Does the segment really contain no documents?  Is
the virtual file `documents.ix`, which follows the same format, also
zero-length?
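The layout described above can be sketched with mock data (the blob contents, the big-endian encoding, and the doc-1 seek interpretation are illustrative assumptions, not real Lucy internals):

```python
import struct

# Mock up a segment's highlight files per the layout described above:
# highlight.ix holds one 8-byte file pointer per document, each
# pointing at a variable-size blob in highlight.dat.
blobs = [b"excerpt 1", b"excerpt 2", b"excerpt 3"]

dat = b""
offsets = []
for blob in blobs:
    offsets.append(len(dat))          # where this doc's blob starts
    dat += blob
ix = b"".join(struct.pack(">Q", off) for off in offsets)

# The invariant stated above: len(ix) == highest_doc_num * 8.
highest_doc_num = len(blobs)
assert len(ix) == highest_doc_num * 8

# Reading the pointer for doc 1 needs bytes [0, 8) of highlight.ix,
# i.e. a seek to offset 8.  Against a zero-length file, that is the
# "8 > 0" failure in the error message.
(doc1_offset,) = struct.unpack(">Q", ix[:8])
print(doc1_offset)  # 0
```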

Marvin Humphrey

Re: [lucy-user] Indexing error message

Edwin Crockford
Hi Marvin,

Thanks for the quick reply, here's a fragment of the cfmeta.json file
for the segment:

{
   "files": {
     "documents.dat": {
       "length": "17556716",
       "offset": "0"
     },
     "documents.ix": {
       "length": "238760",
       "offset": "17556720"
     },
     "highlight.dat": {
       "length": "47793",
       "offset": "17795480"
     },
     "highlight.ix": {
       "length": "0",
       "offset": "17843280"
     },


Not quite sure what the format is, but it has a 0 length for
"highlight.ix", while highlight.dat has a largish length. Is this some
failure in the highlighting mechanism?

Regards
Edwin
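Assuming documents.ix follows the same one-pointer-per-document layout, the lengths in the fragment can be cross-checked with simple arithmetic (numbers copied from the cfmeta.json fragment above):

```python
# Lengths copied from the cfmeta.json fragment above.
documents_ix_len = 238760
highlight_ix_len = 0
highlight_dat_len = 47793

# documents.ix at 8 bytes per document implies the segment's doc count.
assert documents_ix_len % 8 == 0
doc_count = documents_ix_len // 8
print(doc_count)  # 29845 documents

# A consistent highlight.ix would then be doc_count * 8 bytes...
expected_ix_len = doc_count * 8
print(expected_ix_len)  # 238760, yet the recorded length is 0

# ...and highlight.dat is non-empty, so highlight data was written
# even though its index is empty -- the inconsistency in question.
assert highlight_dat_len > 0 and highlight_ix_len == 0
```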



Re: [lucy-user] Indexing error message

Marvin Humphrey
On Mon, Dec 8, 2014 at 12:55 PM, Edwin Crockford <[hidden email]> wrote:

> Hi Marvin,
>
> Thanks for the quick reply, here's a fragment of the cfmeta.json file for
> the segment:
>
> {
>   "files": {
>     "documents.dat": {
>       "length": "17556716",
>       "offset": "0"
>     },
>     "documents.ix": {
>       "length": "238760",
>       "offset": "17556720"
>     },
>     "highlight.dat": {
>       "length": "47793",
>       "offset": "17795480"
>     },
>     "highlight.ix": {
>       "length": "0",
>       "offset": "17843280"
>     },
>
>
> Not quite sure what the format is, but it has a 0 length for "highlight.ix",
> while highlight.dat has a largish length. Is this some failure in the
> highlighting mechanism?

Looking at the code in HighlightWriter.c, nothing jumps out at me.  I can't
see how it's possible to write to highlight.dat without also writing to
highlight.ix.  And for what it's worth, HighlightWriter's codebase has been
largely stable since 2009, receiving only minor modifications.

Disk filling up also seems unlikely, for a few reasons: lots of other files
(e.g. documents.ix) are written at the same time as highlight.ix and don't
exhibit the same problem; a failed flush should be detected when the file
descriptor gets closed; and the "compound files" cf.dat and cfmeta.json get
written *after* highlight.ix, at which point you need *more* disk space.

Instead, I suspect that what we are looking at is a different manifestation
of the same problem we discussed in August 2013.

    http://markmail.org/message/vynzixtoxfxhcx42

    > I believe we have traced the issue back to an interaction between two
    > different systems (one doing bulk updates and another doing on-the-fly
    > single document indexing) attempting updates at the same time. I think
    > there was a way around the locking that caused the problem; does that
    > seem plausible?

    Yes, that makes sense. The error can be explained by having two Indexers
    trying to write to the same segment. One of them happens to delete the
    temp file "lextemp" first, and then the other can't find it and throws
    an exception.

    Only one Indexer may operate on a given index at a time. A
    BackgroundMerger may operate at the same time as an Indexer, but even it
    must acquire the write lock briefly (once at the start of its run and
    once again at the end). While Lucy's locking APIs provide the technical
    capacity to disable the locking mechanism, it is not generally possible
    to get around the need for locking in order to coordinate write access
    to an index.

Generally, when you disable locking and two writers attempt to write the same
segment, the first Indexer will crash before commit() completes and the index
will be left in a consistent state.  If you are unlucky, though, there's a
possibility you'll get corrupt data instead.

For that to happen, the second indexing process would have to start up while
the first was nearly done and in the process of assembling the compound file
`cf.dat` from temporary files such as `highlight.ix`.  The second process
"cleans up" the temp files from the "crashed" first process and initializes
new empty files.  The first process doesn't realize that its own
highlight.ix file has been clobbered and slurps the new empty file into
cf.dat.
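That clobbering sequence can be sketched with plain files standing in for
Lucy's temp and compound files (a schematic of the race only; none of this
is Lucy's actual code):

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
ix_path = os.path.join(workdir, "highlight.ix")
cf_path = os.path.join(workdir, "cf.dat")

# Process A: writes highlight.ix (pointers for 3 docs) and is about
# to assemble the compound file cf.dat from its temp files.
with open(ix_path, "wb") as f:
    f.write(b"\x00" * (8 * 3))

# Process B: starts up, concludes that A "crashed", cleans up A's
# temp files, and initializes a new empty one.
os.remove(ix_path)
open(ix_path, "wb").close()

# Process A: unaware its file was clobbered, slurps the now-empty
# highlight.ix into cf.dat.
with open(ix_path, "rb") as src, open(cf_path, "wb") as dst:
    dst.write(src.read())

cf_len = os.path.getsize(cf_path)
print(cf_len)  # 0 -- a zero-length highlight.ix baked into cf.dat
```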

How was your issue from last year resolved?

Marvin Humphrey

Re: [lucy-user] Indexing error message

Edwin Crockford
Thanks Marvin, this could well be a manifestation of the same issue we
previously talked about. At least it gives me a place to start. I'll talk
with the ops guys about how and when they run the bulk indexer and whether
we can sort out a better way of running it.

Thanks again
Edwin
