Ask for a better solution for the case

hu andy
Hi, I have an application that needs to mark the retrieved documents which
have been read, so that next time I don't need to read the marked documents
again.

    One idea is to add a particular field to the indexed document. But as
Lucene has no update method, I would have to delete that document and add it
again, which seems a little clumsy. Alternatively, I could use a database to
record the marks, but how would the database relate to the Lucene index,
especially when I want to retrieve the documents that I have already read?
Maybe there is a better idea.

    Any suggestion will be greatly appreciated.
Re: Ask for a better solution for the case

Erick Erickson
This one's fairly wild, I'm interested to see what the gurus think...

You could create a bitset and mark each retrieved document by setting the
bit at its Lucene document id. Persist this bitset if you need it across
sessions. Be careful: I wouldn't persist it via toString(); persist it as
binary data instead. Whether this is practical depends on how many docs
we're talking about, I guess.
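
For what it's worth, here is a rough sketch of the "persist it as binary"
part using nothing but java.util.BitSet. The class name ReadMarkStore and
the file handling are made up for illustration, and it assumes Java 7 or
later for toByteArray() and valueOf(). Treat it as one possible way to do
it, not as anything Lucene provides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.BitSet;

// Hypothetical helper (not part of Lucene): stores the "already read"
// marks as raw bytes rather than as the text produced by toString().
final class ReadMarkStore {
    private ReadMarkStore() {}

    // Write the bitset in its packed binary form (eight document ids per byte).
    static void save(BitSet readDocs, Path file) throws IOException {
        Files.write(file, readDocs.toByteArray());
    }

    // Load a previously saved bitset. A missing file is treated as
    // "nothing has been read yet" (an assumption made for this sketch).
    static BitSet load(Path file) throws IOException {
        if (!Files.exists(file)) {
            return new BitSet();
        }
        return BitSet.valueOf(Files.readAllBytes(file));
    }
}
```

The binary form grows with the highest document id that has been marked,
so its size is roughly that id divided by eight, in bytes.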

Anyway, let's say you have accumulated one of these. Create a filter from
the complement of the persisted bitset (flip every bit, i.e. XOR it against
an all-ones bitset), and pass that filter on to subsequent searches. When
the search comes back, set the bits for the returned documents in your
(persisted) bitset and save it away. Repeat as needed.
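
The bit arithmetic behind that could look something like the sketch below.
It only uses java.util.BitSet; the class UnreadFilter, its method names and
the maxDoc parameter are invented for the example. Plugging the result into
an actual Lucene Filter is left out on purpose, since the Filter API depends
on the Lucene version you are running.

```java
import java.util.BitSet;

// Hypothetical helper (not part of Lucene): plain BitSet arithmetic only.
final class UnreadFilter {
    private UnreadFilter() {}

    // Returns a bitset whose set bits are the documents a search may still
    // return. maxDoc is assumed to be the number of document ids in the
    // index, so valid ids are 0 .. maxDoc - 1.
    static BitSet unreadBits(BitSet readDocs, int maxDoc) {
        BitSet unread = new BitSet(maxDoc);
        unread.set(0, maxDoc);     // start with every document allowed
        unread.andNot(readDocs);   // clear the ones that were already read
        return unread;
    }

    // After a search, record the ids of the hits that were just read.
    static void markRead(BitSet readDocs, int[] hitDocIds) {
        for (int id : hitDocIds) {
            readDocs.set(id);
        }
    }
}
```

Each search would call unreadBits() to build the filter bits, then
markRead() with the ids of the hits, and then save the updated bitset.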

I have no idea if this would help in your particular situation. Also, any
time your index changes, any persisted bitsets become invalid, because
Lucene document ids are not stable across index modifications.

Anyway, it may even work. See the Filter class in the Lucene documentation
for what filters are all about.

Erick
Re: Ask for a better solution for the case

Doug Cutting
In reply to this post by hu andy
hu andy wrote:
> Hi, I have an application that needs to mark the retrieved documents which
> have been read, so that next time I don't need to read the marked documents
> again.

You could mark the documents as deleted, then later clear deletions.  So
long as you don't close the IndexReader, the deletions will never be
flushed to disk.

Doug
