Hi, I have an application that needs to mark retrieved documents as read,
so that next time I don't have to read the marked documents again.
One idea is to add a dedicated field to the indexed document. But since
Lucene has no update method, I would have to delete the document and add it
again, which seems a little stupid. Alternatively, I could use a database
to track the marks, but how would the database relate to the Lucene index,
especially when I want to retrieve documents that I have already read?
Maybe there is a better idea.
This one's fairly wild, I'm interested to see what the gurus think...
You could create a bitset and mark each retrieved document by setting the
bit at the position given by its Lucene document id. Persist this bitset
(assuming you need it across sessions). Be careful: I wouldn't persist it
via toString(); persist it as binary data instead. Whether this is practical
depends on how many docs we're talking about (it costs one bit per doc), I
guess...
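To make the "persist it as binary" point concrete, here is a minimal sketch using only java.util.BitSet and java.nio.file. The class name ReadMarkStore and the file handling are my own assumptions for illustration, not anything Lucene provides:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.BitSet;

// Hypothetical helper: stores the "already read" bitset as raw bytes.
final class ReadMarkStore {

    // Write the bits themselves, not the toString() form such as "{3, 42}".
    static void save(BitSet read, Path file) throws IOException {
        Files.write(file, read.toByteArray());
    }

    // Return an empty bitset when nothing has been saved yet.
    static BitSet load(Path file) throws IOException {
        if (!Files.exists(file)) {
            return new BitSet();
        }
        return BitSet.valueOf(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-marks", ".bin");
        BitSet read = new BitSet();
        read.set(3);   // pretend doc ids 3 and 42 have been read
        read.set(42);
        save(read, file);
        System.out.println(load(file).equals(read)); // prints true
    }
}
```

At one bit per document this stays small: a million documents is roughly 125 KB on disk.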
Anyway, let's say you have accumulated one of these. Create a filter from
the complement of the persisted bitset (that is, the bits that are NOT set,
which are the unread documents), and pass that filter on to subsequent
searches. When the search comes back, set the bits for the returned
documents in your (persisted) bitset and save it away. Repeat as needed...
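The bit manipulation for that loop might look roughly like the sketch below. It only uses java.util.BitSet; the names UnreadMask, unread and markRead are hypothetical, maxDoc is assumed to be the number of document ids in the index, and how you wrap the resulting mask in a Lucene filter depends on your Lucene version, so that part is left out:

```java
import java.util.BitSet;

// Hypothetical helper: derives the "unread" mask and records new hits.
final class UnreadMask {

    // Bits set in the result are documents that have NOT been read yet.
    static BitSet unread(BitSet read, int maxDoc) {
        BitSet mask = new BitSet(maxDoc);
        mask.set(0, maxDoc);  // start with every doc id in [0, maxDoc)
        mask.andNot(read);    // drop the ones already read
        return mask;
    }

    // After a search, record the returned doc ids as read.
    static void markRead(BitSet read, int[] hitDocIds) {
        for (int id : hitDocIds) {
            read.set(id);
        }
    }

    public static void main(String[] args) {
        BitSet read = new BitSet();
        read.set(1);
        read.set(3);
        System.out.println(unread(read, 5)); // prints {0, 2, 4}

        markRead(read, new int[] {0, 2});    // pretend the search returned 0 and 2
        System.out.println(unread(read, 5)); // prints {4}
    }
}
```

Building the mask with set plus andNot, rather than flipping the persisted bitset in place, keeps the stored "read" bits untouched and never yields ids at or beyond maxDoc.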
I have no idea if this would help in your particular situation... And
Lucene document ids are not stable, so any time your index changed, any
persisted bitsets would be invalid.
Anyway, it may even work. See the Filters in Lucene for what filters are all