Email and attachments


I am a newbie with Lucene and I am working out the best way to index email data.

An earlier poster discussed indexing attachments and described two alternatives.
However, there is a third alternative:

Each message/attachment is indexed as a separate Document with the email header
data included in all Documents.  The drawback of this approach seems to be that
it is not possible to make AND searches between two body parts in different
documents directly in Lucene (or is it?).
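
One way the AND limitation might be worked around is in application code: run
one search per term, collect the message IDs of the matching parts, and
intersect the two sets. The sketch below is plain Java and does not use the
Lucene API. The names PartDoc, messagesMatching and andAcrossParts are
hypothetical, and the substring match is only a stand-in for a real index
query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class PartIndexSketch {

    // One indexed unit per message body or attachment (hypothetical model).
    // The header data (messageId, subject) is copied into every part.
    public static final class PartDoc {
        final String messageId;
        final String subject;
        final String partName;
        final String text;

        public PartDoc(String messageId, String subject, String partName, String text) {
            this.messageId = messageId;
            this.subject = subject;
            this.partName = partName;
            this.text = text;
        }
    }

    // Stand-in for a single-term search: IDs of messages that have at
    // least one part whose text contains the term (case-insensitive).
    public static Set<String> messagesMatching(List<PartDoc> index, String term) {
        Set<String> ids = new TreeSet<>();
        String needle = term.toLowerCase(Locale.ROOT);
        for (PartDoc d : index) {
            if (d.text.toLowerCase(Locale.ROOT).contains(needle)) {
                ids.add(d.messageId);
            }
        }
        return ids;
    }

    // AND across parts: intersect the message ID sets of the two searches.
    public static Set<String> andAcrossParts(List<PartDoc> index, String termA, String termB) {
        Set<String> result = messagesMatching(index, termA);
        result.retainAll(messagesMatching(index, termB));
        return result;
    }

    // Two made-up messages; only m1 has both terms, in different parts.
    public static List<PartDoc> sampleIndex() {
        List<PartDoc> index = new ArrayList<>();
        index.add(new PartDoc("m1", "Q3 numbers", "body", "Please find the invoice attached"));
        index.add(new PartDoc("m1", "Q3 numbers", "plan.txt", "Budget for Q3"));
        index.add(new PartDoc("m2", "Reminder", "body", "Invoice overdue"));
        return index;
    }

    public static void main(String[] args) {
        System.out.println(andAcrossParts(sampleIndex(), "invoice", "budget"));
    }
}
```

The cost is two searches plus a join on the message ID outside the index, so
it is a workaround rather than a true cross-document AND.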

One advantage of this approach is that it is then possible to use a different
Analyzer for each Document, which is useful when the attachments contain data in
different languages.
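
To illustrate what per-Document analysis buys, here is a rough sketch in plain
Java, not the Lucene Analyzer API. It assumes each part carries a language tag,
and the tiny stop-word lists are made-up stand-ins for real language-specific
analyzers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class PerPartAnalysisSketch {

    // Hypothetical stop-word lists, keyed by language tag.
    static final Map<String, Set<String>> STOP_WORDS = new HashMap<>();
    static {
        STOP_WORDS.put("en", new HashSet<>(Arrays.asList("the", "and", "is")));
        STOP_WORDS.put("de", new HashSet<>(Arrays.asList("der", "und", "ist")));
    }

    // Lowercase, split on non-letter/non-digit characters, and drop the
    // stop words of the given language. Unknown languages drop nothing.
    public static List<String> analyze(String language, String text) {
        Set<String> stop = STOP_WORDS.getOrDefault(language, Collections.<String>emptySet());
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !stop.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Each part is analyzed with the rules for its own language.
        System.out.println(analyze("en", "The contract is ready"));
        System.out.println(analyze("de", "Der Vertrag ist fertig"));
    }
}
```

Analyzing the German part with the English rules would leave "der" and "ist"
in the token list, which is the kind of mismatch a single analyzer for
everything would cause.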

If all attachments are combined into a single body field, only one analyzer can
be applied to that text: either the index-level or the per-Document analyzer.

Has anyone used this type of approach and does it work?

