Adding attribute to index

Adding attribute to index

Nitasha Walia (niwalia)
Hi,
 
I am a new user of Java Lucene and need to learn how to add a new attribute to the index so that, given a database of emails containing sender information, a keyword search returns:
1. The sender of the email
2. The email.
 
I am using Lucene-2.3.1, and don't know where to start in the huge code base.
 
Can someone please advise on the same?
 
Thanks,

Nitasha Walia
Software Engineer
Product Development


Re: Adding attribute to index

Donna L Gresh
This is "fast and loose" code (from my head; check the syntax). I *highly*
recommend you get a copy of the book Lucene in Action; it will really
help.

To create the index, add a document with two fields: one for the sender
and one for the email text.

IndexWriter indexWriter = new IndexWriter(......

Document emailDoc = new Document();
// Sender: stored, and indexed as one exact term (no analyzer applied)
Field senderField = new Field("sender", senderEmailAddress,
        Field.Store.YES, Field.Index.UN_TOKENIZED);
emailDoc.add(senderField);
// Email text: stored, and analyzed into terms so keywords are searchable
Field textField = new Field("emailText", textOfEmail,
        Field.Store.YES, Field.Index.TOKENIZED);
emailDoc.add(textField);
indexWriter.addDocument(emailDoc);
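
To make the difference between the two index modes concrete, here is a toy model in plain Java (no Lucene dependency). It is only a sketch: the class FieldTermsSketch and the method indexTerms are hypothetical names, and the tokenizer (lowercase, split on anything that is not a letter or digit) is an assumed stand-in for whatever analyzer you pass to the IndexWriter, not Lucene's actual analysis.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Toy model (not Lucene code) of how a field value becomes searchable terms. */
final class FieldTermsSketch {

    /**
     * tokenized = false mimics Field.Index.UN_TOKENIZED: the whole value is one term.
     * tokenized = true roughly mimics Field.Index.TOKENIZED with a lowercasing
     * analyzer (assumption: lowercase, then split on non-alphanumeric characters).
     */
    static List<String> indexTerms(String value, boolean tokenized) {
        if (!tokenized) {
            return Collections.singletonList(value);
        }
        List<String> terms = new ArrayList<String>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) { // split can yield a leading empty string
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Sender kept as a single exact term
        System.out.println(indexTerms("alice@example.com", false));
        // Email text broken into individual keywords
        System.out.println(indexTerms("Quarterly report attached", true));
    }
}
```

In this model an untokenized sender can only be matched by the full address, while the tokenized text can be matched by any single keyword, which is the reason for indexing the two fields differently.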


Then when you are searching, search in the email text field:

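// searcher: an IndexSearcher opened on the index built above.
// TermQuery is not analyzed, so the term must match an indexed token exactly
// (StandardAnalyzer, for example, lowercases tokens).
Query query = new TermQuery(new Term("emailText", "searchTerm"));
Hits hits = searcher.search(query);
if (hits.length() > 0) { // guard against an empty result
    Document doc = hits.doc(0); // best-scoring document
    String emailSender = doc.get("sender");
    String emailText = doc.get("emailText");
}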
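
If you want the sender and text for every matching email rather than only the top one, the idea is to loop over all hits and read the stored fields of each. The following is a plain-Java toy model of that flow, not Lucene code: EmailSearchSketch, add and search are hypothetical names, the tokenizer is an assumed simplification, and results come back in insertion order with no relevance scoring.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy in-memory stand-in (not Lucene) for an index with stored fields. */
final class EmailSearchSketch {
    // Stored fields per document: [0] = sender, [1] = email text
    private final List<String[]> stored = new ArrayList<String[]>();
    // Inverted index over the email text: term -> ids of documents containing it
    private final Map<String, Set<Integer>> postings = new HashMap<String, Set<Integer>>();

    /** Stores both fields and indexes the lowercased words of the text. */
    void add(String sender, String text) {
        int docId = stored.size();
        stored.add(new String[] {sender, text});
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new LinkedHashSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
    }

    /** Returns "sender: text" for every document whose text contains the exact term. */
    List<String> search(String term) {
        List<String> results = new ArrayList<String>();
        Set<Integer> ids = postings.get(term);
        if (ids != null) {
            for (int id : ids) {
                String[] doc = stored.get(id);
                results.add(doc[0] + ": " + doc[1]);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        EmailSearchSketch index = new EmailSearchSketch();
        index.add("alice@example.com", "Budget meeting on Monday");
        index.add("bob@example.com", "Lunch on Friday");
        index.add("carol@example.com", "Revised budget attached");
        for (String hit : index.search("budget")) {
            System.out.println(hit);
        }
    }
}
```

Note that the lookup term is used as-is, so searching for "Budget" with a capital B finds nothing in this model. The same mismatch can occur with an unanalyzed term query against lowercased index terms.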


Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh

Re: Adding attribute to index

Michael Wechner
In reply to this post by Nitasha Walia (niwalia)
Nitasha Walia (niwalia) wrote:

> Hi,
>  
> I am a new user of Java Lucene and need to learn how to add a new
> attribute, such that, given a database of emails, containing sender
> information, searching for a keyword, results in


What kind of database do you use to store your emails?

I am asking because it might make sense to introduce a data
abstraction layer (for example JCR or Yarep) that accesses your
database and has Lucene built in. Then you wouldn't have to worry
about Lucene itself, but could search like this:

Node[] emails = getRepository("emails").search("sender", QUERY);
for (int i = 0; i < emails.length; i++) System.out.print(emails[i].getProperty("body"));

> 1. The sender of the email
> 2. The email.


Otherwise I would suggest starting at

http://lucene.apache.org/java/2_3_1/gettingstarted.html

HTH

Michael

--
Michael Wechner
Wyona      -   Open Source Content Management - Yanel, Yulup
http://www.wyona.com
[hidden email], [hidden email]
+41 44 272 91 61


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: Adding attribute to index

Nitasha Walia (niwalia)
In reply to this post by Donna L Gresh
Thanks !!
