lucene search sentence

lucene search sentence

anton feldmann
Hi

I wrote an Indexer that indexes the full contents of a text, and each
sentence is put into a separate Document.

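"Document document = new Document();
// A Reader-valued field is tokenized and indexed, but not stored.
document.add(new Field("contents", reader));

// split() takes a regex: break after a literal period followed by whitespace.
for (String sentence : contents.split("(?<=\\.)\\s+")) {
    Document doc = new Document();
    doc.add(new Field("sentence", sentence, Field.Store.YES, Field.Index.TOKENIZED));
    writer.addDocument(doc); // writer: an open IndexWriter, assumed to exist elsewhere
}"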

1) How do I write a Lucene search and display all the hits in a
document?
2) How do I display the sentence the hit is in, and color the hit?
3) How do I display the sentence before and after the sentence the hit
is in?

Cheers

anton


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: lucene search sentence

Grant Ingersoll
Anton,

Please don't cross-post "How do I..." questions to the dev list. It
doesn't get you anywhere and just annoys those most likely to help you.

See below.

-Grant
Anton Feldmann wrote:

> Hi
>
> I wrote an Indexer that indexes the full contents of a text, and each
> sentence is put into a separate Document.
>
> "Document document = new Document();
> // A Reader-valued field is tokenized and indexed, but not stored.
> document.add(new Field("contents", reader));
>
> // split() takes a regex: break after a literal period followed by whitespace.
> for (String sentence : contents.split("(?<=\\.)\\s+")) {
>     Document doc = new Document();
>     doc.add(new Field("sentence", sentence, Field.Store.YES, Field.Index.TOKENIZED));
>     writer.addDocument(doc); // writer: an open IndexWriter, assumed to exist elsewhere
> }"
>
> 1) How do I write a Lucene search and display all the hits in a
> document?
>  
SpanQuery can give you information about where matches take place.  If
you are looking for a more basic answer, then refer to the demo on how
to do a search that returns Hits or the well-written "Lucene In Action".

> 2) How do I display the sentence the hit is in, and color the hit?
>  
Use the Highlighter contrib package.
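
For illustration only, here is a minimal plain-Java sketch of what "coloring the hit" amounts to once you have the stored sentence and the query term in hand. It is not the contrib Highlighter, which works from the parsed query and the Analyzer; the class name HitMarker, the method mark, and the markup tags are all made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HitMarker {
    /** Wraps every whole-word, case-insensitive occurrence of term in open/close. */
    static String mark(String sentence, String term, String open, String close) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(sentence);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps any '$' or '\' in the tags or match literal
            m.appendReplacement(out,
                    Matcher.quoteReplacement(open + m.group() + close));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical usage: color the term red in an HTML page.
        System.out.println(mark("Lucene is fast. I like lucene.", "lucene",
                "<font color=\"red\">", "</font>"));
    }
}
```

A plain string match like this only sees the literal term, so it will miss stemmed or otherwise analyzed variants that the real Highlighter would catch.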

> 3) How do I display the sentence before and after the sentence the hit
> is in?
>  
Not sure.  You probably need some way of keeping track of where the
sentences occur.  See my previous answer to a similar question you asked
about how to index and search sentences.  I, personally, think you need
to have a Document per sentence, with some metadata fields about where
that sentence takes place, but others may have alternate ideas.  You
_could_, instead of having each field be named "sentence", have the
field name reflect which sentence it is, along with a catch all field,
but this would make querying a lot harder.
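
A rough sketch of the Document-per-sentence idea in plain Java, with a map standing in for the index. The metadata names (docId, sentenceNo) and the class SentenceStore are assumptions for illustration only; in Lucene they would be extra stored fields on each sentence Document, and fetching the neighbours would be a second lookup on those fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SentenceStore {
    // key: docId + "#" + sentenceNo (both hypothetical metadata fields)
    private final Map<String, String> byKey = new HashMap<String, String>();

    /** Splits text after '.', '!' or '?' followed by whitespace and records each sentence. */
    void add(String docId, String text) {
        String[] sentences = text.trim().split("(?<=[.!?])\\s+");
        for (int i = 0; i < sentences.length; i++) {
            byKey.put(docId + "#" + i, sentences[i]);
        }
    }

    /** Returns the previous, hit, and next sentence; missing neighbours are skipped. */
    List<String> context(String docId, int sentenceNo) {
        List<String> out = new ArrayList<String>();
        for (int n = sentenceNo - 1; n <= sentenceNo + 1; n++) {
            String s = byKey.get(docId + "#" + n);
            if (s != null) {
                out.add(s);
            }
        }
        return out;
    }
}
```

With the sentence number kept as metadata, a hit on sentence n only needs n - 1 and n + 1 looked up in the same source document.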

> Cheers
>
> anton

--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org 
Voice:  315-443-5484
Fax: 315-443-6886


Re: lucene search sentence

steve_rowe
In reply to this post by anton feldmann
Anton Feldmann wrote:
> 3) How do I display the sentence before and after the sentence the hit
> is in?

You could:

1. Make your Lucene Document be a set of three sentences (before,
searchable, after), which you store, but write a custom Analyzer which
only returns tokens for the "searchable" central sentence.

2. Store the full document contents outside of Lucene, and make your
Lucene Document be a single sentence, the tokens from which you will
index, but also include offset and length Fields for the previous and
next sentences with the Document, corresponding to the windows from the
full document that you want to display with the hit.  This one will
likely work better with the Highlighter package.
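
Option 2 might look roughly like the plain-Java sketch below, which only shows the offset bookkeeping and no Lucene calls. The names SentenceWindows, spans, and window are hypothetical; in practice each {offset, length} pair would go into stored Fields on the sentence's Document, and the full text would be read from wherever you keep it outside the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SentenceWindows {
    /** Returns one {offset, length} pair per sentence of text. */
    static List<int[]> spans(String text) {
        List<int[]> spans = new ArrayList<int[]>();
        // A gap is whitespace that follows sentence-ending punctuation.
        Matcher gap = Pattern.compile("(?<=[.!?])\\s+").matcher(text);
        int start = 0;
        while (gap.find()) {
            spans.add(new int[] { start, gap.start() - start });
            start = gap.end();
        }
        if (start < text.length()) {
            spans.add(new int[] { start, text.length() - start });
        }
        return spans;
    }

    /** Cuts the window "previous sentence .. next sentence" around sentence i. */
    static String window(String text, List<int[]> spans, int i) {
        int[] first = spans.get(Math.max(i - 1, 0));
        int[] last = spans.get(Math.min(i + 1, spans.size() - 1));
        return text.substring(first[0], last[0] + last[1]);
    }

    public static void main(String[] args) {
        String text = "One. Two. Three. Four.";
        System.out.println(window(text, spans(text), 2));
    }
}
```

Because the window is cut from the original text, whatever highlighting you apply to the hit sentence can be applied to the same string before display.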

Steve
