How to search in a sentence and display the whole sentence

anton feldmann
I want to search for a word or a word pair in a sentence or a
paragraph and then display the whole sentence that contains the match.
I assume I need to extend Lucene to make this possible, but I do not
know where to start: I have no idea how I would have to change the
index file format, or whether such a change would be acceptable to the
Lucene team.

cheers

anton feldmann


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: How to search in a sentence and display the whole sentence

Grant Ingersoll
Anton,

I think there are at least a couple of ways of doing this.  I assume you
have a program that does sentence detection already, as Lucene does not
provide this.  If not, I am sure a search of the web will find one that
has high accuracy.
You can:
1. Index each sentence as a separate Document.  You will need a field on
the Document relating it back to the overall file so you can reconstruct it.
2. As you index, insert sentence/paragraph boundary markers into your
index and then use the SpanQuery functionality.  Search this mail
archive for sentence boundary detection and Span Query (try the dev list
too).  I think there was a discussion between me, Doug and Hoss on how
to do this.
3. Search as you do now and then post-process the hits to figure out
which sentence each one came from.  This will be inefficient, but I
don't know what your performance requirements are, so it may work for you.
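As a rough illustration of option 1, the sketch below splits a file into sentence-sized units, each carrying a reference back to its source file. It uses only the JDK's java.text.BreakIterator as a stand-in sentence detector. The names SentenceUnit and SentenceSplitter are hypothetical, and turning each unit into a Lucene Document is left out on purpose.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical holder for what would become one Lucene Document per sentence.
final class SentenceUnit {
    final String sourceFile; // back-reference to the original file
    final int sentenceNo;    // order of the sentence within that file
    final String text;       // the sentence itself (the field you would search)

    SentenceUnit(String sourceFile, int sentenceNo, String text) {
        this.sourceFile = sourceFile;
        this.sentenceNo = sentenceNo;
        this.text = text;
    }
}

final class SentenceSplitter {
    // Splits content into sentences with the JDK's rule-based BreakIterator.
    // A dedicated sentence detector may be more accurate; this is only a stand-in.
    static List<SentenceUnit> split(String sourceFile, String content) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(content);
        List<SentenceUnit> units = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = content.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                units.add(new SentenceUnit(sourceFile, units.size(), sentence));
            }
        }
        return units;
    }

    public static void main(String[] args) {
        String content = "Lucene is a search library. It is written in Java.";
        for (SentenceUnit u : split("notes.txt", content)) {
            System.out.println(u.sourceFile + "#" + u.sentenceNo + ": " + u.text);
        }
    }
}
```

Keeping the file name and the sentence number on every unit is what lets a hit be shown as a whole sentence and still be traced back to, or reassembled into, the original file.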

There are probably other ways too.
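For option 3, a minimal sketch of the post-processing step could look like the following. It assumes the full text of a matching document is available as a String and, for simplicity, locates hits with a plain case-insensitive substring search rather than with Lucene's own match information. SentenceLookup and its methods are hypothetical names; the sentence boundaries again come from the JDK's BreakIterator.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SentenceLookup {
    // Returns the sentence that covers the given character offset.
    static String sentenceAt(String content, int offset) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(content);
        int end = it.following(offset); // first boundary after the hit
        int start = it.previous();      // boundary just before that one
        return content.substring(start, end).trim();
    }

    // Collects every distinct sentence containing the term, in document order.
    // This is a substring match, not a token match, so "span" also hits "spanned".
    static List<String> sentencesContaining(String content, String term) {
        Set<String> result = new LinkedHashSet<>();
        String needle = term.toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return new ArrayList<>(result);
        }
        String haystack = content.toLowerCase(Locale.ROOT);
        for (int hit = haystack.indexOf(needle); hit >= 0;
                 hit = haystack.indexOf(needle, hit + 1)) {
            result.add(sentenceAt(content, hit));
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        String content = "Lucene indexes text. Span queries use positions. "
                       + "Lucene is written in Java.";
        for (String s : sentencesContaining(content, "lucene")) {
            System.out.println(s);
        }
    }
}
```

The cost mentioned above shows up here directly: every matching document has to be loaded and scanned again after the search has already run.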

--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org 
Voice:  315-443-5484
Fax: 315-443-6886


Re: How to search in a sentence and display the whole sentence

anton feldmann
Are field names within a document unique, or can I create a field named
"sentence" for each sentence in a text document?

Re: How to search in a sentence and display the whole sentence

Erik Hatcher

On Apr 26, 2006, at 6:20 PM, anton feldmann wrote:
> Are field names within a document unique, or can I create a field named
> "sentence" for each sentence in a text document?

A field name identifies one logical field in a document, but you can
add multiple instances of the same field name.  If the field is stored,
you can also retrieve all of its values as an array.  You do have to be
careful, though: a phrase query can match across these field instances
unless you specify a positional gap between them large enough to
prevent it.
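To see why the gap matters, here is a small plain-Java model of token positions. It is not Lucene code. It numbers the tokens of several values of one field consecutively, optionally skipping `gap` positions between values, and treats a two-word phrase as a match when the words sit at adjacent positions. PositionGapDemo and its methods are hypothetical; in Lucene itself the gap is, as far as I know, supplied by overriding Analyzer.getPositionIncrementGap(String fieldName).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class PositionGapDemo {
    // Assigns a position to every token of every value of one field.
    // Between two values, `gap` extra positions are skipped.
    static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int pos = 0;
        for (int v = 0; v < values.size(); v++) {
            if (v > 0) {
                pos += gap;
            }
            for (String token : values.get(v).toLowerCase(Locale.ROOT).split("\\s+")) {
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(pos++);
            }
        }
        return positions;
    }

    // A two-word phrase matches when `second` directly follows `first`.
    static boolean phraseMatches(Map<String, List<Integer>> positions,
                                 String first, String second) {
        List<Integer> a = positions.getOrDefault(first, Collections.<Integer>emptyList());
        List<Integer> b = positions.getOrDefault(second, Collections.<Integer>emptyList());
        for (int p : a) {
            if (b.contains(p + 1)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two "sentence" values added to the same document.
        List<String> sentences = Arrays.asList("the quick fox", "jumps over");
        System.out.println("gap 0:   " + phraseMatches(index(sentences, 0), "fox", "jumps"));
        System.out.println("gap 100: " + phraseMatches(index(sentences, 100), "fox", "jumps"));
    }
}
```

With no gap, "fox" (end of the first value) and "jumps" (start of the second) are neighbours, so the phrase matches across the sentence boundary. With a gap of 100 they are far apart, while phrases inside a single value still match.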

        Erik

