How do i get a text summary


How do i get a text summary

Ravinder.Teepiredddy
Hi All,

Is there a way to get a text summary of an indexed document to display
along with the search results?

Please let me know what technical changes are needed.

Thanks,
Ravinder

 



DISCLAIMER:
This message contains privileged and confidential information and is intended only for an individual named. If you are not the intended recipient, you should not disseminate, distribute, store, print, copy or deliver this message. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete or contain viruses. The sender, therefore,  does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission. If verification is required, please request a hard-copy version.

Re: How do i get a text summary

ownedthx
Am I missing something? Isn't this exactly what Lucene does?

Put in a value when you create your Document, and get it back out when it comes
back from a search, right?

Want a text summary? Put it into the document...

I just started playing with Lucene so maybe I'm missing something, but this
question seems quite fundamental to what Lucene is all about.

On Wed, Feb 27, 2008 at 8:57 PM, <[hidden email]>
wrote:

> Hi All,
>
>
>
> Is there a way to get a text summary of an indexed document to display
> along with the search result?
>
> Please let me know the technical changes.
>
>
>
> Thanks,
>
> Ravinder




RE: How do i get a text summary

John Griffin-3
In reply to this post by Ravinder.Teepiredddy
Ravinder,

If you want something from an index it has to be IN the index. So, store a
summary field in each document and make sure that field is part of the
query.

John G.

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]]
Sent: Wednesday, February 27, 2008 7:58 PM
To: [hidden email]
Subject: How do i get a text summary

Hi All,

 

Is there a way to get a text summary of an indexed document to display
along with the search result?

Please let me know the technical changes.

 

Thanks,

Ravinder

 





---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: How do i get a text summary

Ravinder.Teepiredddy
Hi John,

The summary value comes back null in my results.jsp page, and I need the
"snippet" or "fragment" to be highlighted.
I have gone through the related Lucene FAQ entries, but it is still not clear.
I would appreciate help identifying which Java files need to be modified.

Thanks in advance.
Ravinder

-----Original Message-----
From: John Griffin [mailto:[hidden email]]
Sent: Thursday, February 28, 2008 11:50 AM
To: [hidden email]
Subject: RE: How do i get a text summary

Ravinder,

If you want something from an index it has to be IN the index. So, store a
summary field in each document and make sure that field is part of the
query.

John G.


RE: How do i get a text summary

spring
In reply to this post by John Griffin-3
> If you want something from an index it has to be IN the index. So, store a
> summary field in each document and make sure that field is part of the
> query.

And how could one create such a summary automatically?
Taking the first two lines of a document does not always make much sense.
How does Google do this?

Thank you.



Re: How do i get a text summary

Mathieu Lecarme
[hidden email] wrote:
>> If you want something from an index it has to be IN the index. So, store a
>> summary field in each document and make sure that field is part of the
>> query.
>
> And how could one create such a summary automatically?

Have a look at http://alias-i.com/lingpipe/index.html or
http://www.nzdl.org/Kea/
Summarization is a data-mining problem.

> Taking the first two lines of a document does not always make much sense.
> How does Google do this?

A simpler way is to show the text context: n words before the match and n
words after it.
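
As a rough illustration of the "n words before, n words after" idea, here is
a minimal sketch in plain Java. It is not part of Lucene: the class name
ContextSnippet and the method snippet are made up for this example, and it
assumes whitespace-separated words and a single-word term matched
case-insensitively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Keyword-in-context sketch: n words before and n words after the first match. */
final class ContextSnippet {

    /**
     * Returns up to n words on each side of the first word equal to term
     * (case-insensitive, ignoring surrounding punctuation).
     * Returns an empty string when the term does not occur.
     */
    static String snippet(String text, String term, int n) {
        String[] words = text.trim().split("\\s+");
        String needle = term.toLowerCase(Locale.ROOT);
        for (int i = 0; i < words.length; i++) {
            // Strip leading/trailing punctuation so "index." still matches "index".
            String bare = words[i].replaceAll("^\\W+|\\W+$", "").toLowerCase(Locale.ROOT);
            if (bare.equals(needle)) {
                int from = Math.max(0, i - n);
                int to = Math.min(words.length, i + n + 1);
                List<String> window = new ArrayList<>();
                for (int j = from; j < to; j++) {
                    window.add(words[j]);
                }
                // Mark truncation on either side with "...".
                return (from > 0 ? "... " : "")
                        + String.join(" ", window)
                        + (to < words.length ? " ..." : "");
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String doc = "Lucene stores fields in the index. "
                + "A stored summary field can be shown with each hit.";
        System.out.println(snippet(doc, "summary", 2));
    }
}
```

With n = 2 the sample in main prints "... A stored summary field can ...".
A real implementation would run on the stored field text of each hit and
would probably merge windows for several query terms.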

M.


RE: How do i get a text summary

Donna L Gresh
In reply to this post by spring
I think you may want to look into the Highlighter. It allows you to show
the "relevant" bits of the document which contributed to the document
being matched to the query. It does a pretty good job. Of course it does
not create a "summary" but it does give you a good idea of why the
document was hit.
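
The contrib Highlighter works from the query and the analyzed text and picks
the best fragments for you. Purely to illustrate the marking step, here is a
standalone sketch in plain Java that does not use the Lucene API at all:
SimpleHighlight and highlight are hypothetical names, the terms are assumed
to be plain words, and the text is assumed to be HTML-escaped already.

```java
import java.util.regex.Pattern;

/** Stand-in for the marking step only; this is NOT Lucene's Highlighter. */
final class SimpleHighlight {

    /** Wraps each whole-word, case-insensitive occurrence of any term in <b>...</b>. */
    static String highlight(String text, String... terms) {
        if (terms.length == 0) {
            return text;
        }
        // Build an alternation of literal terms: \Qterm1\E|\Qterm2\E|...
        StringBuilder alternation = new StringBuilder();
        for (String term : terms) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(term));
        }
        Pattern pattern = Pattern.compile(
                "\\b(" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        // $1 re-inserts the matched text, so the original casing is preserved.
        return pattern.matcher(text).replaceAll("<b>$1</b>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("Store a summary field", "summary", "field"));
    }
}
```

Unlike the real Highlighter, this sketch knows nothing about analyzers,
stemming or phrase queries, so it only matches the literal words it is given.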

http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/highlight/Highlighter.html

Donna Gresh


<[hidden email]> wrote on 02/28/2008 07:42:40 AM:

> > If you want something from an index it has to be IN the index. So, store a
> > summary field in each document and make sure that field is part of the
> > query.
>
> And how could one create such a summary automatically?
> Taking the first two lines of a document does not always make much sense.
> How does Google do this?
>
> Thank you.

Re: How do i get a text summary

Karl Wettin
In reply to this post by spring
[hidden email] wrote:
>> If you want something from an index it has to be IN the index. So, store a
>> summary field in each document and make sure that field is part of the
>> query.
>
> And how could one create such a summary automatically?
> Taking the first two lines of a document does not always make much sense.
> How does Google do this?

Google doesn't summarize; it highlights the parts that match the query. See
the previous responses.

If you really want to summarize, there are a number of more and less
scientific ways to figure out what's important and what's not.

Very simple algorithmic solutions usually involve ranking the top sentences
by looking at the distribution of terms in sentences, paragraphs and the
whole document. I implemented something like this a couple of years back
that worked fairly well.

Citeseer is a great source for papers on pretty much any IR related
subject: <http://citeseer.ist.psu.edu/cs?cs=1&q=text+summarization>


    karl


Re: How do i get a text summary

Eric Th
Hi Karl,
Where can I find an introduction to the algorithm below? Thanks.
"Very simple algorithmic solutions usually involve ranking the top sentences
by looking at the distribution of terms in sentences, paragraphs and the
whole document. I implemented something like this a couple of years back
that worked fairly well."



2008/2/29, Karl Wettin <[hidden email]>:

>
> [hidden email] wrote:
>
> >> If you want something from an index it has to be IN the index. So, store a
> >> summary field in each document and make sure that field is part of the
> >> query.
> >
> > And how could one create such a summary automatically?
> > Taking the first two lines of a document does not always make much sense.
> > How does Google do this?
>
> Google doesn't summarize; it highlights the parts that match the query. See
> the previous responses.
>
> If you really want to summarize, there are a number of more and less
> scientific ways to figure out what's important and what's not.
>
> Very simple algorithmic solutions usually involve ranking the top sentences
> by looking at the distribution of terms in sentences, paragraphs and the
> whole document. I implemented something like this a couple of years back
> that worked fairly well.
>
> Citeseer is a great source for papers on pretty much any IR related
> subject: <http://citeseer.ist.psu.edu/cs?cs=1&q=text+summarization>

Re: How do i get a text summary

Karl Wettin
h t wrote:
> Where can I find an introduction to the algorithm below? Thanks.

I can't recall where I picked it up, but something like this:

Score terms by count and distribution. A term occurring 20 times in the
same paragraph is not as important as a term occurring 20 times over 10
paragraphs. Similar terms affect each other's scores (I used n-grams and I
also detected abbreviations). Be sure to remove stop words using some
term reduction algorithm. Language detection makes sense. Rank sentences
by length-normalized term score. The top n sentences are your summary. If
enough of these sentences are part of the same paragraph and the
paragraph is small enough, that paragraph is your summary instead.
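
A stripped-down sketch of those steps in plain Java might look like the
following. It is only one reading of the description, not the original
implementation: the names are invented for the example, the stop list is a
tiny English placeholder, a term's score is assumed to be its total count
multiplied by the number of paragraphs it appears in, and the n-gram
similarity, abbreviation detection, language detection and
paragraph-replacement steps are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Extractive summary sketch: rank sentences by length-normalized term score. */
final class SentenceRankSummary {

    // Tiny illustrative stop list (an assumption); a real one depends on the language.
    private static final Set<String> STOP = new HashSet<>(Arrays.asList(
            "a", "an", "and", "are", "as", "at", "be", "by", "for", "in", "is",
            "it", "of", "on", "or", "that", "the", "this", "to", "with"));

    /** Lowercases, splits on non-alphanumerics and drops stop words. */
    private static List<String> terms(String s) {
        List<String> out = new ArrayList<>();
        for (String w : s.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty() && !STOP.contains(w)) {
                out.add(w);
            }
        }
        return out;
    }

    /** Returns the n best-scoring sentences, in their original document order. */
    static List<String> summarize(String text, int n) {
        Map<String, Integer> count = new HashMap<>();   // occurrences in the document
        Map<String, Integer> spread = new HashMap<>();  // paragraphs containing the term
        List<String> sentences = new ArrayList<>();

        // Paragraphs are assumed to be separated by blank lines.
        for (String paragraph : text.trim().split("\\n\\s*\\n")) {
            Set<String> seen = new HashSet<>();
            for (String t : terms(paragraph)) {
                count.merge(t, 1, Integer::sum);
                if (seen.add(t)) {
                    spread.merge(t, 1, Integer::sum);
                }
            }
            // Naive sentence split: whitespace that follows '.', '!' or '?'.
            for (String s : paragraph.trim().split("(?<=[.!?])\\s+")) {
                if (!s.isEmpty()) {
                    sentences.add(s.trim());
                }
            }
        }

        // Sentence score = sum of (count * spread) over its terms / number of terms.
        double[] score = new double[sentences.size()];
        Integer[] order = new Integer[sentences.size()];
        for (int i = 0; i < score.length; i++) {
            List<String> ts = terms(sentences.get(i));
            double sum = 0;
            for (String t : ts) {
                sum += count.getOrDefault(t, 0) * spread.getOrDefault(t, 0);
            }
            score[i] = ts.isEmpty() ? 0 : sum / ts.size();
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> Double.compare(score[y], score[x]));

        // Keep the top n indices, then restore document order for readability.
        List<Integer> top = new ArrayList<>(
                Arrays.asList(order).subList(0, Math.min(n, order.length)));
        Collections.sort(top);
        List<String> result = new ArrayList<>();
        for (int i : top) {
            result.add(sentences.get(i));
        }
        return result;
    }
}
```

Multiplying count by paragraph spread is just one simple way to make a term
that is spread over many paragraphs outweigh one that is concentrated in a
single paragraph; other weightings would fit the description equally well.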

I hope this helps.


     karl


