Weighting on words

Jean-Christophe Alleman

Hi,

I need to know about another Nutch feature that is important for me.

Is it possible with Nutch to change the weight of a word? Let me explain:

If a word of the query appears in the URL of a document, can the
administrator manually increase the weight of that document so that
it is displayed among the first results?

For example: I search for "Harry Potter". There is a document
HarryPotter.html in the directory ./home/bestsale/ of my intranet.

So can I give more weight to HarryPotter.html because it is in ./home/bestsale/?

Or is there any option to change the weight of documents?

It's very important for me to know that!

Thanks for your attention,

Jisay

Re: Weighting on words

Jasper Kamperman
Yes, there are many options. The underlying Lucene index has a field
boost, and you can write your own plugins that add fields, which you
can then use to boost certain documents in query results. For
specifics you'll need to read the documentation or earlier posts on
this list asking more or less the same question :-).
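
As a rough illustration of the kind of rule such a plugin could apply, here is a minimal sketch in plain Java. It does not use the Nutch or Lucene APIs: the class `PathBoostSketch`, its methods, and the multiplier 5.0 are all hypothetical names and values chosen for this example. It only shows the decision "documents under a given directory get a larger boost"; wiring the result into an actual indexing or scoring plugin is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Nutch or Lucene: maps path prefixes
// to boost multipliers that a custom plugin could apply to a document.
class PathBoostSketch {
    private final Map<String, Float> prefixBoosts = new LinkedHashMap<>();

    // Registers a multiplier for every document whose path starts with pathPrefix.
    void addRule(String pathPrefix, float boost) {
        prefixBoosts.put(pathPrefix, boost);
    }

    // Returns the boost of the longest matching prefix,
    // or 1.0f (neutral) when no rule matches.
    float boostFor(String path) {
        float boost = 1.0f;
        int bestLength = -1;
        for (Map.Entry<String, Float> rule : prefixBoosts.entrySet()) {
            String prefix = rule.getKey();
            if (path.startsWith(prefix) && prefix.length() > bestLength) {
                bestLength = prefix.length();
                boost = rule.getValue();
            }
        }
        return boost;
    }

    public static void main(String[] args) {
        PathBoostSketch sketch = new PathBoostSketch();
        sketch.addRule("/home/bestsale/", 5.0f); // assumed multiplier, for illustration only
        System.out.println(sketch.boostFor("/home/bestsale/HarryPotter.html"));
        System.out.println(sketch.boostFor("/home/other/HarryPotter.html"));
    }
}
```

Choosing the longest matching prefix lets a more specific directory override a general one; that is a design choice of this sketch, not something Nutch prescribes.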

On Feb 27, 2008, at 7:46 AM, Jean-Christophe Alleman wrote:

> Is it possible with Nutch to change the weight of a word?
> [...]
> Or is there any option to change the weight of documents?

Re: Weighting on words

Dennis Kubes-2
There is a field boost, but currently Nutch overrides all field boosts
with a document boost, which is the score calculated by the scoring
filters. So currently, setting field boosts in indexing filter plugins
won't have any effect.

Dennis

Jasper Kamperman wrote:

> Yes, there are many options. There is a field Boost in the underlying
> Lucene index and you can write your own plugins that add fields which
> you can then use to boost certain documents in query results. For
> specifics you'll need to read the documentation or earlier posts in this
> list asking more or less the same question :-).
Re: Weighting on words

Andrzej Białecki-2
Dennis Kubes wrote:
> There is a field boost, but currently Nutch overrides all field boosts
> with a document boost, which is the score calculated by the scoring
> filters. So currently, setting field boosts in indexing filter plugins
> won't have any effect.

This is not entirely correct: field boosts are multiplied by the
document boost, so if you set different field boosts, matches in
different fields will be scored differently.
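
To make the arithmetic concrete, here is a small sketch of that multiplication in plain Java. It is not Nutch or Lucene code: the class `BoostProductSketch`, the method `effectiveBoost`, and the numbers are invented for illustration, and the real score also includes other factors that are left out here.

```java
// Hypothetical illustration, NOT Nutch or Lucene code: shows that a
// per-field boost still matters when it is multiplied by a document boost.
class BoostProductSketch {

    // Combined boost for a match in one field of one document.
    static float effectiveBoost(float documentBoost, float fieldBoost) {
        return documentBoost * fieldBoost;
    }

    public static void main(String[] args) {
        float documentBoost = 2.0f;     // assumed score from the scoring filters
        float titleFieldBoost = 4.0f;   // assumed boost set on a "title" field
        float contentFieldBoost = 1.0f; // assumed neutral boost on a "content" field

        // Same document, different fields: the products differ,
        // so matches in the two fields are weighted differently.
        System.out.println("title:   " + effectiveBoost(documentBoost, titleFieldBoost));
        System.out.println("content: " + effectiveBoost(documentBoost, contentFieldBoost));
    }
}
```

With these assumed values a match in the title field carries four times the boost of a match in the content field, even though both share the same document boost.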


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com