Problems with synonyms

Leonardo Dias-5
Hello there. How are you guys?

We're having problems with synonyms here, and I thought that maybe you
guys could help us understand how Solr handles them.

The problem is the following: I'd like to set up a synonym like "dba,
database administrator".

Instead of increasing the number of results for the keyword "dba", the
synonym made the result set smaller: it only brought back results that
contain both "dba" and "database administrator" at the same time,
instead of results containing either one, as expected since our synonym
configuration is using expand=true.

Since this is not how it behaved in the past, I'd like to know
whether something changed in the Solr/Lucene internals so that this
functionality is now lost, or if I'm doing something wrong with my setup.

Currently all fields pass through the Synonym filter factory. The
analysis shows me that it tries to search for database administrator and
DBA. A debug query also shows me that the query it's trying to do is
something like this:

+DisjunctionMaxQuery((title:"(dba datab) administr")~0.1)
DisjunctionMaxQuery((title:"(dba datab) administr"^100000.0 |
    observation:"(dba datab) administr"^10.0 |
    description:"(dba datab) administr"^10.0 |
    company:"(dba datab) administr")~0.1)

The problem is: when I search for this, I get 5 results. When I search
for dba only, without the "dba, database administrator" line in the
synonyms.txt file, I get more than 100 results.

Do you guys know why this is happening?

Thank you,

Leonardo

Re: Problems with synonyms

Vernon Chapman

Leonardo,

I am no expert, but I would check to make sure that the
defaultOperator parameter in your schema.xml file is set to
OR rather than AND.

Vernon

On 3/31/09 3:24 PM, "Leonardo Dias" <[hidden email]> wrote:

> Instead of increasing the number of results for the keyword "dba", the
> results got smaller and it only brought me back results that had both
> the keywords "dba" and "database administrator" at the same time instead
> of bringing back both "dba" and "database administrator" as expected
> since our synonym configuration is using expand=true.

Re: Problems with synonyms

Leonardo Dias-5
Hi, Vernon!

We tried both approaches: OR and AND. In both cases we got fewer results once the synonym was set up; changing the operator made no difference to the synonym behavior.

Any other ideas? Is it likely to be a bug?

Best,

Leonardo

--

Leonardo Dias
Strategic Processes Manager
[hidden email]
Tel.: (11) 3177.0742
Ext.: 742
www.catho.com.br

Re: Problems with synonyms

Vernon Chapman
Leonardo,

The only other thing I can think of is to check the
field type in the schema.xml file and make sure that you are using the same
filters.

For example, if your index analyzer uses the solr.SynonymFilterFactory
filter, make sure your query analyzer also uses the same filter class.

Other than that I am stuck, hope that helps

Vernon



On 3/31/09 3:39 PM, "Leonardo Dias" <[hidden email]> wrote:

> Hi, Vernon!
>
> We tried both approaches: OR and AND. In both cases, the results were smaller
> when the synonyms was set up, with no change at all when it comes to synonyms.
>
> Any other ideas? Is it likely to be a bug?
>
> Best,
>
> Leonardo

Re: Problems with synonyms

Yonik Seeley-2-2
In reply to this post by Leonardo Dias-5
This is a known limitation of using the SynonymFilter and expanding to
variants of different sizes at query time.  See the notes for
SynonymFilterFactory here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46


-Yonik
http://www.lucidimagination.com
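
To make the limitation concrete, here is a minimal sketch in Python of what appears to be happening, judging from the debug query in the original post. It is not Solr code: the synonym table, the function names and the three sample documents are all made up for illustration, and stemming ("datab", "administr") is ignored. The assumption is that the expanded variants are flattened into consecutive token positions, so "dba" and "database" share the first position and "administrator" takes the second, and the whole thing is then matched as a phrase.

```python
# Hypothetical, simplified model of query-time synonym expansion.
# Mirrors the synonyms.txt line "dba, database administrator" with expand=true.
SYNONYMS = {"dba": [("dba",), ("database", "administrator")]}


def expand_query(tokens):
    """Flatten synonym variants into one set of allowed words per position."""
    positions = []
    for tok in tokens:
        variants = SYNONYMS.get(tok, [(tok,)])
        width = max(len(v) for v in variants)
        slots = [set() for _ in range(width)]
        for variant in variants:
            # Word i of every variant lands in slot i, so a longer variant
            # spills into the positions that follow the original token.
            for i, word in enumerate(variant):
                slots[i].add(word)
        positions.extend(slots)
    return positions


def phrase_match(doc_tokens, positions):
    """True if some window of the document satisfies every position in order."""
    n = len(positions)
    return any(
        all(doc_tokens[i + j] in positions[j] for j in range(n))
        for i in range(len(doc_tokens) - n + 1)
    )


# Made-up sample documents.
docs = {
    1: "senior dba wanted",
    2: "database administrator for oracle",
    3: "dba administrator role",
}

query = expand_query(["dba"])
hits = [doc_id for doc_id, text in docs.items() if phrase_match(text.split(), query)]
print(query)
print(hits)
```

In this model the query for "dba" becomes the two-position phrase (dba OR database) followed by administrator, so document 1, which only says "dba", no longer matches. That would explain getting fewer results with the synonym line than without it.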



Re: Problems with synonyms

Walter Underwood, Netflix
In reply to this post by Vernon Chapman
It looks like you are using synonyms at query time. Don't do that, it
works very strangely. Only use them at index time. That does the right
matching and also gives the right idf for scoring.

More details are here:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46

wunder
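
As a rough illustration of the index-time approach, here is a small Python sketch. It is a toy inverted index, not Solr: the synonym group, the function names and the sample documents are assumptions made for the example, and token positions are not modeled. The idea shown is that every document mentioning any variant is indexed under all variants, so a plain term query for "dba" needs no synonym handling at query time.

```python
from collections import defaultdict

# Hypothetical synonym group, mirroring the line "dba, database administrator".
GROUP = [("dba",), ("database", "administrator")]


def expand_at_index_time(tokens):
    """Return the document's terms, plus all variants if any variant occurs."""
    terms = set(tokens)
    for variant in GROUP:
        n = len(variant)
        found = any(
            tuple(tokens[i:i + n]) == variant
            for i in range(len(tokens) - n + 1)
        )
        if found:
            # One variant is present: index the words of every variant.
            for other in GROUP:
                terms.update(other)
            break
    return terms


def build_index(docs):
    """Build a toy inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in expand_at_index_time(text.lower().split()):
            index[term].add(doc_id)
    return index


# Made-up sample documents.
docs = {
    1: "senior dba wanted",
    2: "database administrator for oracle",
    3: "java developer",
}

index = build_index(docs)
dba_hits = sorted(index["dba"])  # plain term query, no query-time synonyms
print(dba_hits)
```

Here the query for "dba" finds both the document that says "dba" and the one that says "database administrator". The document frequency of "dba" also counts both, which is presumably what is meant by getting the right idf for scoring.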

On 3/31/09 1:04 PM, "Vernon Chapman" <[hidden email]> wrote:

> Leonardo,
>
> The only other thing I can think of is check the
> Field type in the schema.xml file make sure that you are using the same
> filters.
>
> For example if in your index analyzer you use the solr.SynonymFilterFactory
> filter make sure your query analyzer also uses the same filter class.
>
> Other than that I am stuck, hope that helps
>
> Vernon

Re: Problems with synonyms

Mark Ferguson
In reply to this post by Vernon Chapman
It's okay not to use the SynonymFilter for both querying and indexing. In
fact, you would really only want to use one or the other: either index all
synonyms, or query for them, but not both.

I have read that there are issues with multi-word synonyms and my guess is
that this is where your problem is, but my understanding of the issue is
limited. Hopefully someone else can provide more insight.

Mark



Re: Problems with synonyms

Vernon Chapman
In reply to this post by Walter Underwood, Netflix
Walter,

Thanks for clarifying my mistake there.
I wouldn't want to send someone down the wrong path.

Thanks
Vernon


On 3/31/09 4:17 PM, "Walter Underwood" <[hidden email]> wrote:

> It looks like you are using synonyms at query time. Don't do that, it
> works very strangely. Only use them at index time. That does the right
> matching and also gives the right idf for scoring.
>
> More details are here:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46
>
> wunder

RE: Problems with synonyms

dma_bamboo
In reply to this post by Leonardo Dias-5
Hi Leonardo,
 
I've been using the synonym filter at index time (expand=true) and it works just fine. Also use OR as the default operator. Once you do it at index time, there is no point in doing it at query time (which in fact is likely to be the reason for your problems).
 
Have a look at the Wiki page Yonik sent about it.
 
Cheers,
Daniel

