weird problem with letters S and T

Joel Nylund
(I am super new to Solr, sorry if this is an easy one.)

Hi, I want to support an A-Z type view of my data.

I have a DataImportHandler that uses SQL. My query is complex, but the
part that matters is:

SELECT f.id, f.title, LEFT(f.title,1) as firstLetterTitle FROM Foo f
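
For context, a query like this would typically sit in the DataImportHandler's data-config.xml. The following is only a minimal sketch of such a file: the JDBC driver, URL, credentials, and entity name are placeholders and are not taken from the post; only the query and the three column names come from it.

```xml
<dataConfig>
  <!-- Placeholder connection details; LEFT() suggests MySQL or SQL Server, but that is a guess -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="db_user"
              password="db_password"/>
  <document>
    <!-- One Solr document per row returned by the query -->
    <entity name="foo"
            query="SELECT f.id, f.title, LEFT(f.title,1) AS firstLetterTitle FROM Foo f">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="firstLetterTitle" name="firstLetterTitle"/>
    </entity>
  </document>
</dataConfig>
```

How each of these fields is analyzed is decided by its field type in schema.xml, not by this file.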

I can create this index with no issues.

I can query the title with no problem:

http://localhost:8983/solr/select?q=title:super

I can query the first letters mostly with no problem:

http://localhost:8983/solr/select?q=firstLetterTitle:a

This returns all the Foos with the first letter a.

This actually works with every letter except S and T.

If I query those, I get no results. The weird thing is that if I run the
title query above with "Super", I get lots of results, and the XML shows
the firstLetterTitle for those to be "S":

<doc>
<str name="firstLetterTitle">S</str>
<str name="id">84861348</str>
<str name="title">Super Cool</str>
</doc>

<doc>
<str name="firstLetterTitle">S</str>
<str name="id">108692</str>
<str name="title">Super 45</str>
</doc>

etc.

Any ideas? Are S and T special characters in Solr queries?

Here is the response from the s query with debugQuery=true:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">24</int>
    <lst name="params">
      <str name="q">firstLetterTitle:s</str>
      <str name="debugQuery">true</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="debug">
    <str name="rawquerystring">firstLetterTitle:s</str>
    <str name="querystring">firstLetterTitle:s</str>
    <str name="parsedquery"/>
    <str name="parsedquery_toString"/>
    <lst name="explain"/>
    <str name="QParser">OldLuceneQParser</str>
    <lst name="timing">
      <double name="time">2.0</double>
      <lst name="prepare">
        <double name="time">1.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent">
          <double name="time">1.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.FacetComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.DebugComponent">
          <double name="time">0.0</double>
        </lst>
      </lst>
      <lst name="process">
        <double name="time">0.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.FacetComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.DebugComponent">
          <double name="time">0.0</double>
        </lst>
      </lst>
    </lst>
  </lst>
</response>



thanks
Joel

RE: weird problem with letters S and T

bernieh
Hi Joel, I had a similar issue the other day; in my case the solution turned out to be that the letters were stopwords. Don't know if this is your answer, but worth checking.
Bern
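
For readers unfamiliar with the mechanism: stop words are removed by a StopFilterFactory in the analyzer chain of a field's type in schema.xml. Below is a rough sketch of what such a field type can look like. The type name `text` and the exact filter list are assumptions about Joel's setup, not something shown in the thread.

```xml
<!-- Sketch of an analyzed text type; the name and filter order are assumed -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Drops every token listed in stopwords.txt, both when indexing and when querying -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If stopwords.txt lists s and t, a query such as firstLetterTitle:s is analyzed down to nothing, which would be consistent with the empty parsedquery in the debug output of the original post.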

-----Original Message-----
From: Joel Nylund [mailto:[hidden email]]
Sent: Thursday, 29 October 2009 9:17 AM
To: [hidden email]
Subject: weird problem with letters S and T


Re: weird problem with letters S and T

Joel Nylund
Thanks Bern. Now that you mention it, they are in there. I assume that if I
remove them it will work, but I probably don't want to do that, right?

Is there a way for this particular query to ignore stopwords?

thanks
Joel

On Oct 28, 2009, at 6:20 PM, Bernadette Houghton wrote:

> Hi Joel, I had a similar issue the other day; in my case the  
> solution turned out to be that the letters were stopwords. Don't  
> know if this is your answer, but worth checking.
> Bern

Re: weird problem with letters S and T

Martijn v Groningen
I don't think that is a problem, because you are only storing one
character per field. There are other text field types that do not have
the stop word filter, so give your first-letter field one of those types.
That way the stop word filter is disabled only for searches on
the first-letter field.

Cheers,

Martijn
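
A minimal sketch of what this suggestion could look like in schema.xml. The type name `first_letter` is hypothetical, and the choice of a keyword tokenizer plus lowercasing is one possible setup rather than the one Martijn necessarily had in mind.

```xml
<!-- In the <types> section: hypothetical type with no StopFilterFactory -->
<fieldType name="first_letter" class="solr.TextField">
  <analyzer>
    <!-- Keep the whole value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase at index and query time so that a query for s matches an indexed S -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- In the <fields> section: point the first-letter field at the new type -->
<field name="firstLetterTitle" type="first_letter" indexed="true" stored="true"/>
```

Because analysis also runs at index time, documents indexed under the old type need to be re-imported before the change shows up in query results.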

2009/10/28 Joel Nylund <[hidden email]>:

> Thanks Bern. Now that you mention it, they are in there. I assume that if I
> remove them it will work, but I probably don't want to do that, right?
>
> Is there a way for this particular query to ignore stopwords?
>
> thanks
> Joel

Re: weird problem with letters S and T

Joel Nylund
Well, I tried removing those two letters from the stopwords, which didn't
seem to help. I also tried changing the field type to "text_ws", which
didn't seem to work either. Any other ideas?

thanks
Joel

On Oct 28, 2009, at 6:42 PM, Martijn v Groningen wrote:

> I don't think that is a problem, because you are only storing one
> character per field. There are other text field types that do not have
> the stop word filter, so give your first-letter field one of those types.
> That way the stop word filter is disabled only for searches on
> the first-letter field.
>
> Cheers,
>
> Martijn

RE: weird problem with letters S and T

bernieh
In reply to this post by Joel Nylund
Hi Joel, I'm a relative beginner to Solr myself. I think "s" and "t" are probably in the stopwords list because a lot of them result from the analysis of words such as "don't" and "person's". Whether that's (hey, another example!) an issue for you will probably depend on which analysers you're using.

If you want that query to ignore stopwords, you might have to index that field twice, once with stopwords and once without.
bern
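
One way to index the same value twice is a second field fed by copyField. This is only a sketch: the field name `firstLetterTitleExact` is hypothetical, and it assumes the schema defines a `text` type with the stop word filter and a `string` type backed by solr.StrField.

```xml
<!-- In the <fields> section of schema.xml -->
<!-- Existing field, analyzed with the stop word filter (type name assumed) -->
<field name="firstLetterTitle" type="text" indexed="true" stored="true"/>
<!-- Hypothetical second field: StrField does no analysis, so nothing is dropped -->
<field name="firstLetterTitleExact" type="string" indexed="true" stored="false"/>

<!-- Directly under <schema>: copy the raw value into the second field at index time -->
<copyField source="firstLetterTitle" dest="firstLetterTitleExact"/>
```

With a setup like this, the A-Z view would query firstLetterTitleExact. Since a string field is matched exactly, the query would need the same case as the stored value (S rather than s).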

-----Original Message-----
From: Joel Nylund [mailto:[hidden email]]
Sent: Thursday, 29 October 2009 9:31 AM
To: [hidden email]
Subject: Re: weird problem with letters S and T

Thanks Bern. Now that you mention it, they are in there. I assume that if I
remove them it will work, but I probably don't want to do that, right?

Is there a way for this particular query to ignore stopwords?

thanks
Joel


RE: weird problem with letters S and T

bernieh
In reply to this post by Joel Nylund
Joel, did you restart Tomcat? You need to restart each time you change schema.xml.
bern

-----Original Message-----
From: Joel Nylund [mailto:[hidden email]]
Sent: Thursday, 29 October 2009 10:21 AM
To: [hidden email]
Subject: Re: weird problem with letters S and T

Well, I tried removing those two letters from the stopwords, which didn't
seem to help. I also tried changing the field type to "text_ws", which
didn't seem to work either. Any other ideas?

thanks
Joel


Re: weird problem with letters S and T

Dave Searle
Or just reload the app pool. No need to restart the whole server.

On 28 Oct 2009, at 23:23, "Bernadette Houghton" <[hidden email]> wrote:

> Joel, did you restart Tomcat? You need to restart each time you change
> schema.xml.
> bern
>
> -----Original Message-----
> From: Joel Nylund [mailto:[hidden email]]
> Sent: Thursday, 29 October 2009 10:21 AM
> To: [hidden email]
> Subject: Re: weird problem with letters S and T
>
> Well I tried removing those 2 letters from stopwords, didnt seem to
> help, I also tried changing the field type to "text_ws", didnt seem to
> work. Any other ideas?
>
> thanks
> Joel
>

Re: weird problem with letters S and T

Avlesh Singh
In reply to this post by Joel Nylund
>
> Any ideas, are S and T special chars in query for solr?
>
Nope, they are NOT. My guess is that:

   - You are using a "text" type field for firstLetterTitle, which has the
   stopword filter applied to it.
   - Your "stopwords.txt" file contains the characters "s" and "t", so the
   filter "eats" them both while indexing and while searching.

If the above assumptions are correct, then there are two ways to fix it:

   - Remove the characters "s" and "t" from your stopwords.txt file and do a
   re-index. Searches should work fine after that.
   - For this particular use-case, you can keep your firstLetterTitle field
   as a "string" type untokenized field. You will not have to worry about
   stopwords in that case.
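
As a rough sketch of the second option (not taken from Joel's actual schema, which the thread does not show), the field could be declared in schema.xml along these lines. The "string" type is assumed to match the stock example schema, and "letter_lc" is a hypothetical name for a case-insensitive alternative. Note that a solr.StrField is matched exactly, so with the value indexed as "S" the query would need to be firstLetterTitle:S rather than firstLetterTitle:s.

```xml
<!-- Option A: untokenized string field; matching is exact and case-sensitive -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<field name="firstLetterTitle" type="string" indexed="true" stored="true"/>

<!-- Option B (hypothetical type name "letter_lc"): the whole value is kept as
     one token and lowercased, with no stopword filter in the chain -->
<fieldType name="letter_lc" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<!-- To pick option B, set type="letter_lc" on firstLetterTitle instead of "string" -->
```

Either way, a change to the field's type only takes effect for documents indexed after the change, so a re-index is still needed.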

Cheers
Avlesh


Re: weird problem with letters S and T

Norberto Meijome-6
In reply to this post by Joel Nylund
On Wed, 28 Oct 2009 19:20:37 -0400
Joel Nylund <[hidden email]> wrote:

> Well, I tried removing those 2 letters from stopwords; it didn't seem to
> help. I also tried changing the field type to "text_ws"; that didn't seem to
> work either. Any other ideas?


Hi Joel,
If your stop word filter was applied at index time, you will have to reindex (at least those documents with S and T).

If your stop filter was *only* applied at query time, then it should work after you reload your app.
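
To illustrate the distinction, here is a minimal sketch of a field type with separate index-time and query-time analyzers. It is modeled loosely on the stock "text" type and is only an assumption about what Joel's schema might look like, since the thread never shows it. The stop filter appears in both chains here, which is the case where a re-index is required after editing stopwords.txt.

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <!-- Applied when documents are indexed: tokens dropped here never reach the index -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Applied to the query string: tokens dropped here leave an empty parsed query -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

An empty parsedquery in the debug output, as in the original post, is consistent with the query-side filter removing the single-letter term.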

b

_________________________
{Beto|Norberto|Numard} Meijome

"Those who do not remember the past are condemned to repeat it."
   George Santayana

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.

Re: weird problem with letters S and T

Michel Bottan
Hi Joel,

If you intend to query for titles that start with specific letters, I
have another solution which seems easier, since you don't need a
separate field for the first letter.

1. Create a new type in your schema.xml using the following analyzer

    <fieldType name="text_sort" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-zA-Z0-9])" replacement="" replace="all"/>
      </analyzer>
    </fieldType>

2. Create a copy field from its original

    <field name="title_sort" type="text_sort" indexed="true" stored="false"/>

    <copyField source="title" dest="title_sort"/>

3. Use a filter query to filter

e.g. &fq=title_sort:[a TO b]&sort=title_sort asc (this range covers titles starting with A; widen the upper bound, e.g. [a TO o], for titles starting with A through N)


4. Read field value for presentation from the original field

Cheers!
Michel Bottan


Re: weird problem with letters S and T

Joel Nylund
Hey everyone, thanks for the help. It seems to be working this morning after
a restart & reindex (maybe I was just too sleepy last night), using a
field type of text_ws.
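
For reference, a sketch of what text_ws looks like in the stock example schema; this assumes the schema in use here still has the stock definition, which the thread does not confirm. It only splits on whitespace, so there is no stopword filter to drop "s" or "t". It also has no lowercase filter, so matching is case-sensitive.

```xml
<!-- Whitespace-only analysis: no stopword removal, no lowercasing -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<field name="firstLetterTitle" type="text_ws" indexed="true" stored="true"/>
```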

I'm curious about the pros and cons of Michel's approach below; it
seems like another good way to do it. Is there any difference in terms
of performance and/or index size, or anything else I need to worry
about? My index will have about 3 million records in prod; I'm testing
with 300k (1/10 scale) now and it seems fine.

thanks
Joel

On Oct 29, 2009, at 8:09 AM, Michel Bottan wrote:

> Hi Joel,
>
> If you intend to query for titles that start with specific letters, I
> have another solution which seems easier, since you don't need a
> separate field for the first letter.
>
> 1. Create a new type in your schema.xml using the following analyzer
>
>    <fieldType name="text_sort" class="solr.TextField" positionIncrementGap="100">
>      <analyzer>
>        <tokenizer class="solr.KeywordTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.ISOLatin1AccentFilterFactory"/>
>        <filter class="solr.TrimFilterFactory"/>
>        <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-zA-Z0-9])" replacement="" replace="all"/>
>      </analyzer>
>    </fieldType>
>
> 2. Create a copy field from its original
>
>    <field name="title_sort" type="text_sort" indexed="true" stored="false"/>
>
>    <copyField source="title" dest="title_sort"/>
>
> 3. Use a filter query to filter
>
> e.g. &fq=title_sort:[a TO b]&sort=title_sort asc (this range covers titles
> starting with A; widen the upper bound, e.g. [a TO o], for titles starting
> with A through N)
>
>
> 4. Read field value for presentation from the original field
>
> Cheers!
> Michel Bottan
>