Weird Problem with Lucene


Weird Problem with Lucene

Urs Eichmann-2
My index consists of about 26 fields. I have a very weird problem: On
certain fields, I cannot search - i.e. the search always returns 0
documents. I used Luke's Lucene Index Toolbox, and the behaviour there is
weird as well:
 
I do the following in Luke's Program:
 
a) go to the Documents Tab
b) Enter term field-name: unit and value="DOSE", hit "Show all docs"
c) A list of 5 documents is displayed, which is ok. The query is unit:DOSE.
The parsed query is unit:DOSE and the rewritten query is unit:dose
d) Then I just hit the "Search" button without changing the query
e) now the result list is empty. The only difference I can see is that the
parsed query is unit:dose now instead of unit:DOSE.
 
Does anyone have an explanation for this behaviour? The problem is, the same
behaviour is in my program, e.g. if I look for "unit:DOSE", I will get no
documents returned. However, on many of the other 26 fields, it runs OK, and
I can't see any difference in the field definitions.
 
I had this problem in 1.4.3, changed now to 1.9 RC1, but the problem is
still the same.
 
Many thanks for any help!
Urs
 

Re: Weird Problem with Lucene

JM Tinghir
Hi,

Actually I have the same problem. Queries are working on a few fields
but not all of them, although the index is ok (checked it with luke).

But I have no idea how to solve that...


Jean-Marie Tinghir


2005/6/22, Urs Eichmann <[hidden email]>:


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Weird Problem with Lucene

Erik Hatcher
Please send us a (short and sweet) example that demonstrates this  
issue - preferably using RAMDirectory in a way that is easily  
runnable by someone else.

Thanks,
     Erik


On Jun 22, 2005, at 3:36 PM, JM Tinghir wrote:





Re: Weird Problem with Lucene

Chris Hostetter-3
In reply to this post by Urs Eichmann-2
: I do the following in Luke's Program:

I have to confess, I've never actually gotten around to using Luke, but
if I understand what you're saying, and if Luke is doing what I think it
is, then I believe your problem is an Analyzer issue...

: b) Enter term field-name: unit and value="DOSE", hit "Show all docs"
: c) A list of 5 documents is displayed, which is ok. The query is unit:DOSE.
: The parsed query is unit:DOSE and the rewritten query is unit:dose

I'm assuming that when you enter that info, Luke is doing a strict
TermDocs lookup for Term("unit","DOSE") and finding your docs.  Then maybe
it's making a TermQuery out of that Term, showing you the toString of that
query, and the toString of the query it gets from the QueryParser ... but
it looks like the QueryParser is using an Analyzer that lowercases your
term (making it "dose" instead of "DOSE").

When you do a search on this new TermQuery, you get nothing, because
that's not the actual term in your index.
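
The mismatch described here can be illustrated with a toy model. This is only a sketch in plain Java, not Lucene code: the class `AnalyzerMismatchSketch` and its methods `addUntokenized`, `termLookup`, and `parsedSearch` are hypothetical names made up for illustration, and it assumes the `unit` field was indexed without analysis, so the stored term is exactly "DOSE".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy model only: plain Java collections, not the Lucene API.
final class AnalyzerMismatchSketch {

    // term ("field:text") -> ids of the documents that contain it
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Store a value verbatim, the way an untokenized field keeps its text (assumption).
    void addUntokenized(int docId, String field, String value) {
        postings.computeIfAbsent(field + ":" + value, k -> new ArrayList<>()).add(docId);
    }

    // Exact term lookup: the text is used as-is, no analysis.
    List<Integer> termLookup(String field, String text) {
        return postings.getOrDefault(field + ":" + text, new ArrayList<>());
    }

    // Parsed search: the query text passes through an analyzer before the lookup.
    List<Integer> parsedSearch(String field, String text, UnaryOperator<String> analyzer) {
        return termLookup(field, analyzer.apply(text));
    }

    public static void main(String[] args) {
        AnalyzerMismatchSketch index = new AnalyzerMismatchSketch();
        for (int doc = 0; doc < 5; doc++) {
            index.addUntokenized(doc, "unit", "DOSE");
        }
        UnaryOperator<String> lowercasing = s -> s.toLowerCase(Locale.ROOT);
        UnaryOperator<String> casePreserving = UnaryOperator.identity();

        // Exact lookup finds the five documents.
        System.out.println(index.termLookup("unit", "DOSE").size());                   // 5
        // A lowercasing analyzer turns the query into unit:dose, which is not in the index.
        System.out.println(index.parsedSearch("unit", "DOSE", lowercasing).size());    // 0
        // A case-preserving analyzer leaves the query as unit:DOSE.
        System.out.println(index.parsedSearch("unit", "DOSE", casePreserving).size()); // 5
    }
}
```

In this model the index itself is fine in all three cases; only the text handed to the lookup differs, which matches the "5 documents, then 0 documents" pattern reported from Luke.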

: Does anyone have an explanation for this behaviour? The problem is, the same
: behaviour is in my program, e.g. if I look for "unit:DOSE", I will get no
: documents returned. However, on many of the other 26 fields, it runs OK, and
: I can't see any difference in the field definitions.

Do you see common behavior for all fields which are (non-)tokenized?

If so, then like I said: analyzer.  Your query parser is probably not
using the analyzer you want it to.


-Hoss




RE: Weird Problem with Lucene

Urs Eichmann-2
In reply to this post by Urs Eichmann-2
First, sorry for the double post. I had problems subscribing to the
mailing list and thought my first message hadn't gone through.

Thank you, Chris and the others, for your valuable tips. It was indeed a
problem with the Analyzer. I used the SimpleAnalyzer and thought from the
docs that it would change uppercase to lowercase. It seems like it didn't do
that. I changed the Analyzer to WhitespaceAnalyzer, and now it seems to work
OK. I don't completely understand why it didn't work before, but now it
does, and that is all that counts...
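
One possible explanation for why the switch helped, assuming the `unit` values were stored in the index verbatim as "DOSE": according to the Lucene documentation, SimpleAnalyzer splits text at non-letters and lowercases it, while WhitespaceAnalyzer splits only at whitespace and leaves case alone. The sketch below re-implements those two documented behaviours in plain Java so the difference is visible. `TokenizerSketch`, `simpleStyle`, and `whitespaceStyle` are hypothetical names, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy re-implementation in plain Java; these are not Lucene classes.
final class TokenizerSketch {

    // Mimics the documented SimpleAnalyzer behaviour: split at non-letters, then lowercase.
    static List<String> simpleStyle(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Mimics the documented WhitespaceAnalyzer behaviour: split at whitespace only, keep case.
    static List<String> whitespaceStyle(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(simpleStyle("DOSE"));        // [dose]
        System.out.println(whitespaceStyle("DOSE"));    // [DOSE]
        System.out.println(simpleStyle("5 mg/ML"));     // [mg, ml]
        System.out.println(whitespaceStyle("5 mg/ML")); // [5, mg/ML]
    }
}
```

Under that assumption, a query parser using the lowercasing style asks for unit:dose and misses the stored term DOSE, while the whitespace style asks for unit:DOSE and matches. The trade-off is that searches on such a field become case-sensitive.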

Urs
 

-----Original Message-----
From: Chris Hostetter [mailto:[hidden email]]
Sent: Thursday, June 23, 2005 1:02 AM
To: '[hidden email]'
Subject: Re: Weird Problem with Lucene


