Query not finding indexed data

Query not finding indexed data

adb
Hi,

I have a field "attname" that is indexed with Field.Store.YES,
Field.Index.UN_TOKENIZED.  I have a document with the attname of
"IqTstAdminGuide2.pdf".

QueryParser parser = new QueryParser("body", new StandardAnalyzer());
Query query = parser.parse("attname:IqTstAdminGuide2.pdf");

fails to find the Document, which I guess is because of StandardAnalyzer
lowercasing the filename.

How can one instruct the QueryParser only to use the Analyzer to analyse fields
in an expression that were tokenized during the indexing process and to not
analyse those that were UN_TOKENIZED?

Regards
Antony



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Query not finding indexed data

Doron Cohen
Hi Antony, you cannot instruct the query parser to do that. Note that an
application can add both tokenized and un_tokenized data under the same
field name. It is up to the application logic to know that a certain query is
not to be tokenized. In this case you could create your query with:
  query = new TermQuery(new Term(fieldName, "IqTstAdminGuide2.pdf"));

Hope this helps,
Doron
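
To illustrate why the parsed query misses while an exact term lookup hits, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class ExactTermDemo and its methods are hypothetical names, and the "analysis" is reduced to lowercasing, which is the effect Antony suspected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical names, not Lucene classes.
class ExactTermDemo {
    // Maps "field:term" to the ids of documents containing that exact term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index a value as one unchanged term, as an untokenized field would be.
    void addUntokenized(int docId, String field, String value) {
        postings.computeIfAbsent(field + ":" + value, k -> new HashSet<>()).add(docId);
    }

    // Exact lookup with no analysis, standing in for a term query.
    Set<Integer> termLookup(String field, String term) {
        return postings.getOrDefault(field + ":" + term, new HashSet<>());
    }

    // Lookup that lowercases the text first, standing in for an analyzing parser.
    Set<Integer> analyzedLookup(String field, String text) {
        return termLookup(field, text.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ExactTermDemo index = new ExactTermDemo();
        index.addUntokenized(1, "attname", "IqTstAdminGuide2.pdf");
        // The lowercased term does not equal the stored mixed-case term.
        System.out.println(index.analyzedLookup("attname", "IqTstAdminGuide2.pdf"));
        // The unmodified term matches the stored term.
        System.out.println(index.termLookup("attname", "IqTstAdminGuide2.pdf"));
    }
}
```

The point of the sketch is only that term matching is exact: whatever transformation is applied at query time must produce the same term that was stored at index time.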

Antony Bowesman <[hidden email]> wrote on 15/10/2006 20:08:37:

> Hi,
>
> I have a field "attname" that is indexed with Field.Store.YES,
> Field.Index.UN_TOKENIZED.  I have a document with the attname of
> "IqTstAdminGuide2.pdf".
>
> QueryParser parser = new QueryParser("body", new StandardAnalyzer());
> Query query = parser.parse("attname:IqTstAdminGuide2.pdf");
>
> fails to find the Document, which I guess is because of StandardAnalyzer
> lowercasing the filename.
>
> How can one instruct the QueryParser only to use the Analyzer to analyse fields
> in an expression that were tokenized during the indexing process and to not
> analyse those that were UN_TOKENIZED?
>
> Regards
> Antony

Re: Query not finding indexed data

adb
Doron Cohen wrote:
> Hi Antony, you cannot instruct the query parser to do that. Note that an

Thanks, I suspected as much.  I've changed it to make the field tokenized.

> field name. It is up to the application logic to know that a certain query is
> not to be tokenized. In this case you could create your query with:
>   query = new TermQuery(new Term(fieldName, "IqTstAdminGuide2.pdf"));

The query is user driven, so I can't know without parsing whether it should be
tokenised or not.  I would have to extend the parser to make use of TermQuery -
it's easier just to tokenize the field now I understand Lucene's behaviour.

Regards
Antony



Re: Query not finding indexed data

Erik Hatcher

On Oct 16, 2006, at 2:44 AM, Antony Bowesman wrote:

> Doron Cohen wrote:
>> Hi Antony, you cannot instruct the query parser to do that. Note  
>> that an
>
> Thanks, I suspected as much.  I've changed it to make the field  
> tokenized.
>
>> field name. It is up to the application logic to know that a certain
>> query is not to be tokenized. In this case you could create your query with:
>>   query = new TermQuery(new Term(fieldName, "IqTstAdminGuide2.pdf"));
>
> The query is user driven, so I can't know without parsing whether  
> it should be tokenised or not.  I would have to extend the parser  
> to make use of TermQuery - it's easier just to tokenize the field  
> now I understand Lucene's behaviour.

You can also use PerFieldAnalyzerWrapper as the analyzer for  
QueryParser, and for all your untokenized fields, specify a  
KeywordAnalyzer.  That will keep untokenized fields from being split  
(as best it can given QueryParser meta-syntax).

        Erik
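
The per-field idea can be sketched in plain Java as follows. This is a toy model under stated assumptions, not the Lucene API: PerFieldAnalysisDemo, DEFAULT and KEYWORD are hypothetical names, the default analysis is simplified to lowercasing plus whitespace splitting, and the keyword analysis returns the whole value as a single unchanged term.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model only: hypothetical names, not the Lucene API.
class PerFieldAnalysisDemo {
    // Simplified default analysis: lowercase, then split on whitespace.
    static final Function<String, List<String>> DEFAULT =
        text -> Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));

    // Keyword analysis: the whole value becomes one unchanged term.
    static final Function<String, List<String>> KEYWORD =
        text -> Collections.singletonList(text);

    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    // Register a field-specific analyzer that overrides the default.
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Use the analyzer registered for the field, falling back to the default.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, DEFAULT).apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisDemo wrapper = new PerFieldAnalysisDemo();
        wrapper.addAnalyzer("attname", KEYWORD);
        // The untokenized field keeps its case and stays a single term.
        System.out.println(wrapper.analyze("attname", "IqTstAdminGuide2.pdf"));
        // Any other field goes through the default analysis.
        System.out.println(wrapper.analyze("body", "Admin Guide"));
    }
}
```

With this dispatch in place, a user-driven query can still go through one parser, and only the fields registered as keyword fields skip the default analysis.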


