case-insensitive index and queries

case-insensitive index and queries

G.Long
Hi :)

I would like the "text" field of my index to be case-insensitive.
I'm using a PerFieldAnalyzerWrapper with a StandardAnalyzer for this
field, for both indexing and querying. I read that StandardAnalyzer uses
LowerCaseFilter to lowercase the value of the field, but when I run a
query, it doesn't work.

Here is my query:

IndexSearcher isearcher = new IndexSearcher(directory);
BooleanQuery query = new BooleanQuery();
PerFieldAnalyzerWrapper pfaWrapper = getPerFieldAnalyzer();

// key is the field name, value is the user's search string
QueryParser parser = new QueryParser(Version.LUCENE_31, key, pfaWrapper);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query param = parser.parse(value);
query.add(param, BooleanClause.Occur.MUST);

TopFieldCollector collector = TopFieldCollector.create(
        new Sort(SortField.FIELD_DOC), 200000, true, false, false, false);
isearcher.search(query, collector);


The getPerFieldAnalyzer() method looks like:

if (perFieldAnalyzerWrapper == null) {
    perFieldAnalyzerWrapper = new PerFieldAnalyzerWrapper(new KeywordAnalyzer());
    perFieldAnalyzerWrapper.addAnalyzer(FIELD_TEXT, new StandardAnalyzer(Version.LUCENE_31));
    perFieldAnalyzerWrapper.addAnalyzer(FIELD_TITLE, new StandardAnalyzer(Version.LUCENE_31));
}
return perFieldAnalyzerWrapper;

Is there something wrong with this code?

Thank you :)


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: case-insensitive index and queries

Ian Lea
At a glance the code looks OK, but there's a lot you're not showing
that could cause it not to work, whatever you mean by that. Does it fail
to get hits on docs you think are in the index?

Look at the index with Luke to see what actually has been indexed.

Look at Query.toString() to see how the query has been parsed.

Read the bit of the FAQ titled something like "Why are my searches not
working?".
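
To illustrate what the two checks above are looking for, here is a toy sketch in plain Java. It is not Lucene code: the helper names (analyzeLowercase, analyzeKeyword, matches) are hypothetical stand-ins for "an analyzer that lowercases" and "an analyzer that keeps the whole value as one term". The point is only that a hit requires the terms stored in the index to equal the terms produced from the query, so a field indexed through the wrapper's default KeywordAnalyzer would not match a lowercased query term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model only: these helpers are hypothetical and are NOT Lucene classes.
class AnalysisMismatchDemo {

    // Rough stand-in for a lowercasing analyzer: split on non-alphanumerics, then lowercase.
    static List<String> analyzeLowercase(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Rough stand-in for a keyword-style analyzer: the whole value is one unmodified term.
    static List<String> analyzeKeyword(String text) {
        return Collections.singletonList(text);
    }

    // AND semantics: a document matches only if every query term is among its indexed terms.
    static boolean matches(List<String> indexedTerms, List<String> queryTerms) {
        return indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        List<String> query = analyzeLowercase("HELLO");
        // Same analysis on both sides: the terms line up.
        System.out.println(matches(analyzeLowercase("Hello World"), query)); // true
        // Keyword-style indexing, lowercased query: the terms differ.
        System.out.println(matches(analyzeKeyword("Hello World"), query));   // false
    }
}
```

If the terms Luke shows for a field and the terms in Query.toString() differ the way the second case does, the analyzer used at index time is the first thing to check.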


--
Ian.


On Wed, Nov 7, 2012 at 3:50 PM, G.Long <[hidden email]> wrote:

> [...]

Re: case-insensitive index and queries

G.Long
Thank you for the tips. I looked at the index and the query and nothing
seemed to be wrong. Then I realized that someone had put a condition in
the code that runs after getting the results of the query. This condition
removed docs which did not contain the exact words of the query, and it
was case-sensitive u_u.

Problem solved :)
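
For anyone hitting the same thing, here is a minimal sketch of what a case-insensitive version of such a post-filter could look like. The real condition is not shown in this thread, so everything here is an assumption: the class and method names are hypothetical, hits are represented as plain strings, and the check is simple substring containment rather than token matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical post-filter: keeps only the hits whose text contains every query word.
class CaseInsensitivePostFilter {

    // Case-sensitive check, roughly how the problematic condition is assumed to have behaved.
    static boolean containsAllExact(String text, String[] words) {
        for (String word : words) {
            if (!text.contains(word)) {
                return false;
            }
        }
        return true;
    }

    // Case-insensitive check: lowercase both sides with a fixed locale before comparing.
    static boolean containsAllIgnoreCase(String text, String[] words) {
        String lowered = text.toLowerCase(Locale.ROOT);
        for (String word : words) {
            if (!lowered.contains(word.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    // Applies the case-insensitive check to every hit and returns the ones that pass.
    static List<String> filter(List<String> hits, String query) {
        String[] words = query.trim().split("\\s+");
        List<String> kept = new ArrayList<>();
        for (String hit : hits) {
            if (containsAllIgnoreCase(hit, words)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hits = new ArrayList<>();
        hits.add("Lucene In Action");
        hits.add("Solr Cookbook");
        System.out.println(filter(hits, "lucene action"));
    }
}
```

Using Locale.ROOT keeps the lowercasing independent of the JVM's default locale. Whether such a post-filter is needed at all once the query itself uses AND_OPERATOR is a separate question.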



On 07/11/2012 at 17:09, Ian Lea wrote:

> [...]

