Searching an untokenized field using SnowballAnalyzer

Lorenzo Di Gaetano-2
Hi all,

I have the following problem. I use SnowballAnalyzer to index Documents
containing tokenized and untokenized fields. But when I try to search a
document using one of the untokenized fields (usually keywords and
unique identifiers) it doesn't find anything...

A simple code example:

doc.add(new Field("car","ferrari",Field.Store.NO,Field.Index.UN_TOKENIZED));

when I try to search it using the following search strings:

car:ferrari

or

car:"ferrari"

it finds nothing.

If I use StandardAnalyzer instead of SnowballAnalyzer it finds the
Document correctly! Even though the field name and the field value are
lowercase, there seems to be a problem querying untokenized
fields using SnowballAnalyzer. The only way I can find my "car"
field is by using TermQuery objects, but I absolutely need to make complex
queries on multiple field values at once.

Please help me! Thank you in advance.

Lorenzo

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Searching an untokenized field using SnowballAnalyzer

Chris Hostetter-3
: doc.add(new Field("car","ferrari",Field.Store.NO,Field.Index.UN_TOKENIZED));
:
: when I try to search it using the following search strings:
:
: car:ferrari

: it finds nothing.

The IndexWriter knew that the "car" field was UN_TOKENIZED, but the
QueryParser doesn't: you've told it every query should be processed with
the SnowballAnalyzer. (Take a look at query.toString() to see what I
mean.)

Try telling the QueryParser to use a PerFieldAnalyzerWrapper with the
KeywordAnalyzer configured for the fields you left UN_TOKENIZED and see if
that helps.
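
The idea of choosing an analyzer per field can be sketched without
Lucene on the classpath. The class below is only a toy model: the
"stemmer" strips one trailing vowel and is NOT the Snowball algorithm,
the class and method names are made up for illustration, and the
"description" field is a hypothetical tokenized field that does not
appear in the original question. It shows only the dispatch rule: a
field with a registered analyzer uses it, every other field falls back
to the default.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy model of per-field analysis; not Lucene's PerFieldAnalyzerWrapper.
final class PerFieldAnalysisDemo {
    // Toy stand-in for a stemming analyzer (assumption: NOT the Snowball algorithm).
    static final UnaryOperator<String> TOY_STEMMER = term -> {
        String t = term.toLowerCase();
        boolean endsInVowel = t.length() > 3
                && "aeiou".indexOf(t.charAt(t.length() - 1)) >= 0;
        return endsInVowel ? t.substring(0, t.length() - 1) : t;
    };

    // Stand-in for keyword-style analysis: the term passes through unchanged.
    static final UnaryOperator<String> KEYWORD = term -> term;

    private final UnaryOperator<String> defaultAnalyzer;
    private final Map<String, UnaryOperator<String>> perField = new HashMap<>();

    PerFieldAnalysisDemo(UnaryOperator<String> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a field-specific analyzer that overrides the default.
    void addAnalyzer(String field, UnaryOperator<String> analyzer) {
        perField.put(field, analyzer);
    }

    // Analyze a query term with the analyzer chosen for its field.
    String analyze(String field, String term) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(term);
    }

    public static void main(String[] args) {
        PerFieldAnalysisDemo analysis = new PerFieldAnalysisDemo(TOY_STEMMER);
        analysis.addAnalyzer("car", KEYWORD); // "car" was indexed untokenized

        System.out.println("car:" + analysis.analyze("car", "ferrari"));
        System.out.println("description:" + analysis.analyze("description", "ferrari"));
    }
}
```

With this setup the term for "car" reaches the index lookup unchanged,
while terms for any other field still go through the stemmer. The
reverse arrangement (keyword-style analysis as the default, stemming
only on selected fields) works the same way.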



-Hoss


Re: Searching an untokenized field using SnowballAnalyzer

Mark Miller-3
In reply to this post by Lorenzo Di Gaetano-2
My guess? When you index those fields untokenized, they stay untokenized. When
you use the SnowballAnalyzer with the query parser and search those
untokenized fields, your query is tokenized. As you can imagine, a
tokenized search may not match an untokenized field. Why does this not happen
with StandardAnalyzer? Most likely because StandardAnalyzer does not modify
ferrari during its processing (in fact I know it does not), while
SnowballAnalyzer probably does modify ferrari, perhaps to ferrar.

The results:

search query: ferrari
query parser /SnowballAnalyzer: ferrar
query parser /StandardAnalyzer: ferrari
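
The mismatch can be reproduced in a few lines of plain Java. This is
only a sketch under stated assumptions: toyStem is a made-up helper
that strips one trailing vowel so that "ferrari" becomes "ferrar" as
guessed above, and it is NOT the real Snowball algorithm. The "index"
is just a set holding the untokenized value verbatim.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of why an analyzed query term misses an untokenized field.
final class StemMismatchDemo {
    // Toy stand-in for a stemmer (assumption: NOT the Snowball algorithm).
    static String toyStem(String term) {
        String t = term.toLowerCase();
        if (t.length() > 3 && "aeiou".indexOf(t.charAt(t.length() - 1)) >= 0) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    // Index side: an untokenized field keeps its value verbatim as one term.
    static Set<String> indexUntokenized(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    public static void main(String[] args) {
        Set<String> indexed = indexUntokenized("ferrari");

        String stemmedQuery = toyStem("ferrari"); // what a stemming analyzer hands to the search
        String rawQuery = "ferrari";              // what an exact term lookup uses

        System.out.println(stemmedQuery + " -> " + indexed.contains(stemmedQuery));
        System.out.println(rawQuery + " -> " + indexed.contains(rawQuery));
    }
}
```

The stemmed term is looked up as-is, so it cannot match the verbatim
term in the index, while the unmodified term does.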

- Mark

Re: Searching an untokenized field using SnowballAnalyzer

Lorenzo Di Gaetano-2
In reply to this post by Chris Hostetter-3

> Try telling the QueryParser to use a PerFieldAnalyzerWrapper with the
> KeywordAnalyzer configured for the fields you left UN_TOKENIZED and see if
> that helps.

It helps! Using PerFieldAnalyzerWrapper, I set KeywordAnalyzer as the
default analyzer and SnowballAnalyzer only for the field that needs to
be tokenized. You solved my problem!

Thank you all very much!

Regards,

Lorenzo
