Field.Store.YES and Field.Store.Tokenized with CustomAnalyzer - Double Hit

Furash Gary
The behavior I want is that if I store a name (Gary Furash), a user who
searches for "Gary Furash" gets a strong hit, whereas a user who searches
for "Gray Furish" gets a moderate hit.  I currently achieve this by

1. using a custom analyzer on insertion/search that tokenizes a
"soundex" version of the name field in the document (so "Gray" and
"Gary" become the same soundex code);
2. creating two separate fields in the document (one tokenized and one
plain text) and searching both with a multiquery

So, when I search against the two fields, "Gary Furash" hits against two
fields in the document, but "Gray Furish" hits only one.
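
[Editor's sketch] The double-hit behavior described above can be illustrated without Lucene. The code below is a minimal plain-Java sketch, not the poster's analyzer: the class name `TwoFieldNameMatch`, its helper methods, and the simplified American Soundex routine are all assumptions made for illustration. It counts how many of the two fields a query matches; real Lucene scoring combines per-field scores rather than counting fields.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; nothing here is part of Lucene.
final class TwoFieldNameMatch {

    // Soundex digit for each letter A..Z; '0' marks letters that carry
    // no code (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    // Simplified American Soundex: first letter plus up to three digits.
    static String soundex(String word) {
        String s = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') continue;          // skip non-letters
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);                          // keep the first letter
                prev = code;
            } else {
                if (code != '0' && code != prev) out.append(code);
                if (c != 'H' && c != 'W') prev = code;  // H and W do not separate codes
            }
            if (out.length() == 4) break;
        }
        if (out.length() == 0) return "";
        while (out.length() < 4) out.append('0');       // pad to four characters
        return out.toString();
    }

    // Stand-in for the "plain text" field: lower-cased whitespace tokens.
    static Set<String> exactTerms(String text) {
        Set<String> terms = new HashSet<>();
        for (String t : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // Stand-in for the soundex field: one Soundex code per token.
    static Set<String> soundexTerms(String text) {
        Set<String> terms = new HashSet<>();
        for (String t : text.trim().split("\\s+")) {
            String code = soundex(t);
            if (!code.isEmpty()) terms.add(code);
        }
        return terms;
    }

    // Number of fields (0, 1 or 2) in which every query term is found.
    static int fieldsMatched(String indexedName, String query) {
        int hits = 0;
        if (exactTerms(indexedName).containsAll(exactTerms(query))) hits++;
        if (soundexTerms(indexedName).containsAll(soundexTerms(query))) hits++;
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(soundex("Gary") + " " + soundex("Gray"));      // G600 G600
        System.out.println(fieldsMatched("Gary Furash", "Gary Furash"));  // 2
        System.out.println(fieldsMatched("Gary Furash", "Gray Furish"));  // 1
    }
}
```

Note that neither field has to keep the original string for this to work; matching uses only the derived terms.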

However, I noticed that when I say "Field.Store.YES" it stores the
original, pre-tokenized version, so it seems like I'm doubling up here.
Is there a better way to do this?

Gary Furash, MBA, PMP
Applications Manager, Maricopa County Attorney's Office

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Field.Store.YES and Field.Store.Tokenized with CustomAnalyzer - Double Hit

Erick Erickson
The only reason you need to store a value is if you need to retrieve it from
the document. Storing is completely unnecessary for answering the question
"is this term in the document?" So I guess I'm wondering why you don't just
use Field.Store.NO on *both* of the fields.

On 8/29/06, Furash Gary <[hidden email]> wrote:

>
> The behavior I want is that if I store a name (Gary Furash), a user who
> searches for "Gary Furash" gets a strong hit, whereas a user who searches
> for "Gray Furish" gets a moderate hit.  I currently achieve this by
>
> 1. using a custom analyzer on insertion/search that tokenizes a
> "soundex" version of the name field in the document (so "Gray" and
> "Gary" become the same soundex code);
> 2. creating two separate fields in the document (one tokenized and one
> plain text) and searching both with a multiquery
>
> So, when I search against the two fields, "Gary Furash" hits against two
> fields in the document, but "Gray Furish" hits only one.
>
> However, I noticed that when I say "Field.Store.YES" it stores the
> original, pre-tokenized version, so it seems like I'm doubling up here.
> Is there a better way to do this?
>
> Gary Furash, MBA, PMP
> Applications Manager, Maricopa County Attorney's Office

Re: Field.Store.YES and Field.Store.Tokenized with CustomAnalyzer - Double Hit

Chris Hostetter-3
In reply to this post by Furash Gary

: However, I noticed that when I say "Field.Store.YES" it stores the
: original, pre-tokenized version, so it seems like I'm doubling up here.
: Is there a better way to do this?

if you are doubling up to get the benefit of two separate Analyzers, then
there is no need to "Store.YES" in both fields -- just store the value in
one (it doesn't matter which)

you could even choose not to store it in either of those fields, and
instead store it in a different field which is *not* indexed -- Stored
fields are really a very different beast than Indexed fields.  When you say
you want a Stored and Indexed field, the exact same thing is done under
the covers as if you said you want two Fields with the same name, one
stored and not indexed and one indexed and not stored.
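
[Editor's sketch] To make the stored-versus-indexed distinction concrete, here is a toy plain-Java model. It is an illustration only and does not reflect Lucene's internals: `ToyDocument` and its methods are hypothetical names. In the Lucene API of that era, the stored-only case would correspond to Field.Store.YES with Field.Index.NO, and the indexed-only case to Field.Store.NO with Field.Index.TOKENIZED.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model: stored values and indexed terms live in
// separate maps, so a field can have either, both, or neither.
final class ToyDocument {
    private final Map<String, String> stored = new HashMap<>();
    private final Map<String, Set<String>> indexed = new HashMap<>();

    // Keep the original value for retrieval; it is not searchable.
    void store(String field, String value) {
        stored.put(field, value);
    }

    // Make the value searchable by term; the original text is not kept.
    void index(String field, String value) {
        Set<String> terms = indexed.computeIfAbsent(field, k -> new HashSet<>());
        for (String t : value.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) terms.add(t);
        }
    }

    // "Stored and indexed" is just both operations under the same name.
    void storeAndIndex(String field, String value) {
        store(field, value);
        index(field, value);
    }

    boolean hasTerm(String field, String term) {
        Set<String> terms = indexed.getOrDefault(field, Collections.<String>emptySet());
        return terms.contains(term.toLowerCase(Locale.ROOT));
    }

    // Returns the stored value, or null if the field was never stored.
    String get(String field) {
        return stored.get(field);
    }

    public static void main(String[] args) {
        ToyDocument doc = new ToyDocument();
        doc.store("display_name", "Gary Furash");  // stored, not indexed
        doc.index("name", "Gary Furash");          // indexed, not stored
        System.out.println(doc.hasTerm("name", "gary"));          // true
        System.out.println(doc.get("name"));                      // null
        System.out.println(doc.get("display_name"));              // Gary Furash
        System.out.println(doc.hasTerm("display_name", "gary"));  // false
    }
}
```

In this model the searchable fields answer "is this term in the document?" with no stored copy at all, and the single stored copy can live under any field name.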



-Hoss

