Multiple Analyzer on Single field

Allahbaksh Mohammedali Asadullah
Hi,
I want to apply multiple analyzers to a single field. I want the properties of KeywordAnalyzer, SimpleAnalyzer, StandardAnalyzer, and WhitespaceAnalyzer. Is there an easy way to bundle all of these analyzers on a single field?
Regards,
Allahbaksh







**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely
for the use of the addressee(s). If you are not the intended recipient, please
notify the sender by e-mail and delete the original message. Further, you are not
to copy, disclose, or distribute this e-mail or its contents to any other person and
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken
every reasonable precaution to minimize this risk, but is not liable for any damage
you may sustain as a result of any virus in this e-mail. You should carry out your
own virus checks before opening the e-mail or attachment. Infosys reserves the
right to monitor and review the content of all messages sent to or from this e-mail
address. Messages sent to or from this e-mail address may be stored on the
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***

Re: Multiple Analyzer on Single field

Erick Erickson
This really doesn't make sense. KeywordAnalyzer will NOT
tokenize the input stream. StandardAnalyzer WILL tokenize
the input stream. I can't imagine what it means to do both at
the same time.

Perhaps you could give us some examples of what your desired
inputs and outputs are, so we could steer you in the right direction.

I suspect you're thinking more in terms of TokenFilters and/or
Tokenizers...

Best
Erick

On Mon, Apr 6, 2009 at 10:52 AM, Allahbaksh Mohammedali Asadullah <
[hidden email]> wrote:

> Hi,
> I want to apply multiple analyzers to a single field. I want the properties
> of KeywordAnalyzer, SimpleAnalyzer, StandardAnalyzer, and WhitespaceAnalyzer.
> Is there an easy way to bundle all of these analyzers on a single field?
> Regards,
> Allahbaksh

Re: Multiple Analyzer on Single field

Douglas Campos
In reply to this post by Allahbaksh Mohammedali Asadullah
What I've done is to index copies of the same field, each built with a
different analyzer, and later use a MultiFieldQueryParser to match across
all of the fields.

e.g.: "name", "name_phonetic", "name_keyword", ad nauseam

To define which analyzer goes with which field, use PerFieldAnalyzerWrapper.
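
For illustration only, here is a library-free sketch of that idea in plain Java. It is not the Lucene API: the same text is kept under several field names, and each field name is mapped to its own tokenizing function, which is roughly the role PerFieldAnalyzerWrapper plays for real analyzers. The class name PerFieldSketch and the two toy tokenizers are assumptions made up for this example; only the field names "name" and "name_keyword" come from the list above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the multi-field idea; none of this is Lucene code.
class PerFieldSketch {
    // Toy "keyword" analysis: the whole value becomes a single token.
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    // Toy "simple" analysis: lowercase, then split on anything that is not a letter.
    static List<String> simple(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    // Analyze the same value once per field, each field with its own function.
    static Map<String, List<String>> analyze(String value) {
        Map<String, Function<String, List<String>>> perField = new LinkedHashMap<>();
        perField.put("name", PerFieldSketch::simple);
        perField.put("name_keyword", PerFieldSketch::keyword);

        Map<String, List<String>> tokensByField = new LinkedHashMap<>();
        for (Map.Entry<String, Function<String, List<String>>> entry : perField.entrySet()) {
            tokensByField.put(entry.getKey(), entry.getValue().apply(value));
        }
        return tokensByField;
    }

    public static void main(String[] args) {
        // prints {name=[apache, lucene], name_keyword=[Apache Lucene]}
        System.out.println(analyze("Apache Lucene"));
    }
}
```

A query for "lucene" would then match the "name" field, while an exact query for "Apache Lucene" would match "name_keyword", which is why searching across all of the copies is useful.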

--
Douglas Campos
Theros Consulting
+55 11 9267 4540
+55 11 3020 8168

Re: Multiple Analyzer on Single field

Matthew Hall-7
... erm.. I'm still not quite sure what you are talking about.

But what you are trying to do really isn't that hard.  Here's some
sample code that should get you to where you want to be:

At document creation time, do something like this:

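        // "data" is the String holding the text to index; the same value
        // is added under two field names so each can get its own analyzer.
        doc.add(new Field("data",
                data,
                Field.Store.YES, Field.Index.TOKENIZED));
        doc.add(new Field("sdata",
                data,
                Field.Store.YES, Field.Index.TOKENIZED));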

Repeat as many times as you want the same data to be searchable via a
different analyzer.

Then, when you create the IndexWriter, do something like this:

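                // StandardAnalyzer is the default for any field not listed below.
                // AnalyzerForDataField and AnalyzerForSdataField are placeholders
                // for whichever analyzer classes you pick for each field.
                PerFieldAnalyzerWrapper aWrapper =
                        new PerFieldAnalyzerWrapper(new StandardAnalyzer());
                aWrapper.addAnalyzer("data", new AnalyzerForDataField());
                aWrapper.addAnalyzer("sdata", new AnalyzerForSdataField());
                writer = new IndexWriter(INDEX_DIR, aWrapper, true);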

Again, repeat for each field you want searchable via a different analyzer.

At search time, make sure your query parser uses this same
PerFieldAnalyzerWrapper, and you will be all set.

Well.. unless I really didn't understand what you were trying to do here...

There's always that possibility.

Matt

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: Multiple Analyzer on Single field

Allahbaksh Mohammedali Asadullah
In reply to this post by Erick Erickson
Hi All,
Sorry for the confusing email.

Suppose I have a field "text" with the content below:

KeyWordAnalyzer is a class. this keyword is used in java.

Here I want KeyWordAnalyzer to be split into Key Word Analyzer, and class should be a Key word, so that someone can search for it. Apart from this, I want Key Word Analyzer to be tokenized properly so that search becomes better.
Regards,
Allahbaksh
 
 
 
-----Original Message-----
From: Erick Erickson [mailto:[hidden email]]
Sent: Monday, April 06, 2009 9:31 PM
To: [hidden email]
Subject: Re: Multiple Analyzer on Single field

This really doesn't make sense. KeywordAnalyzer will NOT
tokenize the input stream. StandardAnalyzer WILL tokenize
the input stream. I can't imagine what it means to do both at
the same time.

Perhaps you could give us some examples of what your desired
inputs and outputs are, so we could steer you in the right direction.

I suspect you're thinking more in terms of TokenFilters and/or
Tokenizers...

Best
Erick


Re: Multiple Analyzer on Single field

Erick Erickson
Hmmmm. There's nothing in Lucene that I know of that will do what you
want, so you'll have to do one of two things: break up your token stream
yourself through pre-processing, or build your own analyzers. There's
nothing already built that I know of that will break up, for instance,
KeyWordAnalyzer into three tokens.
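
As a rough sketch of the pre-processing route, the plain-Java snippet below splits text on non-alphanumeric characters and then on lower-to-upper case boundaries, so KeyWordAnalyzer comes out as three tokens. The class name CamelCaseSplitter and its split method are hypothetical names invented for this example, not part of Lucene, and lowercasing the result is an assumption about what the search should treat as equal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical pre-processing helper; not a Lucene class.
class CamelCaseSplitter {
    // Splits on non-alphanumerics, then on each lowercase-or-digit to
    // uppercase boundary, and lowercases every resulting piece.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.split("[^A-Za-z0-9]+")) {
            if (word.isEmpty()) {
                continue; // a leading separator yields one empty string
            }
            for (String part : word.split("(?<=[a-z0-9])(?=[A-Z])")) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [key, word, analyzer, is, a, class]
        System.out.println(split("KeyWordAnalyzer is a class."));
    }
}
```

The pieces could be joined with spaces and handed to an ordinary analyzer, or the same splitting logic could live inside a custom analyzer. Note that this simple rule does not separate runs of capitals such as "XMLParser".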

Part of the confusion is the use of the word "keyword", as in
"and class should be a Key word". If I'm reading this right, you'll
want "class" to be in a separate field since it's special (in your
context). Again, to accomplish this you need to either pre-process
the input stream, extract "class", and put it in a separate field, or
create your own analyzer that extracts only "class" from the
input stream. In that case you'd feed the entire contents into *both*
fields (say "content" and "key"). The analyzer attached to the "content"
field (see PerFieldAnalyzerWrapper) would take care of breaking up
things like KeyWordAnalyzer, and the analyzer attached to the
"key" field would throw away everything except "class".
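
To make the "key" field idea concrete, here is a minimal plain-Java sketch of the filtering step, again without any Lucene classes. It assumes the special words are known in advance; the class name KeyFieldSketch, the SPECIAL set, and the keyTokens method are all hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the "key" field analysis; not a Lucene class.
class KeyFieldSketch {
    // Assumption: the words treated as special are listed up front.
    static final Set<String> SPECIAL = new HashSet<>(Arrays.asList("class"));

    // Lowercases the text, splits on non-alphanumerics, and keeps only
    // the words found in SPECIAL; everything else is thrown away.
    static List<String> keyTokens(String text) {
        List<String> kept = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (SPECIAL.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "KeyWordAnalyzer is a class. this keyword is used in java.";
        // prints [class]
        System.out.println(keyTokens(text));
    }
}
```

The "content" field would get the full, split-up token stream, while the "key" field would only ever contain the tokens this filter keeps.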

Hope this helps
Erick

On Tue, Apr 7, 2009 at 8:57 AM, Allahbaksh Mohammedali Asadullah <
[hidden email]> wrote:

> Hi All,
> Sorry for the confusing email.
>
> Suppose I have a field "text" with the content below:
>
> KeyWordAnalyzer is a class. this keyword is used in java.
>
> Here I want KeyWordAnalyzer to be split into Key Word Analyzer, and class
> should be a Key word, so that someone can search for it. Apart from this,
> I want Key Word Analyzer to be tokenized properly so that search becomes
> better.
> Regards,
> Allahbaksh