Wildcard Query

Wildcard Query

Pace Davis

I have been using Lucene for about a month and am now trying to port the same
functionality to Solr.  How do I do a wildcard query with a leading "*"?
This is possible with Lucene if you do not use the standard query
parser.  How do you do it with Solr?  This is probably very easy, but I
cannot find any information in the docs or on the mailing list.

Please help

Thanks


Re: Wildcard Query

Chris Hostetter-3

: I have been using Lucene for about a month now and trying to port the same
: functionality to Solr.  How do I do a wildcard query with a leading "*"
:  ...This is possible with Lucene if you do not use the standard query
: parser.  How do you do this with Solr????  This is probably very easy but I
: can not find any information in docs or mailing list.

There is no easy way to change this just by modifying configuration --
you'll need to write your own request handler which uses the QueryParser
of your choice.


-Hoss


Re: Wildcard Query

Pace Davis
OK, before I start writing a new request handler, let me ask a dumb
question to see whether I am approaching this wrong in Solr. Suppose one
document has a field with the value "Hello World". If the search query is
"ello", is there currently a way to make that query match the field?




Re: Wildcard Query

Erik Hatcher
On Jun 20, 2006, at 6:07 AM, Pace Davis wrote:
> Ok, before I go start writing a new request handler....let me ask a dumb
> question and see if I am approaching this wrong in Solr. If I am trying to
> search a field where I have one doc with a field that has a value of "Hello
> World"...if the search query is "ello"  ...currently is there a way to make
> this query match this field?

This is more a Lucene question than a Solr one.  You will need to either do
special analysis that indexes "hello" in various pieces such as
"ello", "llo"... or do as Hoss suggested and create a custom request
handler that searches and returns results however you like.  I have
written several custom request handlers in my application.  You could
do this pretty easily yourself by copying StandardRequestHandler to
your own class name, modifying how it creates the Query, and
configuring it in the solrconfig.xml file.

        Erik
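
A minimal sketch of the "index the pieces" idea described above, written in
plain Python rather than as Lucene analysis code. The names suffixes,
build_index and prefix_search, the min_len cutoff, and the lowercasing are
assumptions made for illustration; none of them come from Lucene or Solr.
Each word is indexed under all of its trailing substrings, so an ordinary
prefix lookup on those terms behaves like a substring match.

```python
def suffixes(token, min_len=2):
    """Return the lowercased token and each trailing substring of at least min_len chars."""
    token = token.lower()
    return [token[i:] for i in range(len(token) - min_len + 1)]


def build_index(docs, min_len=2):
    """Map every suffix term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            for term in suffixes(word, min_len):
                index.setdefault(term, set()).add(doc_id)
    return index


def prefix_search(index, query):
    """Return ids of documents that have any indexed term starting with the query."""
    q = query.lower()
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(q):
            hits |= doc_ids
    return hits


docs = {1: "Hello World", 2: "19-JN910"}
index = build_index(docs)
print(prefix_search(index, "ello"))
print(prefix_search(index, "JN"))
```

The cost is index size: a word of n characters produces roughly n terms
instead of one, which is the trade-off behind the reversed-token alternative
raised later in the thread.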


Re: Wildcard Query

Bill Au
If it is just a matter of matching lower case to upper case and vice versa,
one can simply use the LowerCaseFilter.

Bill


Re: Wildcard Query

Yonik Seeley
In reply to this post by Pace Davis
On 6/20/06, Pace Davis <[hidden email]> wrote:
> I have been using Lucene for about a month now and trying to port the same
> functionality to Solr.  How do I do a wildcard query with a leading "*"
>  ...This is possible with Lucene if you do not use the standard query
> parser.

It's not really possible to do efficiently with Lucene out-of-the-box either.
Terms are sorted, so foo* is a relatively quick query, but *foo is
horribly slow since all terms must be scanned.

You can do things like what Erik suggests... index all the variants:
  "Hello" "ello" "llo", etc
Another more limited form that would take up less index space would be
to index the reverse of the token as well:
  Index="olleH",  query="olle*"

We don't yet have an analyzer to do this (and neither does Lucene AFAIK).
As Chris points out, in addition to analysis components, the
QueryParser would need to be changed as well.

I've thought before about hooking the QueryParser into the FieldTypes more closely.
One reason is to know whether something like a prefix query should be
lowercased or not.
Another reason could be to handle special construction of wildcard
queries when there is support for "*foo".


-Yonik
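
A small sketch of why "foo*" is cheap and "*foo" is not, and of how indexing
reversed tokens turns a leading wildcard into a trailing one. This is plain
Python over a sorted list, not Lucene's term dictionary; prefix_range,
suffix_scan, suffix_via_reversed and the sample terms are assumptions made
for illustration.

```python
from bisect import bisect_left


def prefix_range(sorted_terms, prefix):
    """Binary-search to the first candidate, then walk while terms share the prefix."""
    lo = bisect_left(sorted_terms, prefix)
    hi = lo
    while hi < len(sorted_terms) and sorted_terms[hi].startswith(prefix):
        hi += 1
    return sorted_terms[lo:hi]


def suffix_scan(sorted_terms, suffix):
    """A leading wildcard gets no help from sort order: every term is examined."""
    return [t for t in sorted_terms if t.endswith(suffix)]


def suffix_via_reversed(reversed_terms, suffix):
    """Answer *suffix as a prefix query on the reversed terms, then un-reverse the hits."""
    hits = prefix_range(reversed_terms, suffix[::-1])
    return sorted(t[::-1] for t in hits)


terms = sorted(["hello", "help", "jello", "world", "yellow"])
reversed_terms = sorted(t[::-1] for t in terms)

print(prefix_range(terms, "hel"))                    # hel*
print(suffix_scan(terms, "ello"))                    # *ello, full scan
print(suffix_via_reversed(reversed_terms, "ello"))   # *ello via olle*
```

The reversed list costs one extra term per token, which is why it takes less
index space than storing every suffix, but it only helps with a leading
wildcard, not with a match in the middle of a token.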

Re: Wildcard Query

Pace Davis
Thanks for all the help.  The only place I need this is for searching SKU
fields, for example "19-JN910".  The search needs to return a match
if the query is "JN".  Erik's solution is the way to go and is simple
to implement.





Re: Wildcard Query

Yonik Seeley
On 6/20/06, Pace Davis <[hidden email]> wrote:
> Thanks for all the help.  The only field where I need this is to search sku
> fields...example being "19-JN910"  The search needs to be able to pull a
> match if the query were "JN" ...Erik's solution is the way to go and simple
> to implement.

For SKUs, another possible solution is to use the WordDelimiterFilter.
It's good if a person is typing in SKUs and might use a
different delimiter by mistake.

19-JN910 would be indexed as "19 JN 910", and the following queries
would all match it:  "19"  "JN"  "910"  "19JN"  "JN 910" "19JN910"
"19/JN-910", etc.

-Yonik
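
A rough approximation of the splitting behaviour described above, in plain
Python. It is not the WordDelimiterFilter itself, which has more behaviour
and options than shown here; split_parts and matches are hypothetical names,
and the rule used (split on non-alphanumeric characters and on letter/digit
boundaries, then match the query parts as a consecutive run) is a simplifying
assumption.

```python
import re


def split_parts(text):
    """Split into runs of letters or runs of digits, dropping any delimiters."""
    return re.findall(r"[A-Za-z]+|[0-9]+", text)


def matches(indexed_value, query):
    """True if the query's parts appear consecutively among the indexed parts."""
    doc = split_parts(indexed_value)
    q = split_parts(query)
    if not q:
        return False
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))


sku = "19-JN910"
print(split_parts(sku))
for query in ["19", "JN", "910", "19JN", "JN 910", "19JN910", "19/JN-910"]:
    print(query, matches(sku, query))
```

Because delimiters are discarded on both the index side and the query side,
"19/JN-910" and "19JN910" reduce to the same parts as "19-JN910". A query
such as "JN9" does not match in this sketch, since "9" is not one of the
indexed parts.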