Multiple fq fields in URL

Jack L
Hello ,

I'm not sure I understand how to specify multiple fq parameters,
as described under CommonQueryParameters on this page:
http://wiki.apache.org/solr/CommonQueryParameters

It says:

----------- cut ---------
fq=+popularity:[10 TO *] +section:0
is equivalent to
fq=popularity:[10 TO *]&fq=section:0
----------- cut ---------

But is the first line a valid URL with the space before the +?

The problem with the second line is that fq shows up multiple times.
In the web framework I am using, URL arguments are parsed and
converted into a dictionary and only the last value is used.

--
Best regards,
Jack

Re: Multiple fq fields in URL

Jack L
Could anyone please explain a bit about the line below?
No spaces are allowed in a URL, so I suppose this needs
to be quoted. But what if there is a space in the fq value field?
Won't that create ambiguity between an fq field boundary and
a term boundary?

fq=+popularity:[10 TO *] +section:0


Re: Multiple fq fields in URL

Chris Hostetter-3
In reply to this post by Jack L

first off, please have a little patience when sending mail to the mailing
list ... 17 hours between "reposting" isn't really a lot ... especially on
a weekend, when many people are out of the country for a conference, and
i'm sick :)

On to your question...

: fq=+popularity:[10 TO *] +section:0
: is equivalent to
: fq=popularity:[10 TO *]&fq=section:0

: But is the first line a valid URL with the space before the +?

it's not really valid in the "meets the RFC" sense, it's mainly for
illustrative purposes ... you could type that into your browser and it
would probably work, but to be more strict we should be explicit...

  fq=%2Bpopularity:[10%20TO%20*]%20%2Bsection:0
  is equivalent to
  fq=popularity:[10%20TO%20*]&fq=section:0

... but isn't the version on the wiki a bit more readable?
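
A minimal sketch for checking the escaped form by machine rather than by hand. Python and its standard urllib.parse module are assumptions made purely for illustration; nothing in this thread depends on them. Passing safe=":[]*" reproduces the hand-escaped form, while safe="" escapes the colon, brackets and asterisk as well and decodes to the same value.

```python
from urllib.parse import quote, unquote

# The filter query exactly as written on the wiki (Lucene syntax, not yet URL-safe).
raw_fq = "+popularity:[10 TO *] +section:0"

# Leave ':', '[', ']' and '*' alone so the result matches the hand-escaped form.
readable = quote(raw_fq, safe=":[]*")

# Escape every reserved character; longer, but it decodes to the same value.
strict = quote(raw_fq, safe="")

print("fq=" + readable)  # fq=%2Bpopularity:[10%20TO%20*]%20%2Bsection:0
print("fq=" + strict)

# Both forms decode back to the original Lucene query string.
assert unquote(readable) == raw_fq
assert unquote(strict) == raw_fq
```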

: The problem with the second line is that fq shows up multiple times.
: In the web framework I am using, URL arguments are parsed and
: converted into a dictionary and only the last value is used.

i'm not sure what to tell you about that, other than that it sounds
like your web framework sucks.  multi-valued params are extremely common,
and i can't imagine a web framework not supporting them.
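
To make the difference concrete, here is a small sketch (Python's standard library, chosen only for illustration; the framework in the question is not named) that parses the same query string two ways. parse_qs keeps every value of a repeated key, while collapsing the pairs into a plain dictionary reproduces the "only the last value is used" behaviour.

```python
from urllib.parse import parse_qs, parse_qsl

query_string = "fq=popularity:[10%20TO%20*]&fq=section:0"

# parse_qs keeps every value for a repeated key, in order of appearance.
multi = parse_qs(query_string)
print(multi)      # {'fq': ['popularity:[10 TO *]', 'section:0']}

# Collapsing the pairs into a plain dict keeps only the last fq,
# which is the behaviour described in the question.
last_only = dict(parse_qsl(query_string))
print(last_only)  # {'fq': 'section:0'}
```

A framework that behaves like the second call usually offers some list-returning accessor as well, though that depends on the framework.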



-Hoss

Re[2]: Multiple fq fields in URL

Jack L
Hello Chris,

Thanks for the reply. And sorry for sounding impatient -
I've been working over the weekend and I wasn't sure if I expressed
my question well, hence the second email :)

I understand I can url-encode the space. My question, which I failed
to mention in my first email, is: what if the search term
in fq also has a space in it? Quoting with " " does not seem to
be a solution, because quotes may appear in the query as well.

fq=+name:jack smith +section:0

BTW, space is sometimes encoded as %20, sometimes as "+".
So, solr expects %20 in this case? Or both are fine?

--
Best regards,
Jack

Saturday, May 5, 2007, 6:43:23 PM, you wrote:


> first off, please have a little patience when sending mail to the mailing
> list ...

...

Re[2]: Multiple fq fields in URL

Chris Hostetter-3


: Thanks for the reply. And sorry for sounding impatient -
: I've been working over the weekend and I wasn't sure if I expressed
: my question well, hence the second email :)

yeah .. my apologies as well, i realize now i didn't answer your question
very well at all ... i've been sick and it's wreaking havoc with my mood.

: I understand I can url-encode the space. My question, which I failed
: to mention in my first email, is: what if the search term
: in fq also has a space in it? Quoting with " " does not seem to
: be a solution, because quotes may appear in the query as well.
:
: fq=+name:jack smith +section:0

start by ignoring the URL escaping, treat the concept of a filter query
exactly as you would any other lucene query string for the standard
request handler ... if you want to require that the name field contain the
phrase "jack smith" and that the section field contain the number 0, then
that query can be expressed as...

     +name:"jack smith" +section:0

if you want your query to contain a literal '"' character, then the Lucene
QueryParser requires that you backslash escape it...

     +name:"jack quote\" smith" +section:0

now all you have to do is URL escape that whole thing, and you can pass it
to Solr as the value of an fq (or q) param.

you can see that it's working and doing what you want by looking at the
debugQuery output.
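
The two steps (Lucene syntax first, URL escaping second) can be sketched as follows. This is an illustration in Python, not anything Solr provides: lucene_phrase is a hypothetical helper that only handles backslashes and double quotes, and debugQuery=on is an assumed spelling of the debug switch.

```python
from urllib.parse import urlencode, parse_qs

def lucene_phrase(text):
    # Hypothetical helper: backslash-escape backslashes and double quotes,
    # then wrap the result in double quotes to form a phrase.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

# Step 1: build the filter query in plain Lucene syntax.
fq = "+name:" + lucene_phrase('jack quote" smith') + " +section:0"
print(fq)  # +name:"jack quote\" smith" +section:0

# Step 2: URL-escape the whole value (debugQuery=on is an assumed spelling).
query_string = urlencode({"fq": fq, "debugQuery": "on"})
print(query_string)

# Decoding returns exactly the Lucene string built in step 1.
assert parse_qs(query_string)["fq"] == [fq]
```

Note that urlencode writes spaces as "+" and the literal plus sign as %2B, so the two meanings never collide in the encoded form.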

the key thing the wiki is trying to point out to you is that if you tend to
have a lot of different Filter Queries that look like this..

     +name:"jack smith" +section:0
     +name:"Joe smith" +section:0
     +name:"lee harvey" +section:0

...it's going to be more efficient from a caching perspective to break
those up so that "section:0" is used as its own fq, which can then be
cached once and reused across all of those requests.
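
A rough sketch of the split form, again in Python for illustration only. filter_params is a hypothetical helper; it emits one fq for the name phrase and a second, identical fq for the section on every request. No leading + is used because each fq has to match anyway.

```python
from urllib.parse import urlencode

def filter_params(name):
    # Hypothetical helper: one fq for the varying name phrase, one for the
    # shared section filter, which is the same string on every request.
    return [("fq", 'name:"' + name + '"'), ("fq", "section:0")]

for name in ["jack smith", "Joe smith", "lee harvey"]:
    print(urlencode(filter_params(name)))
# first line printed: fq=name%3A%22jack+smith%22&fq=section%3A0
```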

: BTW, space is sometimes encoded as %20, sometimes as "+".
: So, solr expects %20 in this case? Or both are fine?

this is where i realized i did a really bad job answering your question
... + as an escaping for space is fine .. it's unfortunate that it's also
the character denoting a mandatory clause in a lucene query. note also that
when splitting your big fq into two separate fqs you no longer need the + to
make them mandatory, since they are separate params that by definition must
match.


-Hoss

Re[3]: Multiple fq fields in URL

Jack L
Hello Chris,

Thanks a lot for the explanation. Get well soon!

--
Best regards,
Jack

Saturday, May 5, 2007, 9:02:36 PM, you wrote:

> start by ignoring the URL escaping, treat the concept of a filter query
> exactly as you would any other lucene query string for the standard
> request handler ... if you want to require that the name field contain the
> phrase "jack smith" and that the section field contain the number 0, then
> that query can be expressed as...

...

Re[3]: Multiple fq fields in URL

Jack L
In reply to this post by Chris Hostetter-3
Hello Chris,

When I read your email again, I find that I didn't understand it
very well, because although fq works for me now, when I need to
construct the q variable, I run into the same set of questions
and am still not clear :)

> : BTW, space is sometimes encoded as %20, sometimes as "+".
> : So, solr expects %20 in this case? Or both are fine?

> this is where i realized i did a really bad job answering your question
> ... + as an escaping for space is fine .. it's unfortunate that it's also
> the character denoting a mandatory clause in a lucene query. note also that
> when splitting your big fq into two separate fqs you no longer need the + to
> make them mandatory, since they are separate params that by definition must
> match.

1. I didn't understand the part above in your reply. If I search for
samsung camera, the query should be like this in the select URL:

  q=samsung+camera

And if samsung is mandatory, the query will be like this: (or not:)

  q=+samsung+camera

And the first + will be interpreted as mandatory flag?

 
2. It seems that + may come from a few different things:
   - mandatory flag in Lucene query
   - space escaping
   - to specify multiple q or fq clauses as in:
     fq=+popularity:[10 TO *] +section:0

And I have a hard time assuring myself that they'll work correctly
for me.


In my case, I want to search in a title field and a content field.
And I think the query should look like this in the URL: (no mandatory terms)

   q=+title:samsung camera +content:samsung camera

with spaces escaped with +:

   q=+title:samsung+camera++content:samsung+camera

Is this correct? I'm very unsure already because of the ++.


3. And if I'm correct, I can do the same query with multiple
q variables in the URL, just like what I can do with fq:

   q=title:samsung camera&q=content:samsung camera

with spaces escaped with +:

   q=title:samsung+camera&q=content:samsung+camera


4. To simplify it a bit, if I have defined content field as
default search field, the query will be:

   q=+title:samsung camera +samsung camera

with spaces escaped with +:

   q=+title:samsung+camera++samsung+camera


5. To complicate it a bit, let's say the query is supposed to
search samsung camera in both fields, but samsung in title is
mandatory:

   q=+title:+samsung camera +content:samsung camera

with spaces escaped:

   q=+title:+samsung+camera++content:samsung+camera

Is this correct?

   
6. I'm compelled to list both the non-escaped format and the escaped
format above because I can not help thinking that the + may
cause problems :)

--
Best regards,
Jack


Re: Re[3]: Multiple fq fields in URL

Erik Hatcher
Jack,


On May 13, 2007, at 6:45 PM, Jack L wrote:
> 1. I didn't understand the part above in your reply. If I search for
> samsung camera, the query should be like this in the select URL:
>
>   q=samsung+camera
>
> And if samsung is mandatory, the query will be like this: (or not:)
>
>   q=+samsung+camera

Or not.   You need that first plus encoded as %2b:

        q=%2bsamsung+camera

> And the first + will be interpreted as mandatory flag?

It is interpreted as a space.

Here's a good way to see how things get interpreted:

        /solr/select?q=+bob&wt=ruby&indent=on

You'll see the params in the responseHeader:

        'q'=>' bob'
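
The same check can be done offline. The sketch below uses Python's standard form-style decoder as a stand-in; that it decodes the same way as the servlet container in front of Solr is an assumption, and decoded_q is a hypothetical helper.

```python
from urllib.parse import parse_qs

def decoded_q(query_string):
    # Hypothetical helper: return the single decoded q value.
    return parse_qs(query_string)["q"][0]

print(repr(decoded_q("q=+bob")))               # ' bob'
print(repr(decoded_q("q=+samsung+camera")))    # ' samsung camera'
print(repr(decoded_q("q=%2bsamsung+camera")))  # '+samsung camera'
```

Only the %2b form delivers a literal plus sign to the query parser; every bare + arrives as a space.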

> 6. I'm compelled to list both the non-escaped format and the escaped
> format above because I can not help thinking that the + may
> cause problems :)

It causes confusion, that's for sure.  We do have to be careful to
note whether an example is an encoded URL, or whether it's a loose
example left unencoded to keep from cluttering it up with encoded characters.

        Erik

Re[3]: Multiple fq fields in URL

Chris Hostetter-3
In reply to this post by Jack L

:   q=samsung+camera
:
: And if samsung is mandatory, the query will be like this: (or not:)
:
:   q=+samsung+camera
:
: And the first + will be interpreted as mandatory flag?

No.  bottom line, forget all about URLs and URL escaping.  step #1:
understand the Lucene query syntax...

   http://lucene.apache.org/java/docs/queryparsersyntax.html

in that syntax, this says samsung is mandatory and camera is optional...

        +samsung camera

Step #2: use the admin form in Solr to type in queries, check the
debug enable option to see exactly what query structures you are
getting at the bottom of your results...

   http://localhost:8983/solr/admin/form.jsp

step #3: only after you are sure you understand the syntax, and what result
you are getting as a result, should you look at the URL to see how the
Lucene query syntax is being URL escaped.

Solr doesn't do anything magic with the URL, it doesn't do any special
Solr specific parsing ... the URL must be legal, and it must be valid, it
will be parsed/unescaped just like any other CGI/form style URL .. and
then the args will be interpreted.
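
As a sketch of that order of operations (write the Lucene query first, escape it last), the snippet below builds a select URL in Python. The base URL is an assumption that combines the host and port of the admin form link with the /solr/select path used elsewhere in this thread, and select_url is a hypothetical helper.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed base URL: the host and port from the admin form link, with the
# /solr/select path used elsewhere in this thread.
BASE_URL = "http://localhost:8983/solr/select"

def select_url(lucene_query):
    # Hypothetical helper: escape a finished Lucene query as the q parameter.
    return BASE_URL + "?" + urlencode({"q": lucene_query})

url = select_url("+samsung camera")
print(url)  # http://localhost:8983/solr/select?q=%2Bsamsung+camera

# Unescaping the query string recovers the Lucene syntax unchanged.
assert parse_qs(urlsplit(url).query)["q"] == ["+samsung camera"]
```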

I've updated the wiki page that started this thread to try and eliminate
any ambiguity about URL escaping...

http://wiki.apache.org/solr/CommonQueryParameters#fq


-Hoss