Problems with field names in solr functions

iker huerga
Hi all,

I am having problems sorting Solr documents with Solr functions because
of the field names.


Imagine we want to sort the Solr documents based on the sum of the scores
of the matching fields. These fields are created as follows:


<dynamicField name="foo/bar-*" type="float" indexed="true" stored="true"/>


The idea is that these fields store float values, as in this example:
<field name="foo/bar-1234">50.45</field>



The examples below illustrate the issue


This query - http://URL/solr/select/?q=(foo/bar-1234:*)+AND+(foo/bar-2345:*)&version=2.2&start=0&rows=10&indent=on&sort=sum(foo/bar-1234,foo/bar-2345)+desc&wt=json



it gives me the following exception

The request sent by the client was syntactically incorrect (sort param
could not be parsed as a query, and is not a field that exists in the
index: sum(foo/bar-1234,foo/bar-2345)).


Whereas if I rename the fields, removing the "/" and "-", the following query
will work -

http://URL/solr/select/?q=(bar1234:*)+AND+(bar2345:*)&version=2.2&start=0&rows=10&indent=on&sort=sum(bar1234,bar2345)+desc&wt=json



  "response":{"numFound":2,"start":0,"docs":[
      {
        "primaryDescRes":"DescRes2",
        "bar1234":45.54,
        "bar2345":100.0},
      {
        "primaryDescRes":"DescRes1",
        "bar1234":100.5,
        "bar2345":25.22}]
  }}



I tried escaping the character as indicated in the Solr documentation [1], i.e.
foo%2Fbar-12345 instead of foo/bar-12345, without success.
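
A likely reason the escaping has no effect: percent-encoding only protects a character while the URL is in transit. The parameter value is URL-decoded before the sort expression is parsed, so the function parser still receives the raw "/" and "-". The sketch below models only that encode/decode round trip in Python, not Solr itself; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

# Sort expression from the failing request above.
sort_expr = "sum(foo/bar-1234,foo/bar-2345) desc"

# Percent-encode every reserved character, including "/".
encoded = quote(sort_expr, safe="")

# Standard URL decoding, as applied to a request parameter on the server side.
decoded = unquote(encoded)

print(encoded)
# The decoded value is identical to the unescaped expression,
# so the parser sees the same problematic field names.
print(decoded == sort_expr)
```

In other words, foo%2Fbar-1234 and foo/bar-1234 would reach the function parser as the same string.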



Could this be caused by the query parser?


I would be extremely grateful if you could let me know of any workaround for
this.



Best

Iker



[1]
http://wiki.apache.org/solr/SolrQuerySyntax#NOTE:_URL_Escaping_Special_Characters

--
Iker Huerga
http://www.ikerhuerga.com/

Re: Problems with field names in solr functions

Erick Erickson
I know there are edge cases where "odd" field naming causes
problems; field names are not well-defined or enforced in Solr. Rather than
banging my head against the wall and finding these cases
at inopportune moments, I'd confine myself to lower-case letters
and underscores.
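
As a rough illustration of that convention, a renaming helper could be sketched in Python as follows. The function name to_safe_name is hypothetical, and the leading-underscore rule for names that start with a digit is an extra precaution assumed here, not something stated in this message.

```python
import re

def to_safe_name(name: str) -> str:
    """Map a field name to lower-case letters, digits and underscores."""
    # Lower-case first, then replace every other character with "_".
    safe = re.sub(r"[^a-z0-9_]", "_", name.lower())
    # Assumed precaution: avoid a leading digit.
    if safe[:1].isdigit():
        safe = "_" + safe
    return safe

print(to_safe_name("foo/bar-1234"))  # foo_bar_1234
```

Note that distinct names can collide after renaming (for example "foo/bar" and "foo-bar"), and the dynamicField pattern in the schema would have to be renamed to match.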

Other stuff _may_ work, like capital letters or '-'. But '-' is part
of the Solr query syntax and may be misinterpreted
by the query parser.

Really, why add to your headaches by insisting on using some
"dangerous" characters?

Up to you, of course....

Best
Erick

On Thu, May 10, 2012 at 11:28 AM, Iker Huerga <[hidden email]> wrote:

> Hi all,
>
> I am having problems when sorting solr documents using solr functions due
> to the field names.

Re: Problems with field names in solr functions

Yonik Seeley
In trunk, see:
* SOLR-2335: New 'field("...")' function syntax for referring to complex
  field names (containing whitespace or special characters) in functions.
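
Applied to the sort from the original question, the field("...") syntax might be used as in the Python sketch below. This assumes a Solr build that includes SOLR-2335; the helper names field_ref and sum_sort are hypothetical, and rejecting names that contain a double quote or backslash is a simplification made here rather than documented behaviour.

```python
from urllib.parse import urlencode

def field_ref(name: str) -> str:
    """Wrap a field name in the field("...") function syntax."""
    # Simplifying assumption: refuse names that would need escaping.
    if '"' in name or "\\" in name:
        raise ValueError(f"unsupported character in field name: {name!r}")
    return f'field("{name}")'

def sum_sort(fields, direction="desc"):
    """Build a sort expression that sums the given fields."""
    refs = ",".join(field_ref(f) for f in fields)
    return f"sum({refs}) {direction}"

sort_expr = sum_sort(["foo/bar-1234", "foo/bar-2345"])
print(sort_expr)

# URL-encode the sort parameter for use in a request to /solr/select/.
query_string = urlencode({"sort": sort_expr})
print(query_string)
```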

The schema in trunk also specifies:
   <!-- field names should consist of alphanumeric or underscore characters only and
      not start with a digit.  This is not currently strictly enforced,
      but other field names will not have first class support from all components
      and back compatibility is not guaranteed.
   -->
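
The rule in that comment can be checked before a field is added to a schema. A minimal Python sketch, assuming "alphanumeric" means ASCII letters and digits (the comment does not say); the names SAFE_FIELD_NAME and is_safe_field_name are made up for this example:

```python
import re

# Alphanumeric or underscore characters only, not starting with a digit.
SAFE_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_safe_field_name(name: str) -> bool:
    """Return True if the whole name follows the recommended pattern."""
    return SAFE_FIELD_NAME.fullmatch(name) is not None

for name in ("foo/bar-1234", "bar1234", "foo_bar_1234", "1234_bar"):
    print(name, is_safe_field_name(name))
```

Of the four names, only bar1234 and foo_bar_1234 pass; foo/bar-1234 fails on "/" and "-", and 1234_bar fails on the leading digit.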

-Yonik
http://lucidimagination.com

