related multivalued fields

jt95129
I am a newbie to Solr and found it very easy to get started!
However, I am now stuck on how to deal with correlated multivalued fields.
For example, take data on scientific publications: each publication has a list
of authors and their respective organizations. Sample data can be represented as:
<publication>
  <title>Toward better searching</title>
  <author>
    <name>John Smith</name>
    <organization>ACME</organization>
  </author>
  <author>
    <name>Mary Ann</name>
    <organization>Jumbo Inc</organization>
  </author>
</publication>

How can I make Solr handle a query like:
author:"John Smith" AND organization:"ACME"?

It seems I have to collapse the above sample into:
<publication>
  <title>....</title>
  <author_name>John Smith, Mary Ann</author_name>
  <author_organization>ACME, Jumbo Inc</author_organization>
</publication>
which obviously won't give me the answer I want, because the link between each
author and his or her organization is lost.
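
To illustrate why the flattened form falls short, here is a minimal Python
sketch. It does not use Solr at all; the dictionary layout and the helper
matches_flat are assumptions made up for illustration, mimicking an AND query
across two independent multivalued fields.

```python
# Hypothetical flattened document: each field is an independent bag of values,
# so nothing records which organization belongs to which author.
doc = {
    "title": "Toward better searching",
    "author_name": ["John Smith", "Mary Ann"],
    "author_organization": ["ACME", "Jumbo Inc"],
}


def matches_flat(doc, name, organization):
    """Mimic author_name:"..." AND author_organization:"..." on flat fields."""
    return name in doc["author_name"] and organization in doc["author_organization"]


# The intended pairing matches, as expected.
correct = matches_flat(doc, "John Smith", "ACME")

# The wrong pairing matches too, because the two fields are checked separately.
false_positive = matches_flat(doc, "John Smith", "Jumbo Inc")

print(correct, false_positive)
```

Both checks succeed, even though John Smith does not work for Jumbo Inc.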

This seems like a generic problem in handling hierarchical data
and right now I am hitting a roadblock in that Solr only handles
flat scalar field values.

I would like to hear your suggestions and experience on how to handle the problem.

Regards,
-Jerry

Re: related multivalued fields

Chris Hostetter-3

One approach would be to have a single field called "citation" and use a
custom Analyzer that puts a "medium"-sized gap between a person's name
and their organization, and a "large" gap between each person ... so
citation:"John ACME"~10 will give you articles by people named John who
work for companies named ACME.
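
A rough Python sketch of the gap idea follows. It is not Lucene code and not
a real Analyzer: the gap sizes (5 and 100), the function names, and the
simplified in-order proximity check are all assumptions chosen for
illustration. Real sloppy phrase matching also accounts for out-of-order
terms, which this sketch ignores.

```python
MEDIUM_GAP = 5    # assumed gap between a name and its organization
LARGE_GAP = 100   # assumed gap between one author and the next


def index_citation(authors):
    """Return {term: [positions]} for a list of (name, organization) pairs."""
    positions = {}
    pos = -1
    for i, (name, organization) in enumerate(authors):
        if i > 0:
            pos += LARGE_GAP          # push the next author far away
        for token in name.lower().split():
            pos += 1
            positions.setdefault(token, []).append(pos)
        pos += MEDIUM_GAP             # keep the organization reasonably close
        for token in organization.lower().split():
            pos += 1
            positions.setdefault(token, []).append(pos)
    return positions


def sloppy_match(positions, first, second, slop):
    """In-order check: `second` follows `first` with at most `slop` positions between."""
    return any(
        0 <= b - a - 1 <= slop
        for a in positions.get(first.lower(), [])
        for b in positions.get(second.lower(), [])
    )


citation = index_citation([("John Smith", "ACME"), ("Mary Ann", "Jumbo Inc")])

john_acme = sloppy_match(citation, "John", "ACME", 10)    # same author entry
john_jumbo = sloppy_match(citation, "John", "Jumbo", 10)  # crosses the large gap
print(john_acme, john_jumbo)
```

The slop has to be larger than the medium gap plus the longest name, and much
smaller than the large gap, for the pairing to hold.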

If you really want to get creative, there was talk a while back about
Phrase/Span Queries that could know about "parallel" fields ... where the
terms in one field "line up" with the terms in another field ... not true
"hierarchical" queries, but good enough for the class of problems you are
talking about...

http://www.nabble.com/Re%3A-One-item%2C-multiple-fields%2C-and-range-queries-p8377712.html
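
Purely as a conceptual picture of what "parallel" fields would buy you (this
is plain Python, not an existing Solr or Lucene feature, and the helper
matches_parallel is a made-up name), the values of two multivalued fields
would be compared slot by slot instead of independently:

```python
# Two multivalued fields whose values are assumed to line up by index.
author_name = ["John Smith", "Mary Ann"]
author_organization = ["ACME", "Jumbo Inc"]


def matches_parallel(names, organizations, name, organization):
    """True only when name and organization occupy the same slot."""
    return any(
        n == name and o == organization
        for n, o in zip(names, organizations)
    )


same_slot = matches_parallel(author_name, author_organization, "John Smith", "ACME")
cross_slot = matches_parallel(author_name, author_organization, "John Smith", "Jumbo Inc")
print(same_slot, cross_slot)
```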

-Hoss