multiple "things" in a document

Geoffrey Young
hi all :)

I'm just getting up to speed with solr (and lucene, for that matter) for
a new project.  after reading through the available docs I'm not finding
an answer to my most basic (newbie, certainly) question.  please feel
free to just point me to the proper doc :)

this isn't my actual use case, but it's close enough for general
understanding... say I want to store data on a collection of SKUs which
(for the unfamiliar :) are a combination of item + location.  so we
might have

   sku
     id
     name
     item
     location

   item
     id
     name

   location
     id
     name

all of the schema.xml examples seem to deal with just a flat "thing"
perhaps with multiple entries of the same field.  what I'm after is how
to represent this kind of relationship in the schema, such that I can
limit my result set to, say, a sku or item, but if I search on sku I can
discriminate between the sku name and the item name in my results.

from my reading on lucene this is pretty basic stuff, but I don't see
how the solr layer approaches this at all.  again, doc pointers much
appreciated.

thanks for listening :)

--Geoff

RE: multiple "things" in a document

Will Johnson-2
Usually you do something like this (assuming the data lives in an RDBMS):

SELECT sku.id as skuid, sku.name as skuname, item.name as itemname,
location.name as locationname
FROM sku, item, location
WHERE sku.item = item.id AND sku.location = location.id

Then you can search on any part of the 'flat' record and know which field
comes from where.
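
To make the flattening concrete, here is a minimal sketch in Python using the
standard sqlite3 module. The table layout follows the sku/item/location
outline from the question; the sample rows and the helper names
(build_flat_docs, demo_connection) are made up for illustration, and the
explicit JOINs are just another way of writing the query above.

```python
import sqlite3


def build_flat_docs(conn):
    """Join sku, item and location into one flat record per SKU."""
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
    rows = conn.execute(
        """
        SELECT sku.id AS skuid, sku.name AS skuname,
               item.name AS itemname, location.name AS locationname
        FROM sku
        JOIN item ON sku.item = item.id
        JOIN location ON sku.location = location.id
        ORDER BY sku.id
        """
    )
    return [dict(row) for row in rows]


def demo_connection():
    """Build an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sku (id INTEGER PRIMARY KEY, name TEXT,
                          item INTEGER, location INTEGER);
        INSERT INTO item VALUES (1, 'widget');
        INSERT INTO location VALUES (1, 'warehouse A');
        INSERT INTO sku VALUES (10, 'widget @ warehouse A', 1, 1);
        """
    )
    return conn


if __name__ == "__main__":
    for doc in build_flat_docs(demo_connection()):
        print(doc)
```

Each dict is one flat record. Presumably you would declare skuid, skuname,
itemname and locationname as separate fields in schema.xml and index one
document per record, so a match on skuname can be told apart from a match on
itemname.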

Depending on the size of your corpus and the types of queries you want to be
able to serve, there are a million ways to optimize this, but this should get
you up and searching quickly enough.

- will



-----Original Message-----
From: Geoffrey Young [mailto:[hidden email]]
Sent: Friday, February 22, 2008 9:19 AM
To: [hidden email]
Subject: multiple "things" in a document
