injecting fields looked up from DB at the runtime - Solr/Lucene question

injecting fields looked up from DB at the runtime - Solr/Lucene question

Vladimir Olenin

Hi,

I wonder if the below is the correct way of doing things...

- when the Hits objects are returned from IndexSearcher (as a result of some search), 'inject' 'info' fields into the 'Hit' objects at runtime by looking the values up in the DB. The main purpose is to avoid storing 'info' fields in the index as 'stored' fields.
  * in other words, I want to keep ONLY 'indexed' fields in the Lucene index and keep all 'stored' fields (some of which might be big BLOB entries) in a relational DB. I do, however, want to provide 'generic', transparent access to these stored fields through the Lucene APIs (one rationale is to be able to use surrounding frameworks, like Solr, transparently, no matter which fields are stored in the index and which are stored in the DB).

I wonder if adding 'Fields' to the returned Document object (linked with Hit object) will do the trick for me? In other words, will the below work?

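Hits hits = indexSearcher.search(luceneQuery);
Iterator iter = hits.iterator();   // java.util.Iterator over Hit objects
Hit hit = (Hit) iter.next();
Document doc = hit.getDocument();
String docId = doc.get("docId");   // get() returns the stored String value; getField() returns a Field
// getFieldBy() is my own helper: it looks the value up in the DB and wraps it in a Field
doc.add(getFieldBy("someStoredDBField", docId));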

Would this disturb the index in any way? Would this be reflected in all other objects in the returned Hits set (e.g., will 'hit.get("someStoredDBField")' return the value looked up in the DB)?

Thanks.

Vlad
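
A minimal sketch of the idea in the question, written without any Lucene dependency so it can be run on its own. Instead of adding fields to the Document that the search returns, it wraps the fields read from the index and falls back to a database lookup for anything else. EnrichedHit, the lookup function, and the demo class are all hypothetical names; a plain Map stands in for the stored index fields, and a lambda stands in for the real DB query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical wrapper, not a Lucene class. It stands in for a Hit/Document pair.
final class EnrichedHit {
    private final Map<String, String> indexFields;              // fields stored in the index
    private final BiFunction<String, String, String> dbLookup;  // (docId, fieldName) -> value or null
    private final Map<String, String> cache = new HashMap<>();  // DB values fetched so far

    EnrichedHit(Map<String, String> indexFields,
                BiFunction<String, String, String> dbLookup) {
        this.indexFields = indexFields;
        this.dbLookup = dbLookup;
    }

    // Index fields win; anything else is looked up by docId once and then cached.
    String get(String fieldName) {
        String fromIndex = indexFields.get(fieldName);
        if (fromIndex != null) {
            return fromIndex;
        }
        if (!cache.containsKey(fieldName)) {
            cache.put(fieldName, dbLookup.apply(indexFields.get("docId"), fieldName));
        }
        return cache.get(fieldName);
    }
}

class DbFieldInjectionDemo {
    public static void main(String[] args) {
        Map<String, String> stored = new HashMap<>();
        stored.put("docId", "42");
        // Stand-in for a real database query.
        EnrichedHit hit = new EnrichedHit(stored, (id, field) -> field + "-for-" + id);
        System.out.println(hit.get("docId"));
        System.out.println(hit.get("someStoredDBField"));
    }
}
```

Because the wrapper owns the looked-up values, callers see one uniform get() regardless of where a field lives, and nothing depends on whether the search result hands back the same Document instance twice.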
Re: injecting fields looked up from DB at the runtime - Solr/Lucene question

Yonik Seeley-2
On 11/5/06, Vladimir Olenin <[hidden email]> wrote:
> - when the Hits objects are returned from IndexSearcher (as a result of some search), 'inject' 'info' fields into the 'Hit' objects at runtime by looking the values up in the DB. The main purpose is to avoid storing 'info' fields in the index as 'stored' fields.

Yes, I've considered doing that at the Solr layer... adding something
like a subclassable SolrDocument.  An implementation could add other
fields by retrieving them from a database.

Upsides: simpler clients that don't need to understand where stored
fields are coming from.
Downsides: you tie yourself to a database (another thing to worry
about in an environment where you need HA).

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
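
The "subclassable SolrDocument" idea could look roughly like the sketch below. This is only an illustration under stated assumptions: Solr has no such class at the time of this thread, so SolrDocumentSketch, resolveMissing, and DbBackedDocument are invented names, and a nested Map stands in for the database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base class; the name and methods are assumptions, not a Solr API.
class SolrDocumentSketch {
    protected final Map<String, Object> fields = new LinkedHashMap<>();

    SolrDocumentSketch(Map<String, Object> storedFields) {
        fields.putAll(storedFields);
    }

    // Hook for subclasses: return a value for a field the index did not store, or null.
    protected Object resolveMissing(String name) {
        return null;
    }

    public final Object getFieldValue(String name) {
        Object value = fields.get(name);
        return value != null ? value : resolveMissing(name);
    }
}

// Subclass that fills in missing fields from an external store.
class DbBackedDocument extends SolrDocumentSketch {
    private final Map<String, Map<String, Object>> database; // docId -> row (stand-in for a DB)

    DbBackedDocument(Map<String, Object> storedFields,
                     Map<String, Map<String, Object>> database) {
        super(storedFields);
        this.database = database;
    }

    @Override
    protected Object resolveMissing(String name) {
        Map<String, Object> row = database.get(String.valueOf(fields.get("docId")));
        return row == null ? null : row.get(name);
    }
}
```

Clients call getFieldValue() and never learn which store answered, which is the upside mentioned above; the downside is equally visible, since every miss in the index becomes a call to the external store.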

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

RE: injecting fields looked up from DB at the runtime - Solr/Lucene question

Vladimir Olenin
So, if I use Solr, what's the right strategy? Is it possible to
redefine the SolrDocument class through configuration? If not, would it be
safe to inject these properties through aspects while keeping the
whole framework intact and in working condition? (E.g., could a field be
cross-checked against the contents of the index at some point, which
won't work if I'm injecting the field from the DB, or something like
that...)

Vlad



Re: injecting fields looked up from DB at the runtime - Solr/Lucene question

Yonik Seeley-2
On 11/6/06, Vladimir Olenin <[hidden email]> wrote:
> So, if I'll be using Solr, what's the right strategy? Is it possible to
> redefine SolrDocument class through configuration?

There isn't currently a SolrDocument class... it's all hypothetical.
But yes, I imagine it would work by allowing one to specify their own
implementation via configuration.


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
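
Specifying an implementation via configuration is commonly done in Java by naming a class in a property and instantiating it reflectively. The sketch below shows that general pattern only; it is not how Solr does or will do it. The DocumentEnricher interface, the document.enricher.class key, and the factory are all hypothetical.

```java
import java.util.Properties;

// Hypothetical extension point; the interface and property names are made up for this sketch.
interface DocumentEnricher {
    String lookup(String docId, String fieldName);
}

// Default used when the configuration names no implementation.
class NoOpEnricher implements DocumentEnricher {
    public String lookup(String docId, String fieldName) {
        return null;
    }
}

final class EnricherFactory {
    static final String PROPERTY = "document.enricher.class"; // hypothetical config key

    // Loads the configured class by name and calls its no-arg constructor.
    static DocumentEnricher fromConfig(Properties config) {
        String className = config.getProperty(PROPERTY, NoOpEnricher.class.getName());
        try {
            Class<?> cls = Class.forName(className);
            return cls.asSubclass(DocumentEnricher.class)
                      .getDeclaredConstructor()
                      .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load enricher: " + className, e);
        }
    }
}
```

A deployment would then set the property to the fully qualified name of its own DB-backed class, and the rest of the code would only ever see the interface.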
