Request two databases at the same time ?


Bruno Mannina
Dear All,

I use Apache Solr 3.6 on Ubuntu (newbie user).

I have a big database named BigDB1 with 90M documents;
each document contains several fields (docid, title, author, date, etc.).

Today I received abstracts for some of these documents from another source
(the same docid field is also present in this source).
I don't want to modify BigDB1 to add the abstracts to its documents,
because BigDB1 is always updated twice a week.

Do you think it's possible to create a new database named AbsDB1 and
query both databases at the same time?
For example, if I search for:
title:airplane AND abstract:plastic

I would like to get documents from both BigDB1 and AbsDB1.

Many thanks for your help, information, and anything else that can help me.

Regards,
Bruno



Re: Request two databases at the same time ?

Erick Erickson
bq: I don't want to modify my BigDB1 to update documents with abstract
because BigDB1 is always updated twice by week.

Why not? Solr/Lucene handle updating docs: if a doc in the index has
the same <uniqueKey>, the old doc is deleted and the new one takes its
place. So why not just put the new abstracts into BigDB1? If you
re-index the docs later (your twice/week comment), then they'll be
overwritten. This will be much simpler than trying to maintain two indexes.
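
To make the overwrite behaviour concrete, here is a toy model in Python. It is not Solr code: the index is a plain dict keyed on the uniqueKey (assumed to be docid here), the add() helper is hypothetical, and the document values are made up. It only illustrates that adding a doc with an existing key replaces the whole doc rather than merging fields.

```python
# Toy model of uniqueKey overwrite semantics (not the Solr API).
index = {}

def add(doc):
    """Add a document; an existing doc with the same docid is replaced whole."""
    index[doc["docid"]] = dict(doc)

# First load, then the same doc re-added with an abstract.
add({"docid": "D1", "title": "airplane wing"})
add({"docid": "D1", "title": "airplane wing", "abstract": "plastic composite"})
assert index["D1"]["abstract"] == "plastic composite"

# Re-indexing D1 without the abstract drops it, because the old doc is deleted.
add({"docid": "D1", "title": "airplane wing (amended)"})
assert "abstract" not in index["D1"]
```

So if the abstracts go into BigDB1, each later re-index of a doc has to carry its abstract along, otherwise it disappears with the old version.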

But if you cannot update BigDB1, just fire off two queries and combine
the results. Or specify the shards parameter on the URL, pointing to both
collections. Do note, though, that the relevance calculations may not
be absolutely comparable, so mixing the results may show some
surprises...
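
A rough sketch of both options in Python, under stated assumptions: the host, port, and core paths (localhost:8983/solr/BigDB1 and .../AbsDB1) are guesses about your setup, the helper names are hypothetical, and the docid lists are made up. The first helper only builds a select URL carrying the shards parameter; the second combines the docids returned by two separate queries.

```python
from urllib.parse import urlencode

# Assumed locations of the two cores; adjust host, port, and core names.
BIG = "localhost:8983/solr/BigDB1"
ABS = "localhost:8983/solr/AbsDB1"

def shards_url(query, rows=10):
    """Build a select URL that asks BigDB1 to search both cores."""
    params = {"q": query, "rows": rows, "wt": "json", "shards": f"{BIG},{ABS}"}
    return f"http://{BIG}/select?{urlencode(params)}"

def combine(big_ids, abs_ids):
    """Keep docids matched by both queries, in BigDB1 result order."""
    wanted = set(abs_ids)
    return [d for d in big_ids if d in wanted]

url = shards_url("title:airplane")

# docid lists as they might come back from two separate queries (made-up values):
# one for title:airplane on BigDB1, one for abstract:plastic on AbsDB1.
both = combine(["D1", "D2", "D3"], ["D3", "D1", "D9"])
```

Keep in mind that with shards each core evaluates the query against its own docs, so title:airplane AND abstract:plastic will only match a doc that has both fields in the same core. Distributed search also expects the uniqueKey to be unique across shards, so the same docid in both cores may give unpredictable merged results. For that kind of cross-index AND, two queries combined on docid are probably the safer route.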

Best,
Erick

On Fri, Jan 9, 2015 at 9:12 AM, Bruno Mannina <[hidden email]> wrote:
> [original message quoted in full; see the first post above]

Re: Request two databases at the same time ?

Bruno Mannina
Dear Erick,

Thank you for your answer.

My answers are below.

On 09/01/2015 at 20:43, Erick Erickson wrote:
> bq: I don't want to modify my BigDB1 to update documents with abstract
> because BigDB1 is always updated twice by week.
>
> Why not? Solr/Lucene handle updating docs, if a doc in the index has
> the same <uniqueKey>, the old doc is deleted and the new one takes its
> place. So why not just put the new abstracts into BigDB1? If you
> re-index the docs later (your twice/week comment), then they'll be
> overwritten. This will be much simpler than trying to maintain two.
I understand this process; I use it for other collections, and twice a
week for BigDB1.
But suppose Doc1 is updated with its abstract on Monday. On Tuesday I must
update it with new data, and then the abstract will be lost.
I can't look up the existing abstract and re-insert it into the new doc,
because I receive several thousand docs every week (new and amended);
I think that would take a long time.

> But if you cannot update BigDB1 just fire off two queries and combine
> them. Or specify the shards parameter on the URL pointing to both
> collections. Do note, though, that the relevance calculations may not
> be absolutely comparable, so mixing the results may show some
> surprises...
Shards... I will take a look at this; I don't know this parameter.
As for relevance, I don't really use it, so I don't think it will be a
problem.


Sincerely,
