Is a Distributed Search Cached Only by the Node That Runs the Query?

kamaci
I am running Solr 4.2.1 as SolrCloud. When I run a search against
SolrCloud like this:

ip_of_node_1:8983/solr/select?q=*:*&rows=10000

and then check the admin page, I see the following:

I have a 5 GB Java heap. 616.32 MB is dark gray and 3.13 GB is gray.

Before my search it was something like 150 MB dark gray and 500 MB gray.

I understand that when I run a search like that, fields are cached. However,
when I look at the other SolrCloud nodes' admin pages, there is no difference.
Why is that query cached only by the node that I ran it on?

Otis Gospodnetić
You are looking at the JVM heap but attributing all of it to caching. That is
not quite right: there are other things in that JVM heap.
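
One way to look at cache usage directly, rather than inferring it from total heap, is to ask each node for its cache statistics. The sketch below only builds the request URL. It assumes the node exposes Solr's MBean request handler at `admin/mbeans` and that the core is named `collection1`; the core name and the helper `cache_stats_url` are made up for illustration, so adjust them to your setup.

```python
from urllib.parse import urlencode

def cache_stats_url(host, core="collection1"):
    """Build a URL that asks one node for its cache statistics.

    Assumption: the MBean handler is registered at admin/mbeans and the
    core is called "collection1" (hypothetical name).
    """
    params = urlencode({"stats": "true", "cat": "CACHE", "wt": "json"})
    return f"http://{host}/solr/{core}/admin/mbeans?{params}"

url = cache_stats_url("ip_of_node_1:8983")
print(url)
```

Requesting this URL on every node before and after the query would show whether the caches themselves changed, independently of whatever else is sitting in the heap.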

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 9, 2013 3:55 PM, "Furkan KAMACI" <[hidden email]> wrote:

> I am running Solr 4.2.1 as SolrCloud. When I run a search against
> SolrCloud like this:
>
> ip_of_node_1:8983/solr/select?q=*:*&rows=10000
>
> and then check the admin page, I see the following:
>
> I have a 5 GB Java heap. 616.32 MB is dark gray and 3.13 GB is gray.
>
> Before my search it was something like 150 MB dark gray and 500 MB gray.
>
> I understand that when I run a search like that, fields are cached. However,
> when I look at the other SolrCloud nodes' admin pages, there is no difference.
> Why is that query cached only by the node that I ran it on?
>

Joel Bernstein
How many shards are in your collection? The query aggregator node will pull
back the results from each shard and hold them in memory. It will then add
the results to a priority queue to sort them. This queue needs to be as
large as the page that is being generated.

After the query is finished this memory should be collectable.
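
The merge step described above can be sketched roughly as follows. This is an illustrative Python model, not Solr's actual implementation: `merge_shard_results`, the `(score, doc_id)` tuples, and the sample shard data are all made up for the example. It shows why memory on the aggregating node grows with the page size: every shard can hand back up to `rows` entries, and the bounded priority queue holds up to `rows` of them.

```python
import heapq

def merge_shard_results(shard_results, rows):
    """Merge per-shard hit lists into one page of the `rows` best hits.

    shard_results: list of lists of (score, doc_id) tuples, one per shard.
    Hypothetical model of an aggregator node; not Solr's real code.
    """
    queue = []  # min-heap bounded to `rows` entries; worst kept hit at queue[0]
    for hits in shard_results:
        for score, doc_id in hits:
            if len(queue) < rows:
                heapq.heappush(queue, (score, doc_id))
            elif (score, doc_id) > queue[0]:
                # Better than the current worst kept hit: replace it.
                heapq.heapreplace(queue, (score, doc_id))
    # Return the page with the highest score first.
    return sorted(queue, reverse=True)

# Made-up results from three shards.
shards = [
    [(0.9, "a1"), (0.4, "a2")],
    [(0.8, "b1"), (0.7, "b2")],
    [(0.5, "c1")],
]
page = merge_shard_results(shards, rows=3)
print(page)
```

Once `page` has been returned and nothing references the per-shard lists or the queue any longer, all of it is ordinary garbage, which matches the point that the memory should be collectable after the query finishes.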


On Thu, May 9, 2013 at 8:00 PM, Otis Gospodnetic <[hidden email]> wrote:

> You are looking at the JVM heap but attributing all of it to caching. That is
> not quite right: there are other things in that JVM heap.

--
Joel Bernstein
Professional Services LucidWorks

Jason Hellman
And for 10,000 documents across n shards, that can be significant!

On May 10, 2013, at 11:43 AM, Joel Bernstein <[hidden email]> wrote:

> How many shards are in your collection? The query aggregator node will pull
> back the results from each shard and hold them in memory. It will then add
> the results to a priority queue to sort them. This queue needs to be as
> large as the page that is being generated.
>
> After the query is finished this memory should be collectable.

kamaci
I have 5 shards, and they run on Amazon EC2 Large instances. I am just running
some tests for now. When I start the pre-production phase at my data center I
will have many Solr machines and millions of documents, so this issue may be a
problem for me.
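
For a rough sense of scale, here is a back-of-the-envelope calculation. It assumes that each shard may return up to `start + rows` candidates to the aggregating node, which is how distributed paging is generally described for Solr; treat that as an assumption for your version. The helper `aggregator_entries` is hypothetical and counts entries only, not bytes.

```python
def aggregator_entries(shards, rows, start=0):
    """Upper bound on the hits the aggregating node collects for one request.

    Assumption: every shard may return up to start + rows candidates.
    """
    return shards * (start + rows)

big_page = aggregator_entries(shards=5, rows=10000)
small_page = aggregator_entries(shards=5, rows=10)
print(big_page, small_page)
```

With 5 shards, `rows=10000` means up to 50,000 entries collected for a single request, while `rows=10` means up to 50, so the page size matters far more here than the total number of documents in the collection.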

2013/5/10 Jason Hellman <[hidden email]>

> And for 10,000 documents across n shards, that can be significant!