Sharing buffer between large number of IndexWriters?

Marcin Okraszewski
Hi,
I want to create a separate index per tenant in my application. This is due to
both strong data separation requirements and query performance
(active tenants with large indices affect others). The number of active
IndexWriters would run into a few thousand. One concern that arises
is the RAM buffers needed by the IndexWriters, as even a few MB of buffer per
writer translates into many GB of RAM.

Is there any way to give all IndexWriters one cumulative limit of RAM so
that they can share it proportionally to their traffic?

Thank you,
Marcin

Re: Sharing buffer between large number of IndexWriters?

Michael McCandless-2
Hello Marcin,

Alas, Lucene does not have this capability out of the box.

However, you can live-update the RAM buffer size by calling
setRAMBufferSizeMB on the config returned by IndexWriter.getConfig(), and the
change should take effect on the next document indexed in that IndexWriter
instance.  So you could build your own "proportional RAM" on top of that.
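
A minimal sketch of what such a "proportional RAM" layer might look like, under stated assumptions: the class name ProportionalRamAllocator, the per-tenant floor, and the use of recent document counts as the traffic metric are all hypothetical and not part of Lucene. The sketch only computes per-tenant budgets; applying them would be a separate step, calling writer.getConfig().setRAMBufferSizeMB(mb) on each tenant's IndexWriter.

```java
import java.util.HashMap;
import java.util.Map;

/** Splits one global RAM budget across tenants in proportion to recent traffic. */
final class ProportionalRamAllocator {
    private final double totalMb; // global budget shared by all writers
    private final double floorMb; // minimum each tenant receives (must stay > 0)

    ProportionalRamAllocator(double totalMb, double floorMb) {
        if (totalMb <= 0 || floorMb <= 0) {
            throw new IllegalArgumentException("budgets must be positive");
        }
        this.totalMb = totalMb;
        this.floorMb = floorMb;
    }

    /**
     * @param docsPerTenant recent document counts per tenant (assumed traffic metric)
     * @return buffer size in MB per tenant; the values sum to totalMb
     */
    Map<String, Double> allocate(Map<String, Long> docsPerTenant) {
        int n = docsPerTenant.size();
        if (n * floorMb > totalMb) {
            throw new IllegalArgumentException("floor exceeds total budget");
        }
        long totalDocs = 0;
        for (long docs : docsPerTenant.values()) {
            totalDocs += docs;
        }
        double spare = totalMb - n * floorMb; // what is left after every floor is paid
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Long> e : docsPerTenant.entrySet()) {
            // With no traffic at all, fall back to an even split of the spare budget.
            double share = totalDocs == 0 ? 1.0 / n : (double) e.getValue() / totalDocs;
            result.put(e.getKey(), floorMb + spare * share);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> traffic = new HashMap<>();
        traffic.put("tenant-a", 300L);
        traffic.put("tenant-b", 100L);
        traffic.put("tenant-c", 0L);
        Map<String, Double> budgets = new ProportionalRamAllocator(512, 1).allocate(traffic);
        // In a real system each value would be passed to
        // writer.getConfig().setRAMBufferSizeMB(mb) for that tenant's IndexWriter.
        budgets.forEach((tenant, mb) -> System.out.printf("%s -> %.2f MB%n", tenant, mb));
    }
}
```

The floor is there because the buffer size has to stay positive, and because an idle tenant that suddenly becomes active would otherwise start with no buffer at all.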

But I would worry about the small amount of unaccounted-for RAM that IndexWriter
uses ... summed across a few thousand instances, that might start to matter.

When there are no merges running, IndexWriter should be quick to close and
re-open; maybe you want to do that more aggressively.
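
One way to close and re-open writers more aggressively is to cap how many stay open and close the least recently used one when the cap is exceeded. The following is only a sketch under assumptions: LruWriterCache and its opener function are hypothetical names, and the cache is written against AutoCloseable so that it runs without Lucene on the classpath (IndexWriter implements Closeable, so it would fit). A caller that still holds a reference to an evicted writer would find it closed, so real code would need to guard against that.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Keeps at most maxOpen writers open; closes the least recently used on overflow. */
final class LruWriterCache<W extends AutoCloseable> {
    private final Function<String, W> opener; // opens a writer for a tenant (hypothetical)
    private final LinkedHashMap<String, W> open;

    LruWriterCache(int maxOpen, Function<String, W> opener) {
        this.opener = opener;
        // accessOrder=true makes iteration order least-recently-used first.
        this.open = new LinkedHashMap<String, W>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, W> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                try {
                    eldest.getValue().close(); // release the evicted writer's resources
                } catch (Exception e) {
                    throw new IllegalStateException("failed to close " + eldest.getKey(), e);
                }
                return true;
            }
        };
    }

    /** Returns the open writer for the tenant, opening it first if necessary. */
    synchronized W get(String tenant) {
        W writer = open.get(tenant); // also marks the entry as recently used
        if (writer == null) {
            writer = opener.apply(tenant);
            open.put(tenant, writer); // may evict and close the eldest entry
        }
        return writer;
    }

    synchronized int openCount() {
        return open.size();
    }

    public static void main(String[] args) {
        List<String> closed = new ArrayList<>();
        LruWriterCache<AutoCloseable> cache =
            new LruWriterCache<AutoCloseable>(2, name -> () -> closed.add(name));
        cache.get("tenant-a");
        cache.get("tenant-b");
        cache.get("tenant-a"); // touch a, so b becomes the eviction candidate
        cache.get("tenant-c"); // exceeds the cap and closes one writer
        System.out.println("closed: " + closed + ", open: " + cache.openCount());
    }
}
```

Eviction here closes the writer synchronously inside get, which matches the point above only while no merges are running; with merges in flight, a close could take noticeably longer.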

Mike McCandless

http://blog.mikemccandless.com


On Tue, Jun 16, 2020 at 9:25 AM Marcin Okraszewski <[hidden email]> wrote:

> Hi,
> I want to create a separate index per tenant in my application. This is due to
> both strong data separation requirements and query performance
> (active tenants with large indices affect others). The number of active
> IndexWriters would run into a few thousand. One concern that arises
> is the RAM buffers needed by the IndexWriters, as even a few MB of buffer per
> writer translates into many GB of RAM.
>
> Is there any way to give all IndexWriters one cumulative limit of RAM so
> that they can share it proportionally to their traffic?
>
> Thank you,
> Marcin
>

Re: Sharing buffer between large number of IndexWriters?

Andrzej Białecki-2
Hi Marcin,

I’m working on a somewhat similar problem in Solr, also with the goal of better handling multi-tenant Solr clusters (SOLR-13579). It’s probably not directly applicable to your scenario, but one costly lesson that I learned (obvious in hindsight ;) ) is that when things happen not instantaneously but over time, a number of “interesting” dynamic aspects come into play…

You may think that “proportional allotment relative to traffic” sounds like a simple formula, but I can assure you it isn’t, as soon as you start considering how things actually happen over time: how you measure the traffic rate (time window? exponentially decaying? with what ratio? sampled how often?), the delay between adjusting the controlled parameter (RAM size) and the change in your monitored values, and the delay until that change is reflected in your metrics. This is a classic control theory problem of tuning a feedback loop, and you can use a PID controller to manage it, but even then it’s far from simple; many thick volumes have been written on PID tuning…
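
To make the "exponentially decaying" option concrete, here is a small sketch of one possible traffic-rate estimator. It is an assumption-laden illustration, not something taken from Solr or Lucene: the class name DecayingRate, the half-life parameter, and the choice to pass timestamps in explicitly are all hypothetical. It covers only the measurement question; it does not address the feedback delays or controller tuning described above.

```java
/** Exponentially decaying events-per-second estimate with a configurable half-life. */
final class DecayingRate {
    private final double halfLifeSeconds; // time for an old observation's weight to halve
    private double rate;                  // smoothed events per second
    private long lastNanos;               // timestamp of the previous update
    private long pending;                 // events not yet folded into the rate

    DecayingRate(double halfLifeSeconds, long startNanos) {
        if (halfLifeSeconds <= 0) {
            throw new IllegalArgumentException("half-life must be positive");
        }
        this.halfLifeSeconds = halfLifeSeconds;
        this.lastNanos = startNanos;
    }

    /** Folds in the events observed since the previous call, as of nowNanos. */
    synchronized void update(long events, long nowNanos) {
        pending += events;
        double dt = (nowNanos - lastNanos) / 1e9;
        if (dt <= 0) {
            return; // no time has passed; keep the events for the next update
        }
        double instantaneous = pending / dt;
        // The weight of the new sample grows with the elapsed time.
        double alpha = 1.0 - Math.exp(-Math.log(2) * dt / halfLifeSeconds);
        rate += alpha * (instantaneous - rate);
        pending = 0;
        lastNanos = nowNanos;
    }

    synchronized double eventsPerSecond() {
        return rate;
    }

    public static void main(String[] args) {
        long second = 1_000_000_000L;
        DecayingRate r = new DecayingRate(10.0, 0L);
        r.update(100, 10 * second); // 100 documents in the first 10 seconds
        System.out.println("after burst: " + r.eventsPerSecond());
        r.update(0, 20 * second);   // then 10 seconds of silence
        System.out.println("after idle:  " + r.eventsPerSecond());
    }
}
```

The half-life is exactly the kind of "with what ratio?" knob mentioned above: a short one reacts quickly but makes the resulting RAM allotments jumpy, a long one is stable but slow to notice a tenant waking up.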



Andrzej Białecki

> On 22 Jun 2020, at 16:27, Michael McCandless <[hidden email]> wrote:
>
> Hello Marcin,
>
> Alas, Lucene does not have this capability out of the box.
>
> However, you can live-update the RAM buffer size by calling
> setRAMBufferSizeMB on the config returned by IndexWriter.getConfig(), and the
> change should take effect on the next document indexed in that IndexWriter
> instance.  So you could build your own "proportional RAM" on top of that.
>
> But I would worry about the small amount of unaccounted-for RAM that IndexWriter
> uses ... summed across a few thousand instances, that might start to matter.
>
> When there are no merges running, IndexWriter should be quick to close and
> re-open; maybe you want to do that more aggressively.
>
> Mike McCandless
>
> http://blog.mikemccandless.com