MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

Aaron Daubman
Greetings,

Most times I've seen the topic of storing one's index in memory come up,
it seems the asker was referring (or was understood to be referring) to
the (in)famous "not intended to work with huge indexes" Solr RAMDirectory.

Let me be clear that I am not interested in RAMDirectory. However, I
would like to better understand the oft-recommended and currently-default
MMapDirectory, and what the tradeoffs would be between storing the index
files on SSDs and storing them on a ramfs mount, when using a 64-bit
Linux server dedicated to this single Solr instance with plenty (more
than 2x the index size) of RAM.

I understand that using the default MMapDirectory allows the index to be
cached in memory; however, my understanding is that mmapped files are
demand-paged (lazily evaluated), meaning that a block is only paged into
memory after it has been read from disk - is this correct? Is it actually
block-by-block (page by page)? Any pointers to decent documentation on
this, regardless of the effectiveness of the approach, would be
appreciated...
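
For reference, page-cache residency can be inspected per file with a
tool like vmtouch (assuming it is installed; the path below is just a
placeholder):

        vmtouch -v /path/to/solr/index/*
        # reports, per file, how many of its pages are currently resident,
        # so the mapping can be watched filling in page by page as queries run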

My concern with using MMapDirectory for an index stored on disk (even
SSDs), if my understanding is correct, is that there is still a large
startup cost to MMapDirectory: it may take many queries before even most
of a 20G index has been loaded into memory, and there may still be "dark
corners" that only come up in edge-case queries and would cause QTime
spikes should those queries ever occur.

I would like to ensure that, at startup, no query will incur
disk-seek/read penalties.

Is the "right" way to achieve this to copy the index to a ramfs (NOT
ramdisk) mount and then continue to use MMapDirectory in Solr to read
the index? I am under the impression that when using ramfs (rather
than ramdisk, for which this would not work) a file mmaped on a ramfs
mount will actually share the same address space, and so would not
incur the typical double-ram overhead of mmaping a file in memory just
o have yet another copy of the file created in a second memory
location. Is this correct? If not, would you please point me to
documentation stating otherwise (I haven't found much documentation
either way).
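
For concreteness, the setup I have in mind is roughly the following
(mount point and paths are placeholders, and this is only a sketch):

        mount -t ramfs ramfs /mnt/solr-index-ram
        cp -a /var/solr/data/index/. /mnt/solr-index-ram/
        # point the core's dataDir (or a symlink) at the ramfs mount
        # note: ramfs does not enforce a size limit, so the copy must fit in RAM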

Finally, given the desire to be quick at startup with a large index that
will still easily fit within the system's memory, am I thinking about
this wrong, or are there other, better approaches?

Thanks, as always,
     Aaron
Re: MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

François Schiettecatte
Aaron

The best way to make sure the index is cached by the OS is to just cat it on startup:

        cat `find /path/to/solr/index -type f` > /dev/null

Just make sure your index is smaller than RAM, otherwise data will get rotated out of the cache.

Memory mapping is built on the virtual memory system, and I suspect that ramfs is too, so I doubt very much that copying your index to ramfs will help at all. Sidebar - a while ago I did a bunch of testing copying indices to shared memory (/dev/shm in this case) and there was no advantage compared to just accessing indices on disc when using memory mapping once the system got to a steady state.

There has been a lot written about this topic on the list. Basically it comes down to using MMapDirectory (which is the default), making sure your index is smaller than your RAM, and allocating just enough memory to the Java VM. That last part requires some benchmarking because it is so workload dependent.
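
For example, the JVM heap is kept relatively small so that the rest of
RAM is left to the OS page cache; the numbers here are purely
illustrative (not a recommendation), and the command assumes the stock
Jetty start.jar launcher:

        # heap sized by benchmarking your own workload; the rest of RAM
        # stays free for the OS to cache the mmapped index files
        java -Xms4g -Xmx4g -jar start.jar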

Best regards

François

Re: MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

Mark Miller-3
Was going to say the same thing. It's also usually a good idea to reduce paging (e.g., set swappiness to 0 on Linux).
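
Something along these lines (run as root; persisting it across reboots
is distro dependent, /etc/sysctl.conf being the usual place):

        sysctl -w vm.swappiness=0
        echo 'vm.swappiness = 0' >> /etc/sysctl.conf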

- Mark


Re: MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

Shawn Heisey-4
On 10/24/2012 6:29 PM, Aaron Daubman wrote:

> Let me be clear that that I am not interested in RAMDirectory.
> However, I would like to better understand the oft-recommended and
> currently-default MMapDirectory, and what the tradeoffs would be, when
> using a 64-bit linux server dedicated to this single solr instance,
> with plenty (more than 2x index size) of RAM, of storing the index
> files on SSDs versus on a ramfs mount.
>
> I understand that using the default MMapDirectory will allow caching
> of the index in-memory, however, my understanding is that mmaped files
> are demand-paged (lazy evaluated), meaning that only after a block is
> read from disk will it be paged into memory - is this correct? is it
> actually block-by-block (page size by page size?) - any pointers to
> decent documentation on this regardless of the effectiveness of the
> approach would be appreciated...

You are correct that data must have been recently accessed to be in the
disk cache. This does, however, include writes -- so any data that gets
indexed will be in the cache because it has just been written.  I do
believe that it is read in one page at a time, and I believe that the
pages are 4k in size.
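
On Linux the page size can be confirmed with getconf; 4096 bytes is the
usual value on x86-64:

        getconf PAGESIZE    # typically prints 4096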

> My concern with using MMapDirectory for an index stored on disk (even
> SSDs), if my understanding is correct, is that there is still a large
> startup cost to MMapDirectory, as it may take many queries before even
> most of a 20G index has been loaded into memory, and there may yet
> still be "dark corners" that only come up in edge-case queries that
> cause QTime spikes should these queries ever occur.
>
> I would like to ensure that, at startup, no query will incur
> disk-seek/read penalties.
>
> Is the "right" way to achieve this to copy the index to a ramfs (NOT
> ramdisk) mount and then continue to use MMapDirectory in Solr to read
> the index? I am under the impression that when using ramfs (rather
> than ramdisk, for which this would not work) a file mmaped on a ramfs
> mount will actually share the same address space, and so would not
> incur the typical double-ram overhead of mmaping a file in memory just
> o have yet another copy of the file created in a second memory
> location. Is this correct? If not, would you please point me to
> documentation stating otherwise (I haven't found much documentation
> either way).

I am not familiar with any "double-RAM overhead" from using mmap.  It
should be extraordinarily efficient, so much so that even when your
index won't fit in RAM, performance is typically still excellent.  Using
an SSD instead of a spinning disk will increase performance across the
board, until enough of the index is cached in RAM, after which it won't
make a lot of difference.

My parting thoughts, with a general note to the masses: Do not try this
if you are not absolutely sure your index will fit in memory!  It will
tend to cause WAY more problems than it will solve for most people with
large indexes.

If you actually do have considerably more RAM than your index size, and
you know that the index will never grow to where it might not fit, you
can use a simple trick to get it all cached, even before running
queries.  Just read the entire contents of the index, discarding
everything you read.  There are two main OS variants to consider here,
and both can be scripted, as noted below.  Run the command twice to see
the difference that caching makes for the second run.  Note that an SSD
would speed the first run of these commands up considerably:

*NIX (may work on a Mac too):
cat /path/to/index/files/* > /dev/null

Windows:
type C:\Path\To\Index\Files\* > NUL
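
On Linux, one rough sanity check that the files actually landed in the
page cache is to compare the cached figure reported by free before and
after running the command above:

        free -m    # the cached column should grow by roughly the index size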

Thanks,
Shawn

Re: MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

Erick Erickson
You may well have already seen this, but in case not:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

FWIW,
Erick
