Solr Server crashes when requesting a result with too large resultRows


Solr Server crashes when requesting a result with too large resultRows

Georg Fette
We run server version 7.3.1 on a machine with 32GB RAM, with a 10GB
heap (-10g).

When requesting a query with

q={!boost
b=sv_int_catalog_count_document}string_catalog_aliases:(*2*)&fq=string_field_type:catalog_entry&rows=2147483647

the server consumes all available memory up to 10GB and is then no
longer accessible, with one processor at 100%.

When we reduce the rows parameter to 10000000 the query works. The query
returns only 581 results.

The documentation at https://wiki.apache.org/solr/CommonQueryParameters
states that a "ridiculously large value" may be used for the "rows"
parameter, but this evidently poses a problem. The number we used was
Integer.MAX_VALUE from Java.

Greetings
Georg

--
---------------------------------------------------------------------
Dipl.-Inf. Georg Fette      Raum: B001
Universität Würzburg        Tel.: +49-(0)931-31-85516
Am Hubland                  Fax.: +49-(0)931-31-86732
97074 Würzburg              mail: [hidden email]
---------------------------------------------------------------------



RE: Solr Server crashes when requesting a result with too large resultRows

Markus Jelsma-2
Hello Georg,

As you have seen, a high rows parameter is a bad idea. Use cursor mark [1] instead.

Regards,
Markus

[1] https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html
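The cursor-mark contract from [1] can be sketched without a live Solr
instance. The following Python mock (all names here are hypothetical, not
a real Solr client API) mimics the loop: start with cursor `*`, re-send
the returned `nextCursorMark`, and stop once it comes back unchanged.

```python
def fetch_page(docs, cursor, rows):
    """Stand-in for one Solr request. 'cursor' is an opaque string,
    the way cursorMark is; here it just encodes an offset."""
    start = int(cursor) if cursor != "*" else 0
    page = docs[start:start + rows]
    next_cursor = str(start + len(page))
    return page, next_cursor

def fetch_all(docs, rows=100):
    """Page through the whole result set with a constant per-request
    memory footprint, like cursorMark with sort=id asc."""
    cursor = "*"
    out = []
    while True:
        page, next_cursor = fetch_page(docs, cursor, rows)
        out.extend(page)
        # Solr signals the end by returning the cursor unchanged.
        if next_cursor == cursor or not page:
            break
        cursor = next_cursor
    return out
```

The key property is that each request only ever holds one page of `rows`
documents, no matter how large the full result set is.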
 
 

Re: Solr Server crashes when requesting a result with too large resultRows

Andrea Gazzarini-6
In reply to this post by Georg Fette
Hi Georg,
I would say, without knowing your context, that this is not what Solr is
supposed to do. You're asking it to load everything in a single
request/response, and that poses a problem.
Even assuming it worked, you would then have to iterate over those
results one by one or in blocks, so a better option is to let Solr do
that part (block scrolling) for you [2].
I suggest you have a look at:

  * the export endpoint [1]
  * the cursor API [2]

Best,
Andrea

[1] https://lucene.apache.org/solr/guide/6_6/exporting-result-sets.html
[2]
https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors



Re: Solr Server crashes when requesting a result with too large resultRows

Georg Fette
Hi Andrea,
I agree that receiving too much data in one request is bad. But I was
surprised that the query works with a lower but still very large rows
parameter, and that there is a threshold at which it crashes the server.
Furthermore, it seems that the reason for the crash is not the size of
the actual result, because there are only 581 results.
Greetings
Georg




Re: Solr Server crashes when requesting a result with too large resultRows

Andrea Gazzarini-6
Yes, but 581 is the final number you got in the response, which is the
result of the main query intersected with the filter query, so I
wouldn't take this number into account. The main query and the filter
query are executed separately, so I guess (but I'm only guessing,
because I don't know these internals) that this is where the "rows"
parameter matters.

Again, I'm guessing; I'm sure some Solr committer here can explain how
things actually work.

Best,
Andrea



Re: Solr Server crashes when requesting a result with too large resultRows

Christopher Schultz
In reply to this post by Georg Fette

Georg,

On 7/31/18 4:39 AM, Georg Fette wrote:
> We run the server version 7.3.1. on a machine with 32GB RAM in a
> mode having -10g.
>
> When requesting a query with
>
> q={!boost b=sv_int_catalog_count_document}string_catalog_aliases:(*2*)&fq=string_field_type:catalog_entry&rows=2147483647
>
> the server takes all available memory up to 10GB and is then no
> longer accessible with one processor at 100%.

Is it a single thread which takes the CPU or more than one? Can you
identify that thread and take a thread dump to get a backtrace for
that thread?

> When we reduce the rows parameter to 10000000 the query works. The
> query returns only 581 results.
>
> The documentation at
> https://wiki.apache.org/solr/CommonQueryParameters states that as
> the "rows" parameter a "ridiculously large value" may be used, but
> this could pose a problem. The number we used was Int.max from
> Java.

Interesting. I wonder if Solr attempts to pre-allocate a result
buffer. Requesting 2147483647 rows can have an adverse effect on most
pre-allocated data structures.

-chris

Re: Solr Server crashes when requesting a result with too large resultRows

Shawn Heisey-2
In reply to this post by Georg Fette

This is happening because of the way that Solr prepares for searching. 
Objects are allocated in heap memory according to the rows value before
the query even gets executed.  If you run Solr on an operating system
other than Windows, the resulting OutOfMemoryError will cause the Solr
process to be killed.  If it's running on Windows, Solr would stay
running, but we have no way of knowing whether it would work *correctly*
after OOME.

https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_too_many_rows

At the link above is a link to a blog post that covers the problem in
great detail.

https://sbdevel.wordpress.com/2015/10/05/speeding-up-core-search/

With a rows parameter of over 2 billion, Solr (actually Lucene, which
provides most of Solr's functionality) will allocate that many ScoreDoc
objects, which would need about 60GB of heap memory.  So it's not
possible on your hardware.
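Shawn's 60GB figure can be sanity-checked with back-of-the-envelope
arithmetic. The 28-byte per-object cost below is an assumed figure for a
typical 64-bit JVM layout, not a measured value:

```python
ROWS = 2_147_483_647  # Integer.MAX_VALUE, the rows value from the query

# Assumed per-ScoreDoc cost on a 64-bit JVM:
# ~16-byte object header + int doc + float score + int shardIndex
BYTES_PER_SCOREDOC = 16 + 4 + 4 + 4  # 28 bytes

total_bytes = ROWS * BYTES_PER_SCOREDOC
print(f"~{total_bytes / 1e9:.0f} GB")  # roughly 60 GB, before counting
                                       # the reference array holding them
```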

As you'll see if you read the blog post, Toke has some ideas about how
to improve the situation.  I don't think an issue has been filed, but I
could be wrong about that.

Right now, switching to cursorMark or the /export handler is a better
way to get a very large result set.

Thanks,
Shawn


Re: Solr Server crashes when requesting a result with too large resultRows

Georg Fette
In reply to this post by Christopher Schultz
Hi Christopher,
Yes, it is only one of the processors that is at maximum capacity.
How do I take something like a thread dump of a single thread? We run
Solr from the command line out-of-the-box, not in a code development
environment. Are there parameters that can be configured so that the
server creates dumps?
Greetings
Georg




Re: Solr Server crashes when requesting a result with too large resultRows

Christopher Schultz

Georg,

On 7/31/18 12:33 PM, Georg Fette wrote:
> Yes ist is only one of the processors that is at maximum capacity.

Ok.

> How do I do something like a thread-dump of a single thread ?

Here's how to get a thread dump of the whole JVM:
https://wiki.apache.org/tomcat/HowTo#How_do_I_obtain_a_thread_dump_of_my_running_webapp_.3F

The "tid" field of each thread is usually the same as the process-id
from a "top" or "ps" listing, except it's often shown in hex instead
of decimal.

Have a look at this for some guidance:
http://javadrama.blogspot.com/2012/02/why-is-java-eating-my-cpu.html

Some tools dump the tid in hex, others in decimal. It's frustrating
sometimes.
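The hex/decimal mismatch Chris mentions is easy to bridge: the `nid=`
field in a JVM thread dump is the OS thread id in hex, while `top -H`
prints it in decimal. A quick conversion (the ids below are made up for
illustration):

```python
# Correlating a hot thread seen in "top -H" (decimal thread id) with
# the nid= field of a JVM thread dump (hexadecimal).
top_tid = 21506          # thread id as printed by top -H (illustrative)
dump_nid = "0x5402"      # nid= value from the thread dump (illustrative)

# Convert the dump's hex nid to decimal and compare with top's tid.
assert int(dump_nid, 16) == top_tid
print(f"top tid {top_tid} matches thread-dump nid {dump_nid}")
```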

> We run the Solr from the command line out-of-the-box and not in a
> code development environment. Are there parameters that can be
> configured so that the server creates dumps ?
You don't want this to happen automatically. Instead, you'll want to
trigger a dump manually for debugging purposes.

-chris



Re: Solr Server crashes when requesting a result with too large resultRows

Toke Eskildsen-2
In reply to this post by Georg Fette
On Tue, 2018-07-31 at 11:12 +0200, Fette, Georg wrote:
> I agree that receiving too much data in one request is bad. But I
> was surprised that the query works with a lower but still very large
> rows parameter and that there is a threshold at which it crashes the
> server. 
> Furthermore, it seems that the reason for the crash is not the size
> of the actual results because those are only 581.

Under the hood, a priority queue is allocated with room for
min(#docs_in_index, rows) document markers. Furthermore, that queue is
pre-filled with placeholder objects (called sentinels). This structure
becomes heavy in the millions territory, both in terms of raw memory
and in terms of GC overhead from all the objects. You could have a
single hit and it would still OOM.
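A toy model of that sentinel-filled queue makes the point concrete (the
names are illustrative, not Lucene's actual classes): the allocation is
min(rows, maxDoc) placeholder entries, made before a single hit is
collected.

```python
SENTINEL = (float("-inf"), -1)  # (score, docid) placeholder entry

class HitQueue:
    def __init__(self, rows, max_doc):
        size = min(rows, max_doc)
        # Every slot is filled up front -- hits or no hits.
        self.slots = [SENTINEL] * size

    def insert_with_overflow(self, score, doc):
        # Replace the current worst entry if the new hit beats it
        # (linear scan here for brevity; Lucene uses a real heap).
        worst = min(range(len(self.slots)), key=lambda i: self.slots[i])
        if (score, doc) > self.slots[worst]:
            self.slots[worst] = (score, doc)

# rows=Integer.MAX_VALUE against a 50-doc index means 50 pre-allocated
# slots; against a 50M-doc index it would mean 50M, even for 581 hits.
q = HitQueue(rows=2_147_483_647, max_doc=50)
q.insert_with_overflow(1.5, 7)
print(len(q.slots))  # 50: bounded by index size, not by the hit count
```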

It is possible to optimize that part of the Solr code for larger
requests (see https://issues.apache.org/jira/browse/LUCENE-2127 and
https://issues.apache.org/jira/browse/LUCENE-6828), but that would just
be a temporary fix until even larger indexes are queried. The deep
paging or streaming exports that Andrea suggests scale indefinitely in
terms of both documents in the index and documents in the result set.

I would argue your OOM with small result sets and huge rows is a good
thing: You encounter the problem immediately, instead of hitting it at
some random time when a match-a-lot query is issued by a user.

- Toke Eskildsen, Royal Danish Library