Solr long q values

Solr long q values

solrnoobie
Whenever we have long q values (from a sentence to a small paragraph), we
encounter heap problems (OOM). I guess this is normal?

So my question is: how should we handle this type of problem? Of
course we could always limit the size of search queries on the
application side, but is there anything we could do in our configuration that
would prevent the OOM issues even if some random user intentionally bombards
us with long search queries from the front end?
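
For reference, the application-side limit mentioned above could look roughly like the sketch below. It is only an illustration: the function name and both limits (32 terms, 512 characters) are arbitrary assumptions, not values recommended anywhere in this thread, and it does nothing about the Solr configuration itself.

```python
def clamp_query(q: str, max_terms: int = 32, max_chars: int = 512) -> str:
    """Keep at most max_terms whitespace-separated terms and max_chars characters.

    Hypothetical helper: the limits are placeholders to tune per application.
    A single term longer than max_chars yields an empty string.
    """
    kept = []
    length = 0
    for term in q.split()[:max_terms]:
        # Account for the joining space before every term except the first.
        extra = len(term) + (1 if kept else 0)
        if length + extra > max_chars:
            break
        kept.append(term)
        length += extra
    return " ".join(kept)


if __name__ == "__main__":
    user_input = "a very long sentence pasted by a user " * 50
    safe_q = clamp_query(user_input)
    print(len(safe_q.split()), "terms sent to Solr")
```

The clamped string would then be passed as the q parameter in place of the raw user input.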



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr long q values

Shawn Heisey-2
On 5/3/2019 2:32 AM, solrnoobie wrote:
> Whenever we have long q values (from a sentence to a small paragraph), we
> encounter heap problems (OOM). I guess this is normal?
>
> So my question is: how should we handle this type of problem? Of
> course we could always limit the size of search queries on the
> application side, but is there anything we could do in our configuration that
> would prevent the OOM issues even if some random user intentionally bombards
> us with long search queries from the front end?

If you're running out of memory, then Solr will need a larger heap, or
you'll need to change something so it requires less heap.

A large query string is one of those things that might require a larger
heap.

The default heap size that Solr has shipped with since 5.0 is 512MB ...
which is VERY small.  Virtually all Solr users will need to increase
this, or they will run into OOME (OutOfMemoryError) or find that their
server is running extremely slowly.  It does not take very much index
data to require more than a 512MB heap.

A thought for Erick and other committers:  I know we are trying to
reduce log verbosity.  But along the same lines as the log entries about
file and process limits, I was thinking it might be a good idea to have
a one-line WARN entry if the max heap size is 1GB or less.  And a config
option to disable the logging.
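
To make the proposal concrete, the check could be sketched as below. This is not Solr code and not how any eventual implementation works; the function names, the message text, and the `disabled` flag are all hypothetical, and the size parser only handles the k/m/g suffixes used with JVM heap settings.

```python
import re

# Multipliers for JVM-style size suffixes (case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_size(setting: str) -> int:
    """Convert a string such as '512m' or '8g' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", setting.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized heap size: {setting!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def heap_warning(setting: str, threshold: int = 1024 ** 3, disabled: bool = False):
    """Return a one-line WARN message if the max heap is at or below threshold.

    Returns None when the heap is larger or when the check is disabled.
    """
    if disabled:
        return None
    if parse_heap_size(setting) <= threshold:
        return f"WARN: max heap is {setting}; this is too small for most indexes"
    return None


if __name__ == "__main__":
    for heap in ("512m", "1g", "8g"):
        print(heap, "->", heap_warning(heap))
```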

Thanks,
Shawn

Re: Solr long q values

Walter Underwood
We run very long queries with an 8 GB heap. That is 30 million documents in 8 shards, with an average query length of 25 terms.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


Re: Solr long q values

Erick Erickson
Shawn:

We already do warnings for ulimits, so memory seems reasonable. In the same vein, does starting with 512M make sense either?

Feel free to raise a JIRA, but I won’t have any time to work on it….


Re: Solr long q values

Shawn Heisey-2
On 5/3/2019 1:37 PM, Erick Erickson wrote:
> We already do warnings for ulimits, so memory seems reasonable. In the same vein, does starting with 512M make sense either?
>
> Feel free to raise a JIRA, but I won’t have any time to work on it….

Done.

https://issues.apache.org/jira/browse/SOLR-13446

I think that for typical server systems, starting with a 512MB heap is a
little bit nuts.

I think I know why such a low number was chosen.  Without a much smarter
startup, a super low default is the only way to ensure that Solr will
start on virtually any system that somebody tries it on, like the small
AWS servers.

Thanks,
Shawn

Re: Solr long q values

Walter Underwood
512M was the default heap for Java 1.1. We never changed the default. So no size was “chosen”.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


Re: Solr long q values

solrnoobie
Thank you for the replies!

Thanks to everyone's insight, I was able to work out that the problem was in
our configuration.

Our heap size was 10 gigs, so I don't think heap size is the problem, since we only
have 900k data. When we took a closer look at our schema, 2 of the
relevant fields have ShingleFilterFactory at query time, and this caused the
OOM for long q values!
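
For anyone who finds this later, here is a rough illustration of how many tokens shingling produces as a query gets longer. It is a plain re-implementation of word shingles for counting purposes, not Lucene's code; the parameter names are only loosely modeled on the filter's options, and the sketch does not model how the query parser turns those tokens into clauses, which is where the real memory cost would come from.

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True):
    """Return word shingles (token n-grams) for a list of tokens.

    Illustrative only: for each position, emit the unigram (optionally)
    and every shingle of min_size..max_size tokens starting there.
    """
    out = []
    for i in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[i])
        for size in range(min_size, max_size + 1):
            if i + size <= len(tokens):
                out.append(" ".join(tokens[i:i + size]))
    return out


if __name__ == "__main__":
    # Compare a short query with sentence- and paragraph-sized ones.
    for n_terms in (3, 25, 100):
        query = [f"t{i}" for i in range(n_terms)]
        for max_size in (2, 4):
            count = len(shingles(query, max_size=max_size))
            print(f"{n_terms} terms, max shingle size {max_size}: {count} tokens")
```

With two shingled fields, every token above is generated once per field.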


