[jira] [Updated] (SOLR-11475) Endless loop and OOM in PeerSync

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/SOLR-11475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Kudryavtsev updated SOLR-11475:
--------------------------------------
    Description:
After the problem described in SOLR-11459, I restarted the cluster and got an OOM on startup.

[PeerSync#handleVersionsWithRanges|https://github.com/apache/lucene-solr/blob/68bda0be421ce18811e03b229781fd6152fcc04a/solr/core/src/java/org/apache/solr/update/PeerSync.java#L539] contains this logic:

{code}
    while (otherUpdatesIndex >= 0) {
      // we have run out of ourUpdates, pick up all the remaining versions from the other versions
      if (ourUpdatesIndex < 0) {
        String range = otherVersions.get(otherUpdatesIndex) + "..." + otherVersions.get(0);
        rangesToRequest.add(range);
        totalRequestedVersions += otherUpdatesIndex + 1;
        break;
      }

      // stop when the entries get old enough that reorders may lead us to see updates we don't need
      if (!completeList && Math.abs(otherVersions.get(otherUpdatesIndex)) < ourLowThreshold) break;

      if (ourUpdates.get(ourUpdatesIndex).longValue() == otherVersions.get(otherUpdatesIndex).longValue()) {
        ourUpdatesIndex--;
        otherUpdatesIndex--;
      } else if (Math.abs(ourUpdates.get(ourUpdatesIndex)) < Math.abs(otherVersions.get(otherUpdatesIndex))) {
        ourUpdatesIndex--;
      } else {
        long rangeStart = otherVersions.get(otherUpdatesIndex);
        while ((otherUpdatesIndex < otherVersions.size())
            && (Math.abs(otherVersions.get(otherUpdatesIndex)) < Math.abs(ourUpdates.get(ourUpdatesIndex)))) {
          otherUpdatesIndex--;
          totalRequestedVersions++;
        }
        // construct range here
        rangesToRequest.add(rangeStart + "..." + otherVersions.get(otherUpdatesIndex + 1));
      }
    }
{code}

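If at some point
{code} ourUpdates.get(ourUpdatesIndex) == -otherVersions.get(otherUpdatesIndex) {code}
the loop will never end. The two values are not equal, but their absolute values are, so execution falls into the final {{else}} branch, where the inner {{while}} does not run and neither index changes. Each pass adds the same string to {{rangesToRequest}} again and again until the process runs out of memory.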
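
To make the failure mode concrete, here is a minimal sketch that re-creates the comparison loop outside of Solr. It is not the Solr implementation: the class name {{PeerSyncLoopSketch}}, the methods {{collectRanges}} and {{stalls}}, and the sample version lists are all hypothetical, the {{completeList}} / {{ourLowThreshold}} check is left out, and the inner loop guard is changed to {{otherUpdatesIndex >= 0}} so the sketch never reads index -1. Both lists are assumed to be sorted with the highest absolute version first, with the indexes walking from the end towards 0. Instead of looping forever, the sketch throws as soon as one pass moves neither index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone re-creation of the loop quoted above; not Solr code.
final class PeerSyncLoopSketch {

  // Returns the ranges the loop would request. Throws IllegalStateException
  // when one pass leaves both indexes unchanged, which is the condition under
  // which the original loop would keep appending the same range.
  static List<String> collectRanges(List<Long> ourUpdates, List<Long> otherVersions) {
    List<String> rangesToRequest = new ArrayList<>();
    int ourUpdatesIndex = ourUpdates.size() - 1;
    int otherUpdatesIndex = otherVersions.size() - 1;

    while (otherUpdatesIndex >= 0) {
      int ourBefore = ourUpdatesIndex;
      int otherBefore = otherUpdatesIndex;

      // we have run out of ourUpdates, take everything that is left
      if (ourUpdatesIndex < 0) {
        rangesToRequest.add(otherVersions.get(otherUpdatesIndex) + "..." + otherVersions.get(0));
        break;
      }

      long ours = ourUpdates.get(ourUpdatesIndex);
      long other = otherVersions.get(otherUpdatesIndex);

      if (ours == other) {
        ourUpdatesIndex--;
        otherUpdatesIndex--;
      } else if (Math.abs(ours) < Math.abs(other)) {
        ourUpdatesIndex--;
      } else {
        long rangeStart = other;
        // Assumption: guard with >= 0 (the quoted code checks < size()).
        while (otherUpdatesIndex >= 0
            && Math.abs(otherVersions.get(otherUpdatesIndex)) < Math.abs(ours)) {
          otherUpdatesIndex--;
        }
        rangesToRequest.add(rangeStart + "..." + otherVersions.get(otherUpdatesIndex + 1));
      }

      // Added for the sketch only: detect a pass that made no progress.
      if (ourUpdatesIndex == ourBefore && otherUpdatesIndex == otherBefore) {
        throw new IllegalStateException(
            "no progress at ours=" + ours + ", other=" + other
                + "; the same range would be added forever: " + rangesToRequest);
      }
    }
    return rangesToRequest;
  }

  // Convenience wrapper: true if the loop would get stuck on these inputs.
  static boolean stalls(List<Long> ourUpdates, List<Long> otherVersions) {
    try {
      collectRanges(ourUpdates, otherVersions);
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // We are missing version 4: the loop terminates and requests one range.
    System.out.println(collectRanges(Arrays.asList(5L, 3L), Arrays.asList(5L, 4L, 3L)));

    // We hold -4 where the other side holds 4: neither index moves.
    System.out.println(stalls(Arrays.asList(5L, -4L, 3L), Arrays.asList(5L, 4L, 3L)));
  }
}
```

With the made-up lists above, the first call returns the single range {{4...4}}, while the second call hits the no-progress check at the -4 / 4 pair. In the quoted code there is no such check, so that pair is where the loop would spin.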





> Endless loop and OOM in PeerSync
> --------------------------------
>
>                 Key: SOLR-11475
>                 URL: https://issues.apache.org/jira/browse/SOLR-11475
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Andrey Kudryavtsev


