[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740855#comment-16740855 ]

Ankit Jain commented on LUCENE-8635:
------------------------------------

I ran the test suite and couple of tests seem to fail. Though, they don't seem to have anything to do with my change:

 

   [junit4] Tests with failures [seed: 1D3ADDF6AE377902]:

   [junit4]   - org.apache.solr.cloud.autoscaling.ScheduledMaintenanceTriggerTest.testInactiveShardCleanup

   [junit4]   - org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.testTrigger

   [junit4]

   [junit4]

   [junit4] JVM J0:     1.40 ..  4359.18 =  4357.78s

   [junit4] JVM J1:     1.40 ..  4359.35 =  4357.95s

   [junit4] JVM J2:     1.40 ..  4359.30 =  4357.90s

   [junit4] Execution time total: 1 hour 12 minutes 40 seconds

   [junit4] Tests summary: 833 suites (7 ignored), 4024 tests, 2 failures, 286 ignored (153 assumptions)

 

Details for failing tests

 

 NOTE: reproduce with: ant test  -Dtestcase=ScheduledTriggerTest -Dtests.method=testTrigger -Dtests.seed=1D3ADDF6AE377902 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=mr-IN -Dtests.timezone=America/St_Lucia -Dtests.asserts=true -Dtests.file.encoding=US-ASCII

   [junit4] FAILURE 9.03s J2 | ScheduledTriggerTest.testTrigger <<<

   [junit4]    > Throwable #1: java.lang.AssertionError: expected:<3> but was:<2>

   [junit4]    >        at __randomizedtesting.SeedInfo.seed([1D3ADDF6AE377902:7EF1EB7437F80A2F]:0)

   [junit4]    >        at org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.scheduledTriggerTest(ScheduledTriggerTest.java:113)

   [junit4]    >        at org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.testTrigger(ScheduledTriggerTest.java:66)

   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

   [junit4]    >        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

   [junit4]    >        at java.base/java.lang.reflect.Method.invoke(Method.java:564)

   [junit4]    >        at java.base/java.lang.Thread.run(Thread.java:844)

 

NOTE: reproduce with: ant test  -Dtestcase=ScheduledMaintenanceTriggerTest -Dtests.method=testInactiveShardCleanup -Dtests.seed=1D3ADDF6AE377902 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ha -Dtests.timezone=America/Nome -Dtests.asserts=true -Dtests.file.encoding=US-ASCII

   [junit4] FAILURE 2.01s J0 | ScheduledMaintenanceTriggerTest.testInactiveShardCleanup <<<

at __randomizedtesting.SeedInfo.seed([1D3ADDF6AE377902:161D84CF745E09]:0)

   [junit4]    >        at org.apache.solr.cloud.CloudTestUtils.waitForState(CloudTestUtils.java:70)

   [junit4]    >        at org.apache.solr.cloud.autoscaling.ScheduledMaintenanceTriggerTest.testInactiveShardCleanup(ScheduledMaintenanceTriggerTest.java:167)

   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

   [junit4]    >        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

   [junit4]    >        at java.base/java.lang.reflect.Method.invoke(Method.java:564)

   [junit4]    >        at java.base/java.lang.Thread.run(Thread.java:844)

   [junit4]    > Caused by: java.util.concurrent.TimeoutException: last state: DocCollection(ScheduledMaintenanceTriggerTest_collection1//clusterstate.json/6)={

 

> Lazy loading Lucene FST offheap using mmap
> ------------------------------------------
>
>                 Key: LUCENE-8635
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8635
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/FSTs
>         Environment: I used below setup for es_rally tests:
> single node i3.xlarge running ES 6.5
> es_rally was running on another i3.xlarge instance
>            Reporter: Ankit Jain
>            Priority: Major
>         Attachments: offheap.patch, rally_benchmark.xlsx
>
>
> Currently, FST loads all the terms into heap memory during index open. This causes frequent JVM OOM issues if the term size gets big. A better way of doing this will be to lazily load FST using mmap. That ensures only the required terms get loaded into memory.
>  
> Lucene can expose API for providing list of fields to load terms offheap. I'm planning to take following approach for this:
>  # Add a boolean property fstOffHeap in FieldInfo
>  # Pass list of offheap fields to lucene during index open (ALL can be special keyword for loading ALL fields offheap)
>  # Initialize the fstOffHeap property during lucene index open
>  # FieldReader invokes default FST constructor or OffHeap constructor based on fstOffHeap field
>  
> I created a patch (that loads all fields offheap), did some benchmarks using es_rally and results look good.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]