[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448123#comment-16448123 ]

David Smiley commented on LUCENE-8267:

-1 sorry. I've used the MemoryPostingsFormat for a text-tagging use-case where there are intense lookups against the terms dictionary.  It's highly beneficial to have the terms dictionary be entirely memory resident, albeit in a compact FST.  The issue description mentions "We don't use those memory codecs anywhere outside of tests" -- this should be no surprise, as it's not the default codec.  I grant that it is hard to gauge the level of use of something outside of core Lucene.  When we ponder removing something that Lucene doesn't even _need_, I propose we raise the issue more openly with the community.  Perhaps the question could be posed in CHANGES.txt and/or release announcements to solicit community input?

Perhaps BaseRangeFieldQueryTestCase.verify should check whether the postings format is a known "memory" postings format (of which there are several, including "Direct"), and if so use JUnit's Assume to bail out?  If this is hard to do, we ought to add a convenience method to make it easier.
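
The check suggested above could look roughly like the following. This is only a minimal sketch and not code from the Lucene test framework: the class name MemoryFormatCheck, the method isMemoryPostingsFormat, and the contents of the name set are all assumptions made for illustration, and the set would have to list every memory-resident format that should be skipped. It uses plain Java so that it runs on its own.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: decides whether a postings format
// name belongs to the memory-resident family that a huge-data test should skip.
public class MemoryFormatCheck {

    // Assumption: formats are identified by name, and this set is illustrative
    // rather than a complete list of memory-resident postings formats.
    private static final Set<String> MEMORY_FORMAT_NAMES =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("Memory", "Direct")));

    // Returns true if the given postings format name is a known memory format.
    static boolean isMemoryPostingsFormat(String formatName) {
        return formatName != null && MEMORY_FORMAT_NAMES.contains(formatName);
    }

    public static void main(String[] args) {
        // "SomeOtherFormat" is a placeholder standing in for any non-memory format.
        for (String name : new String[] {"Memory", "Direct", "SomeOtherFormat"}) {
            String decision = isMemoryPostingsFormat(name) ? "skip" : "run";
            System.out.println(name + " -> " + decision);
        }
    }
}
```

In a real test the boolean would presumably be passed to JUnit's Assume.assumeFalse at the top of verify, so that the test is reported as skipped rather than failed when a memory format has been randomly selected.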

Speaking of memory postings formats, I'm in favor of the Direct postings format going away in its current form, since it ought to be re-imagined as some sort of read-time FilterCodecReader that does not require an index format.  Credit to Alan for that idea years ago.  That's more a re-orientation of something that exists than a claim that it should go away entirely.

> Remove memory codecs from the codebase
> --------------------------------------
>                 Key: LUCENE-8267
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8267
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Dawid Weiss
>            Priority: Major
> Memory codecs (MemoryPostings*, MemoryDocValues*) are part of random selection of codecs for tests and cause occasional OOMs when a test with huge data is selected. We don't use those memory codecs anywhere outside of tests; it has been suggested to just remove them to avoid maintenance costs and OOMs in tests. [1]
> [1] https://apache.markmail.org/thread/mj53os2ekyldsoy3

This message was sent by Atlassian JIRA
