[jira] [Created] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
Shai Erera created LUCENE-4061:
----------------------------------

             Summary: Improvements to DirectoryTaxonomyWriter (synchronization and others)
                 Key: LUCENE-4061
                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/facet
            Reporter: Shai Erera
            Assignee: Shai Erera
             Fix For: 4.0


DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.

Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.

If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-4061:
-------------------------------

    Attachment: LUCENE-4061.patch

Patch does the following:

* Removes getCacheMemoryUsage()
* Handles few TODOs (e.g. don't use MultiFields)
* Improves synchronization of addCategory -- only if the category is not found, then it synchronizes and proceeds with the original logic.
* Made addCategoryDocument private -- this is a very dangerous method to call by extending classes.
* Made the TaxoWriterCache impls thread-safe.
* Added a concurrency test to validate synchronization works
* Modified TestAddTaxonomy to run *much* faster by building the taxonomy in parallel.

All tests pass, and run even faster now. I plan to commit this soon.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Resolved] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-4061.
--------------------------------

    Resolution: Fixed

Committed rev 1339150
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Reopened] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera reopened LUCENE-4061:
--------------------------------


TestDirectoryTaxonomyWriter.testConcurrency sometimes fails on mismatch number of categories.

This is not repeated easily, but if run with -Dtests.iters, it is.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-4061:
-------------------------------

    Attachment: LUCENE-4061.patch

Fixing the concurrency issue was hairy, and required lots of changes to DirTaxoWriter:

* Needed a ReaderManager, so added such in core under o.a.l.index. Separately, I think that we should move RefManager to o.a.l.util instead of o.a.l.search.

* DirTaxoWriter was not very well built for concurrency :), so many changes had to be done to it.

* TaxoWriterCache.hasRoom(int) has been replaced by isFull().

* TestDirTaxoWriter has been enhanced to sometimes, during nightly builds, use a NoOpCache, as it uncovered some bugs too ! (yet it makes the test horribly slow, hence why the nightly criteria, and very low chances still).

I ran DirTaxoWriter.testConcurrency over 1000 times and no failures, so I'm inclined to believe the concurrency issues are now resolved. Still, a second (and third and even a forth) look by someone else would be appreciated.

I'll commit it tomorrow if no one will object, and port to 4x.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292977#comment-13292977 ]

Simon Willnauer commented on LUCENE-4061:
-----------------------------------------

patch looks good. I wonder if you can't create the ReaderManager in advance and make it final. I mean if you do add categories which seems to be the purpose of that writer you need it anyway and the costs should be considerably low. That would remove the need for locking on it entirely.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293063#comment-13293063 ]

Shai Erera commented on LUCENE-4061:
------------------------------------

Believe me, I wanted to avoid it too, but ReaderManager is allocated like that for few reasons:

* It's lazy, as the comment in the code says -- it's a waste to open an IR if your DirTaxoWriter session is going to be short living.
** Personally, I think this is a minor issue, and if it were only that, I'd make it final.

* The TaxoWriterCache can be 'complete' which means all the categories currently known to DirTW are cached. In that case, it is a waste to keep the reader open and we close it.
** This is true for Cl2oCache, since it keeps all categories in memory.
** But LruCache is not like that, since it potentially evicts entries from the cache. So it can be 'complete' until it evicts the first entry, in which case it will never be complete, and we'll need to keep the reader open.

Currently, when we don't need ReaderManager, we close it. We also don't open it until few cache misses occur. To change it would mean to sacrifice efficiency by always keeping a Reader open, even if it's not needed. It wastes RAM, file handles and what not.

Not sure it's worth it. What do you think?
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293427#comment-13293427 ]

Simon Willnauer commented on LUCENE-4061:
-----------------------------------------

bq. Not sure it's worth it. What do you think?
given that I think its ok to have the locking and init it lazy. +1 for the patch. go for it!
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293463#comment-13293463 ]

Shai Erera commented on LUCENE-4061:
------------------------------------

Thanks Simon for reviewing the patch. I'll commit this then and port to 4x.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-4061:
-------------------------------

    Fix Version/s: 5.0
   

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Resolved] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-4061.
--------------------------------

    Resolution: Fixed

Committed rev 1349214 (trunk) and 1349223 (4x).
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Reopened] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera reopened LUCENE-4061:
--------------------------------


Jenkins failed with this stack trace:

Stack Trace:
java.lang.AssertionError: mismatch number of categories
expected:<21684> but was:<22301>
       at __randomizedtesting.SeedInfo.seed([FA7ED469F531418D:5C8E16F49462A144]:0)
       at org.junit.Assert.fail(Assert.java:93)
       at org.junit.Assert.failNotEquals(Assert.java:647)
       at org.junit.Assert.assertEquals(Assert.java:128)
       at org.junit.Assert.assertEquals(Assert.java:472)
       at org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyWriter.testConcurrency(TestDirectoryTaxonomyWriter.java:263)

reproduce with: ant test -Dtestcase=TestDirectoryTaxonomyWriter -Dtests.method=testConcurrency -Dtests.seed=FA7ED469F531418D -Dtests.multiplier=3
-Dtests.nightly=true -Dtests.linedocsfile=/home/hudson/lucene-data/enwiki.random.lines.txt -Dtests.locale=sk_SK -Dtests.timezone=Africa/Nairobi
-Dargs="-Dfile.encoding=ISO8859-1"

I still wasn't able to reproduce it on my machine. Digging into this.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396706#comment-13396706 ]

Shai Erera commented on LUCENE-4061:
------------------------------------

Some progress ... still wasn't able to reproduce locally, but I did realize that the test was run with -Dtests.nightly, and with the seed it uses NoOpCache (which never caches anything). Progress is that I thought it's related to LruTaxoWriterCache all along ! :).

Still digging.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396721#comment-13396721 ]

Shai Erera commented on LUCENE-4061:
------------------------------------

Found it. There was a problem with instructions order -- upon adding a new category, we mark that the reader should be refreshed *after* we add the category to the cache. When NoOpCache is used, adding to the cache fails, and we fail to refresh the reader because the flag is set to true only afterwards. I added a simple deterministic test that verifies it. Will commit the fix shortly.

BTW, NoOpCache in this case simulates the case where e.g. LruTaxoWriterCache just evicts entries from it and we fail to refresh the reader, so this is a true bug and not related to the use of NoOpCache.

Hopefully this will be the end of it :).
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] [Resolved] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-4061.
--------------------------------

    Resolution: Fixed

Committed revisions 1351672 and 1351675.
               

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
>
> DirTaxoWriter synchronizes in too many places. For instance addCategory() is fully synchronized, while only a small part of it needs to be.
> Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]