[jira] Created: (LUCENE-584) Decouple Filter from BitSet

[jira] Created: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
Decouple Filter from BitSet
---------------------------

         Key: LUCENE-584
         URL: http://issues.apache.org/jira/browse/LUCENE-584
     Project: Lucene - Java
        Type: Improvement

  Components: Search  
    Versions: 2.0.1    
    Reporter: Peter Schäfer
    Priority: Minor


{code}
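package org.apache.lucene.search;

import java.io.IOException;
import org.apache.lucene.index.IndexReader;

public abstract class Filter implements java.io.Serializable
{
  // Returns the set of documents that pass this filter for the given reader.
  public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
}

// Minimal read-only view of a bit set; implementations may be dense or sparse.
public interface AbstractBitSet
{
  public boolean get(int index);
}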

{code}

It would be useful if the method =Filter.bits()= returned an abstract interface, instead of =java.util.BitSet=.

Use case: there is a very large index, and, depending on the user's privileges, only a small portion of the index is actually visible.
Sparsely populated =java.util.BitSet=s are not efficient and waste lots of memory. It would be desirable to have an alternative BitSet implementation with smaller memory footprint.

Though it _is_ possible to derive classes from =java.util.BitSet=, it was obviously not designed for that purpose.
That's why I propose to use an interface instead. The default implementation could still delegate to =java.util.BitSet=.
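
A minimal sketch of how the proposed interface might be implemented, offered only as an illustration. =AbstractBitSet= is the interface from the proposal above; the class names =BitSetAdapter= and =SparseBits= are hypothetical and do not come from the issue or from Lucene.

```java
import java.util.Arrays;
import java.util.BitSet;

// The interface as proposed in the issue: a read-only membership test.
interface AbstractBitSet {
  boolean get(int index);
}

// Default implementation (hypothetical name): delegates to java.util.BitSet.
final class BitSetAdapter implements AbstractBitSet {
  private final BitSet bits;

  BitSetAdapter(BitSet bits) {
    this.bits = bits;
  }

  public boolean get(int index) {
    return bits.get(index);
  }
}

// Sparse implementation (hypothetical name): stores only the visible
// document numbers, sorted, and answers get() with a binary search.
final class SparseBits implements AbstractBitSet {
  private final int[] sortedDocs;

  SparseBits(int[] docs) {
    sortedDocs = docs.clone();
    Arrays.sort(sortedDocs);
  }

  public boolean get(int index) {
    return Arrays.binarySearch(sortedDocs, index) >= 0;
  }
}
```

As a rough illustration of the trade-off: in an index of 10 million documents, a =java.util.BitSet= whose highest set bit is near the end needs roughly 1.25 MB, whereas 1,000 visible documents kept in a sorted =int[]= need about 4 KB, at the cost of a logarithmic lookup instead of a constant-time one.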



--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12414046 ]

Eks Dev commented on LUCENE-584:
--------------------------------

Peter,

there are some advanced things you are probably interested in.

See:
"some utilities for a compact sparse filter" LUCENE-328

Also interesting:
[#SOLR-15] OpenBitSet - ASF JIRA

The complete Solr solution for Filters is one cool thing! The bridge to Lucene is a bit awkward because of the BitSet in Filter, but this is due to be resolved...


RE: [jira] Created: (LUCENE-584) Decouple Filter from BitSet

Robert Engels
In reply to this post by Radim Rehurek (Jira)
This design "seems" wrong, since it does not support 'next set bit', which
will kill performance in many cases.

Why not use:

interface Filter {
    boolean include(int docnum);
    int next(int docnum);
}

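It is easy to create an implementation FilterBits as

import java.util.BitSet;

class FilterBits implements Filter {
    private final BitSet bits;
    FilterBits(BitSet bits) {
        this.bits = bits;
    }
    // true if the document passes the filter
    public boolean include(int docnum) {
        return bits.get(docnum);
    }
    // first included document at or after docnum, or -1 if there is none
    public int next(int docnum) {
        return bits.nextSetBit(docnum);
    }
}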

But other more computational Filter implementations can easily be created.
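
As an illustration of such a computed implementation, here is a minimal sketch of a filter that holds no bit set at all. The class names DocRangeFilter and FilterWalk are hypothetical, and the convention that next() returns -1 when nothing is left is an assumption borrowed from BitSet.nextSetBit(); the message above does not specify it.

```java
// The interface proposed above, repeated so the sketch compiles on its own.
interface Filter {
  boolean include(int docnum);

  // Assumed contract: first included docnum >= the argument, or -1 if none.
  int next(int docnum);
}

// Hypothetical computed filter: includes document numbers in [from, to).
final class DocRangeFilter implements Filter {
  private final int from;
  private final int to;

  DocRangeFilter(int from, int to) {
    this.from = from;
    this.to = to;
  }

  public boolean include(int docnum) {
    return docnum >= from && docnum < to;
  }

  public int next(int docnum) {
    int candidate = Math.max(docnum, from);
    return candidate < to ? candidate : -1;
  }
}

// Hypothetical helper showing the iteration idiom that next() enables.
final class FilterWalk {
  static int count(Filter f) {
    int n = 0;
    for (int d = f.next(0); d >= 0; d = f.next(d + 1)) {
      n++;
    }
    return n;
  }
}
```

The loop in FilterWalk visits only included documents, which is the point of the 'next set bit' remark: a caller that can only ask include(docnum) has to probe every document number in turn.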


-----Original Message-----
From: Peter Schäfer (JIRA) [mailto:[hidden email]]
Sent: Wednesday, May 31, 2006 6:48 AM
To: [hidden email]
Subject: [jira] Created: (LUCENE-584) Decouple Filter from BitSet


[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12414224 ]

Peter Schäfer commented on LUCENE-584:
--------------------------------------

thanks, this looks interesting.

Regards,
Peter


[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12417896 ]

paul.elschot commented on LUCENE-584:
-------------------------------------

As the title of this issue is as accurate as it gets, I'm attaching a series of patches and additions here that make Scorer a subclass of Matcher, while Matcher takes over the current role of the BitSet in Filter.
All patches are against trunk revision 417299.
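
The relationship described here can be pictured with a minimal sketch. It is not the attached code: the method set (next(), doc(), skipTo(int)) is assumed from the Scorer API of that time, I/O exceptions are omitted, and ArrayScorer is a hypothetical class added only so the example can be exercised.

```java
// Assumed shape of a Matcher: iterates over matching document numbers in
// increasing order without computing a score.
abstract class Matcher {
  abstract boolean next();   // advance to the next match, false when exhausted
  abstract int doc();        // current document number, valid after next()

  // Advance to the first match >= target; always moves at least one step.
  boolean skipTo(int target) {
    do {
      if (!next()) {
        return false;
      }
    } while (doc() < target);
    return true;
  }
}

// A Scorer is then a Matcher that can also score the current document.
abstract class Scorer extends Matcher {
  abstract float score();
}

// Hypothetical concrete scorer over a fixed, sorted list of documents.
final class ArrayScorer extends Scorer {
  private final int[] docs;
  private int pos = -1;

  ArrayScorer(int... docs) {
    this.docs = docs;
  }

  boolean next() {
    if (pos < docs.length) {
      pos++;
    }
    return pos < docs.length;
  }

  int doc() {
    return docs[pos];
  }

  float score() {
    return 1.0f;
  }
}
```

With this split, code that only needs to know which documents match (such as a filter) can work against Matcher, and anything that is a Scorer can be used in that role as well.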


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: HitCollector-20060626.patch

Patch to the javadocs of HitCollector.java to use 'matching' instead of 'non-zero score'.
This is actually independent of the Matcher/Scorer change.


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: Searcher-20060626.patch

Patch to the javadocs of Searcher.java to use 'matching' instead of 'non-zero score',
and to describe the Filter effect more accurately.
This is independent of the Matcher/Scorer change.


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: MatchCollector.java

MatchCollector.java, with a collect(int) method, for org.apache.lucene.search.
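
A minimal sketch of what a collector with a collect(int) method could look like, by analogy with HitCollector.collect(int doc, float score). This is an assumption about the shape of the attachment, not a copy of it; BitSetMatchCollector is a hypothetical example class.

```java
import java.util.BitSet;

// Assumed shape: like HitCollector, but without the score argument,
// since a Matcher has no score to report.
abstract class MatchCollector {
  abstract void collect(int doc);
}

// Hypothetical collector that records every matching document in a BitSet.
final class BitSetMatchCollector extends MatchCollector {
  final BitSet matched = new BitSet();

  void collect(int doc) {
    matched.set(doc);
  }
}
```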



[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: Matcher.java

Matcher.java, including a match(MatchCollector) method, for org.apache.lucene.search.
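
A minimal sketch of how a match(MatchCollector) method might drive a collector, assuming Matcher exposes next() and doc() as Scorer did at the time. I/O exceptions are omitted, and RangeMatcher and ListCollector are hypothetical classes used only to exercise the loop.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed collector shape: receives document numbers only.
abstract class MatchCollector {
  abstract void collect(int doc);
}

abstract class Matcher {
  abstract boolean next();
  abstract int doc();

  // Feed every remaining matching document to the collector.
  void match(MatchCollector collector) {
    while (next()) {
      collector.collect(doc());
    }
  }
}

// Hypothetical matcher over the document numbers in [start, end).
final class RangeMatcher extends Matcher {
  private final int end;
  private int doc;

  RangeMatcher(int start, int end) {
    this.doc = start - 1;
    this.end = end;
  }

  boolean next() {
    if (doc < end) {
      doc++;
    }
    return doc < end;
  }

  int doc() {
    return doc;
  }
}

// Hypothetical collector that keeps the documents in arrival order.
final class ListCollector extends MatchCollector {
  final List<Integer> docs = new ArrayList<>();

  void collect(int doc) {
    docs.add(doc);
  }
}
```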


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: Scorer-20060626.patch

Patch to Scorer.java to make Scorer a subclass of Matcher.


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: Filter-20060626.patch

patch to Filter to add getMatcher() and to deprecate getBits() in favour of getMatcher().
Includes commented test code to test IndexSearcher using BitsMatcher.
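
One way such a deprecation path could work is sketched below, under stated assumptions: the BitSet-returning method is called bits() as in the issue description, I/O exceptions are omitted, and IndexReaderStub and SketchFilter are hypothetical stand-ins so the example compiles without Lucene. The attached patch may well do this differently.

```java
import java.util.BitSet;

// Hypothetical stand-in for IndexReader.
final class IndexReaderStub {
  final int maxDoc;

  IndexReaderStub(int maxDoc) {
    this.maxDoc = maxDoc;
  }
}

// Minimal Matcher for this sketch.
abstract class Matcher {
  abstract boolean next();
  abstract int doc();
}

abstract class SketchFilter {
  /** @deprecated prefer {@link #getMatcher(IndexReaderStub)}. */
  @Deprecated
  BitSet bits(IndexReaderStub reader) {
    throw new UnsupportedOperationException("override bits() or getMatcher()");
  }

  // Default bridge: filters that still implement bits() keep working,
  // while new filters can override getMatcher() and never build a BitSet.
  Matcher getMatcher(IndexReaderStub reader) {
    final BitSet bits = bits(reader);
    return new Matcher() {
      private int doc = -1;
      private boolean done = false;

      boolean next() {
        if (done) {
          return false;
        }
        doc = bits.nextSetBit(doc + 1);
        done = doc < 0;
        return !done;
      }

      int doc() {
        return doc;
      }
    };
  }
}
```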


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: IndexSearcher-20060626.patch

Patch to IndexSearcher.java to prefer getMatcher() over getBits() on Filter.
Also adds the method IndexSearcher.match(Query, MatchCollector).
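
Since a Matcher offers iteration rather than random access, a searcher that uses it has to walk the query and the filter in step. The sketch below shows one common way to do that (a leapfrog intersection using skipTo()); it is an assumption about the general technique, not a description of the attached patch. ArrayMatcher and FilteredSearch are hypothetical names, and I/O exceptions are omitted.

```java
import java.util.ArrayList;
import java.util.List;

abstract class Matcher {
  abstract boolean next();
  abstract int doc();

  // Advance to the first match >= target; always moves at least one step.
  boolean skipTo(int target) {
    do {
      if (!next()) {
        return false;
      }
    } while (doc() < target);
    return true;
  }
}

// Hypothetical matcher over a fixed, sorted list of document numbers.
final class ArrayMatcher extends Matcher {
  private final int[] docs;
  private int pos = -1;

  ArrayMatcher(int... docs) {
    this.docs = docs;
  }

  boolean next() {
    if (pos < docs.length) {
      pos++;
    }
    return pos < docs.length;
  }

  int doc() {
    return docs[pos];
  }
}

final class FilteredSearch {
  // Returns the documents matched by both the query and the filter.
  static List<Integer> intersect(Matcher query, Matcher filter) {
    List<Integer> hits = new ArrayList<>();
    if (!query.next() || !filter.next()) {
      return hits;
    }
    while (true) {
      int q = query.doc();
      int f = filter.doc();
      if (q == f) {
        hits.add(q);
        if (!query.next() || !filter.next()) {
          return hits;
        }
      } else if (q < f) {
        if (!query.skipTo(f)) {   // query is behind: jump ahead to the filter
          return hits;
        }
      } else {
        if (!filter.skipTo(q)) {  // filter is behind: jump ahead to the query
          return hits;
        }
      }
    }
  }
}
```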



[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: BitsMatcher.java

BitsMatcher.java, a Matcher constructed from a BitSet, for org.apache.lucene.util.


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: SortedVIntList.java

SortedVIntList.java for org.apache.lucene.util, superseding the one in LUCENE-328. It has a getMatcher() method.
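
To give an idea of why such a structure is compact, here is a minimal sketch of storing a sorted list of document numbers as variable-length encoded gaps, using the 7-bits-per-byte VInt layout that Lucene documents for its index files. It is an assumption about the general approach suggested by the class name, not the attached code; VIntGapList is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: a sorted int list stored as VInt-encoded gaps.
final class VIntGapList {
  private final byte[] bytes;
  private final int size;

  VIntGapList(int[] sortedDocs) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int prev = 0;
    for (int doc : sortedDocs) {
      if (doc < prev) {
        throw new IllegalArgumentException("input must be sorted and non-negative");
      }
      int gap = doc - prev;
      while ((gap & ~0x7F) != 0) {        // more than 7 bits remain
        out.write((gap & 0x7F) | 0x80);   // low 7 bits, continuation bit set
        gap >>>= 7;
      }
      out.write(gap);                     // last byte, continuation bit clear
      prev = doc;
    }
    bytes = out.toByteArray();
    size = sortedDocs.length;
  }

  // Number of bytes used by the encoded list.
  int byteSize() {
    return bytes.length;
  }

  // Decodes the list back into document numbers.
  int[] toArray() {
    int[] docs = new int[size];
    int pos = 0;
    int prev = 0;
    for (int i = 0; i < size; i++) {
      int gap = 0;
      int shift = 0;
      int b;
      do {
        b = bytes[pos++];
        gap |= (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);
      prev += gap;
      docs[i] = prev;
    }
    return docs;
  }
}
```

Because small gaps fit in a single byte, a list of closely spaced documents takes close to one byte per document, compared with four bytes per document in a plain int[].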


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: TestSortedVIntList.java

TestSortedVIntList.java, superseding the one in LUCENE-328. It tests the Matcher provided by a SortedVIntList.


> Decouple Filter from BitSet
> ---------------------------
>
>          Key: LUCENE-584
>          URL: http://issues.apache.org/jira/browse/LUCENE-584
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: 2.0.1
>     Reporter: Peter Schäfer
>     Priority: Minor
>  Attachments: BitsMatcher.java, Filter-20060626.patch, HitCollector-20060626.patch, IndexSearcher-20060626.patch, MatchCollector.java, Matcher.java, Scorer-20060626.patch, Searcher-20060626.patch, SortedVIntList.java, TestSortedVIntList.java


[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12417917 ]

paul.elschot commented on LUCENE-584:
-------------------------------------

I hope I got all the attachments right; please holler if something does not patch or compile cleanly.

Some questions/remarks:

- When IndexSearcher gets a BitSet from a Filter, it will not use skipTo() on the Scorer
of the Query being filtered.
This still allows the 1.4 BooleanScorer to be used until Filter.getBits() is removed.

- Is it OK not to add match() method(s) to Searcher/Searchable?

- BitSetIterator of SOLR-15 could implement a Matcher, and could perhaps be added to org.apache.lucene.util?

- Matcher as superclass of Scorer opens the possibility of adding a BooleanQuery.add(Filter) method.
This also needs the addition of required Matchers to ConjunctionScorer and the addition of prohibited Matchers to ReqExclScorer/DisjunctionScorer.
Doing this filtering in ConjunctionScorer/ReqExclScorer will probably reduce the number of method calls for filtering.
Once such an addition is made to BooleanQuery, the filtering methods in IndexSearcher could be deprecated in favour of BooleanQuery.add(Filter).
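
As an illustration of the skipTo() point in the remarks above, here is a minimal sketch of filtering by letting the query side and the filter side skip each other forward. The interface DocIterator, the class ArrayDocIterator and the method intersect() are hypothetical stand-ins; the Matcher in the attached patches may look different. The next()/skipTo()/doc() shape is assumed, modelled on Scorer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the Matcher or IndexSearcher code from the patches. */
final class FilteredMatchSketch {

  /** Assumed iterator shape: doc ids are visited in increasing order. */
  interface DocIterator {
    boolean next();             // move to the next doc, false when exhausted
    boolean skipTo(int target); // move forward to the first doc >= target
    int doc();                  // current doc, valid after next()/skipTo() returned true
  }

  /** Demo implementation backed by a sorted array of doc ids. */
  static final class ArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int pos = -1;

    ArrayDocIterator(int... sortedDocs) {
      this.docs = sortedDocs;
    }

    public boolean next() {
      pos++;
      return pos < docs.length;
    }

    public boolean skipTo(int target) {
      // Always advances at least once, then continues until doc >= target.
      do {
        if (!next()) {
          return false;
        }
      } while (docs[pos] < target);
      return true;
    }

    public int doc() {
      return docs[pos];
    }
  }

  /** Collects the docs present in both iterators, skipping instead of testing every doc. */
  static List<Integer> intersect(DocIterator query, DocIterator filter) {
    List<Integer> hits = new ArrayList<>();
    if (!query.next() || !filter.next()) {
      return hits;
    }
    while (true) {
      int q = query.doc();
      int f = filter.doc();
      if (q == f) {
        hits.add(q); // doc matches the query and passes the filter
        if (!query.next() || !filter.next()) {
          return hits;
        }
      } else if (q < f) {
        if (!query.skipTo(f)) { // query is behind: jump to the filter's doc
          return hits;
        }
      } else {
        if (!filter.skipTo(q)) { // filter is behind: jump to the query's doc
          return hits;
        }
      }
    }
  }
}
```

With a sparse filter, most of the work is done by skipTo() calls on the query side, which is the case the BitSet code path does not take advantage of.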

Regards,
Paul Elschot


> Decouple Filter from BitSet
> ---------------------------
>
>          Key: LUCENE-584
>          URL: http://issues.apache.org/jira/browse/LUCENE-584
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: 2.0.1
>     Reporter: Peter Schäfer
>     Priority: Minor
>  Attachments: BitsMatcher.java, Filter-20060626.patch, HitCollector-20060626.patch, IndexSearcher-20060626.patch, MatchCollector.java, Matcher.java, Scorer-20060626.patch, Searcher-20060626.patch, SortedVIntList.java, TestSortedVIntList.java


[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12418141 ]

paul.elschot commented on LUCENE-584:
-------------------------------------

I've started to improve the javadocs of almost all code posted here, so it's probably not worthwhile to commit this as it is now.
I don't expect changes to the java code in the short term.

Regards,
Paul Elschot



> Decouple Filter from BitSet
> ---------------------------
>
>          Key: LUCENE-584
>          URL: http://issues.apache.org/jira/browse/LUCENE-584
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: 2.0.1
>     Reporter: Peter Schäfer
>     Priority: Minor
>  Attachments: BitsMatcher.java, Filter-20060626.patch, HitCollector-20060626.patch, IndexSearcher-20060626.patch, MatchCollector.java, Matcher.java, Scorer-20060626.patch, Searcher-20060626.patch, SortedVIntList.java, TestSortedVIntList.java


[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
    [ http://issues.apache.org/jira/browse/LUCENE-584?page=comments#action_12418177 ]

Eks Dev commented on LUCENE-584:
--------------------------------

Any thoughts on adding OpenBitSet from Solr here?

> Decouple Filter from BitSet
> ---------------------------
>
>          Key: LUCENE-584
>          URL: http://issues.apache.org/jira/browse/LUCENE-584
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: 2.0.1
>     Reporter: Peter Schäfer
>     Priority: Minor
>  Attachments: BitsMatcher.java, Filter-20060626.patch, HitCollector-20060626.patch, IndexSearcher-20060626.patch, MatchCollector.java, Matcher.java, Scorer-20060626.patch, Searcher-20060626.patch, SortedVIntList.java, TestSortedVIntList.java


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: HitCollector-20060628.patch
                Searchable-20060628.patch
                Searcher-20060628.patch

Patches against trunk revision 417683, current.
Compared to previous patches/files, there are only javadoc updates,
and the javadocs of Searchable are also patched.



> Decouple Filter from BitSet
> ---------------------------
>
>          Key: LUCENE-584
>          URL: http://issues.apache.org/jira/browse/LUCENE-584
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: 2.0.1
>     Reporter: Peter Schäfer
>     Priority: Minor
>  Attachments: BitsMatcher.java, Filter-20060626.patch, HitCollector-20060626.patch, HitCollector-20060628.patch, IndexSearcher-20060626.patch, MatchCollector.java, Matcher.java, Scorer-20060626.patch, Searchable-20060628.patch, Searcher-20060626.patch, Searcher-20060628.patch, SortedVIntList.java, TestSortedVIntList.java


[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Radim Rehurek (Jira)
In reply to this post by Radim Rehurek (Jira)
     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

paul.elschot updated LUCENE-584:
--------------------------------

    Attachment: Scorer-20060628.patch
                Filter-20060628.patch
                IndexSearcher-20060628.patch

> Decouple Filter from BitSet
> ---------------------------
>
>          Key: LUCENE-584
>          URL: http://issues.apache.org/jira/browse/LUCENE-584
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Search
>     Versions: 2.0.1
>     Reporter: Peter Schäfer
>     Priority: Minor
>  Attachments: BitsMatcher.java, Filter-20060626.patch, Filter-20060628.patch, HitCollector-20060626.patch, HitCollector-20060628.patch, IndexSearcher-20060626.patch, IndexSearcher-20060628.patch, MatchCollector.java, Matcher.java, Scorer-20060626.patch, Scorer-20060628.patch, Searchable-20060628.patch, Searcher-20060626.patch, Searcher-20060628.patch, SortedVIntList.java, TestSortedVIntList.java
