SortedDocValues.lookupOrd and BytesRef reuse

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

SortedDocValues.lookupOrd and BytesRef reuse

Joel Bernstein
In the SortedDocValues.lookupOrd documentation it says that a deep copy is needed for the returned BytesRef. I wanted to verify that this was actually true. I'm
trying to see a way that this BytesRef could be safely reused by the API but I don't see one. Is there actually an implementation of lookupOrd that somehow reuses the
same BytesRef between invocations. The java doc is copied below:
Thanks!
/** Retrieves the value for the specified ordinal. The returned
* {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
* so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
* to keep it around.
* @param ord ordinal to lookup (must be >= 0 and < {@link #getValueCount()})
* @see #ordValue()
*/
public abstract BytesRef lookupOrd(int ord) throws IOException;


Reply | Threaded
Open this post in threaded view
|

Re: SortedDocValues.lookupOrd and BytesRef reuse

Michael McCandless-2
Hi Joel,

You should trust the javadocs.

Looking at our default Codec on master (Lucene84Codec), and at its default doc values implementation (Lucene80DocValuesProducer), it is clearly reusing the private "BytesRef term" instance.

If your code is fully consuming this BytesRef before calling any other methods on the same SortedDocValues instance, you can safely reuse it for that duration.

But if you want to call methods on that same SortedDocValues and continue using the previous BytesRef, you'll need to make a copy.

Maybe improve the javadocs?

On Thu, May 14, 2020 at 4:12 PM Joel Bernstein <[hidden email]> wrote:
In the SortedDocValues.lookupOrd documentation it says that a deep copy is needed for the returned BytesRef. I wanted to verify that this was actually true. I'm
trying to see a way that this BytesRef could be safely reused by the API but I don't see one. Is there actually an implementation of lookupOrd that somehow reuses the
same BytesRef between invocations. The java doc is copied below:
Thanks!
/** Retrieves the value for the specified ordinal. The returned
* {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
* so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
* to keep it around.
* @param ord ordinal to lookup (must be &gt;= 0 and &lt; {@link #getValueCount()})
* @see #ordValue()
*/
public abstract BytesRef lookupOrd(int ord) throws IOException;


Reply | Threaded
Open this post in threaded view
|

Re: SortedDocValues.lookupOrd and BytesRef reuse

Joel Bernstein
Ok thanks, this makes sense. It's safe to use for the same SortedDocValues instance until the method is called again. I think changing the javadoc to the following will help clear up the confusion:

/** Retrieves the value for the specified ordinal. The returned
* {@link BytesRef} may be re-used across calls on the same instance to the {@link #lookupOrd(int)}
* so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
* to keep it around.
* @param ord ordinal to lookup (must be &gt;= 0 and &lt; {@link #getValueCount()})
* @see #ordValue()
*/
I can make this change if others agree.




On Thu, May 14, 2020 at 4:37 PM Michael McCandless <[hidden email]> wrote:
Hi Joel,

You should trust the javadocs.

Looking at our default Codec on master (Lucene84Codec), and at its default doc values implementation (Lucene80DocValuesProducer), it is clearly reusing the private "BytesRef term" instance.

If your code is fully consuming this BytesRef before calling any other methods on the same SortedDocValues instance, you can safely reuse it for that duration.

But if you want to call methods on that same SortedDocValues and continue using the previous BytesRef, you'll need to make a copy.

Maybe improve the javadocs?

On Thu, May 14, 2020 at 4:12 PM Joel Bernstein <[hidden email]> wrote:
In the SortedDocValues.lookupOrd documentation it says that a deep copy is needed for the returned BytesRef. I wanted to verify that this was actually true. I'm
trying to see a way that this BytesRef could be safely reused by the API but I don't see one. Is there actually an implementation of lookupOrd that somehow reuses the
same BytesRef between invocations. The java doc is copied below:
Thanks!
/** Retrieves the value for the specified ordinal. The returned
* {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
* so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
* to keep it around.
* @param ord ordinal to lookup (must be &gt;= 0 and &lt; {@link #getValueCount()})
* @see #ordValue()
*/
public abstract BytesRef lookupOrd(int ord) throws IOException;


Reply | Threaded
Open this post in threaded view
|

Re: SortedDocValues.lookupOrd and BytesRef reuse

Michael McCandless-2

Hi Joel,

Looks great!  +1 to push, thanks.


On Thu, May 14, 2020 at 4:56 PM Joel Bernstein <[hidden email]> wrote:
Ok thanks, this makes sense. It's safe to use for the same SortedDocValues instance until the method is called again. I think changing the javadoc to the following will help clear up the confusion:

/** Retrieves the value for the specified ordinal. The returned
* {@link BytesRef} may be re-used across calls on the same instance to the {@link #lookupOrd(int)}
* so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
* to keep it around.
* @param ord ordinal to lookup (must be &gt;= 0 and &lt; {@link #getValueCount()})
* @see #ordValue()
*/
I can make this change if others agree.




On Thu, May 14, 2020 at 4:37 PM Michael McCandless <[hidden email]> wrote:
Hi Joel,

You should trust the javadocs.

Looking at our default Codec on master (Lucene84Codec), and at its default doc values implementation (Lucene80DocValuesProducer), it is clearly reusing the private "BytesRef term" instance.

If your code is fully consuming this BytesRef before calling any other methods on the same SortedDocValues instance, you can safely reuse it for that duration.

But if you want to call methods on that same SortedDocValues and continue using the previous BytesRef, you'll need to make a copy.

Maybe improve the javadocs?

On Thu, May 14, 2020 at 4:12 PM Joel Bernstein <[hidden email]> wrote:
In the SortedDocValues.lookupOrd documentation it says that a deep copy is needed for the returned BytesRef. I wanted to verify that this was actually true. I'm
trying to see a way that this BytesRef could be safely reused by the API but I don't see one. Is there actually an implementation of lookupOrd that somehow reuses the
same BytesRef between invocations. The java doc is copied below:
Thanks!
/** Retrieves the value for the specified ordinal. The returned
* {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
* so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
* to keep it around.
* @param ord ordinal to lookup (must be &gt;= 0 and &lt; {@link #getValueCount()})
* @see #ordValue()
*/
public abstract BytesRef lookupOrd(int ord) throws IOException;