Would docvalues be loaded into jvm?

Would docvalues be loaded into jvm?

wangqinghuan
Hi,
I know that when doc values are enabled for a field, its data is written to disk in a column-oriented layout.
But I don't understand why sorting with doc values doesn't increase the load on the JVM. Whatever the sorting algorithm, the data has to be loaded into the JVM to be sorted. Sorting the whole index should put a high load on the JVM, yet in practice I see no change. How does Lucene sort with doc values? Can the sort algorithm work directly on the memory-mapped (mmap) file?

RE: Would docvalues be loaded into jvm?

Uwe Schindler
Hi

It works directly off the mmapped files. The data is not fully loaded into the heap; only some small control structures are allocated there. During sorting, the TopDocsCollector uses the memory-mapped structures to decompress and look up the sort values.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: [hidden email]
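
To illustrate the point in the reply above, here is a minimal sketch using only plain java.nio. It is not Lucene's DocValues format or its collector code, and the names MmapTopK, writeColumn and topK are made up for this example. It assumes one uncompressed 8-byte value per document. The column stays in the memory-mapped file, and the only heap state during the sort is a priority queue holding at most k document IDs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.LongBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.PriorityQueue;

// Illustration only: plain java.nio, not Lucene's file format or classes.
class MmapTopK {

    // Writes one 8-byte value per document, a stand-in for a numeric column on disk.
    static Path writeColumn(long[] values) {
        try {
            Path file = Files.createTempFile("column", ".bin");
            ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
            buf.asLongBuffer().put(values);
            Files.write(file, buf.array());
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the doc IDs of the k smallest values, in ascending value order.
    static int[] topK(Path file, int k) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping lives outside the Java heap; pages are loaded by the OS on access.
            LongBuffer column =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()).asLongBuffer();
            // Max-heap on value, holding at most k doc IDs: the only per-query heap state.
            PriorityQueue<Integer> heap = new PriorityQueue<>(
                (a, b) -> Long.compare(column.get(b), column.get(a)));
            for (int doc = 0; doc < column.limit(); doc++) {
                heap.add(doc);
                if (heap.size() > k) {
                    heap.poll(); // drop the doc with the current largest value
                }
            }
            int[] result = new int[heap.size()];
            for (int i = result.length - 1; i >= 0; i--) {
                result[i] = heap.poll();
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeColumn(new long[] {42, 7, 19, 3, 88, 15});
        System.out.println(Arrays.toString(topK(file, 3))); // prints [3, 1, 5]
    }
}
```

The queue size depends on k, not on the number of documents, which is consistent with the observation that sorting a whole index barely changes heap usage.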


Re: RE: Would docvalues be loaded into jvm?

wangqinghuan
hi
Is there any design document on this aspect (sorting algorithm off mmap)?


RE: RE: Would docvalues be loaded into jvm?

Uwe Schindler
Hi,

There is no design document about that. Lucene has used mmap for all index files for a long time. DocValues is just another implementation. Basically, it uses IndexInput's methods to access the underlying data, which is memory mapped if you are on a 64-bit platform. For DocValues there are also positional reads available. There is not much code that is specific to DocValues; it is just a file format that supports column-based access with positional reads. The mmap implementation is separate from this and sits a bit lower, in the I/O layer of Lucene. Sorting is just one use case of DocValues, but it does not sort directly on the mmapped files; there are several abstractions in between (which are, of course, removed by the Hotspot optimizer).

Some information (a bit older, but still valid) is here: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Uwe
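
The layering described in the reply above could be pictured roughly like this. This is a sketch with plain java.nio and hypothetical names (PositionalReads, RandomAccessBytes, LongColumn, compareDocs); none of them are Lucene classes, and the 8-byte header and fixed-width values are assumptions made for the example. The bottom layer does positional reads on a memory-mapped file, the middle layer turns a document ID into an offset, and the comparison at the top never sees the mmap at all.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical names throughout; this mirrors the layering idea, not Lucene's classes.
class PositionalReads {

    // Lowest layer: random-access bytes. Callers do not know how they are backed.
    interface RandomAccessBytes {
        long readLong(long position);
    }

    // Middle layer: a fixed-width column addressed by document ID.
    interface LongColumn {
        long valueOf(int docId);
    }

    static Path write(long[] longs) {
        try {
            Path file = Files.createTempFile("positional", ".bin");
            ByteBuffer buf = ByteBuffer.allocate(longs.length * Long.BYTES);
            buf.asLongBuffer().put(longs);
            Files.write(file, buf.array());
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static RandomAccessBytes mmap(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Absolute get: reads at an offset without moving any shared cursor.
            return position -> mapped.getLong(Math.toIntExact(position));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static LongColumn column(RandomAccessBytes bytes, long startOffset) {
        return docId -> bytes.readLong(startOffset + (long) docId * Long.BYTES);
    }

    // Top layer: the comparison a sort needs, written purely against LongColumn.
    static int compareDocs(LongColumn column, int docA, int docB) {
        return Long.compare(column.valueOf(docA), column.valueOf(docB));
    }

    public static void main(String[] args) {
        // The first long is a made-up header; the column starts right after it.
        Path file = write(new long[] {0xC0FFEEL, 30, 10, 20});
        LongColumn col = column(mmap(file), Long.BYTES);
        System.out.println(col.valueOf(1));         // prints 10
        System.out.println(compareDocs(col, 0, 2)); // prints 1
    }
}
```

Small interface calls like these are the kind of indirection a JIT compiler can typically inline, so the extra layers need not cost much at run time.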


Re: RE: RE: Would docvalues be loaded into jvm?

wangqinghuan
Does "Hotspot" refer to the Java virtual machine?


RE: RE: RE: Would docvalues be loaded into jvm?

Uwe Schindler
Yes.
