Are flushed (but not committed yet) segments mutable?

Nawab Zada Asad Iqbal
Hi,

When a segment is flushed to disk because it is exceeding available memory,
is it still updated when new documents are added? I also read somewhere that
a segment is not committed even if it is flushed. How is a
flushed-but-not-committed segment different from a committed segment?

For example, my hard-commit is scheduled for every 30 seconds, but many
segments are flushed during this interval. Are they flushed as in-memory
data structures (which will keep them optimal for updates) or are they
immutable?

I also see some segment merges before the hard-commit executes, which makes
me think that a flush converts the in-memory data structures into a Lucene
segment.

Thanks
Nawab

Re: Are flushed (but not committed yet) segments mutable?

Erick Erickson
"I also see some segment merges before the hard-commit executes, which
make me think that flush converts the in-memory data-structures into
Lucene"

That's my understanding. Essentially, each flush creates a new segment,
which is merged with other segments at some later point.

"How is a flushed-but-not-committed segment different from a committed segment?"

In a nutshell, it hasn't been added to the "segments_n" file, which
contains a list of all of the segments as of the last commit point.
Segments added for whatever reason since the last hard commit aren't
added to that file. So say Solr is killed before committing. When it
restarts it sees the segments_n file that contains the old "picture"
of the index. If tlogs are around, then Solr replays the documents
since that point.

Best,
Erick
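
To make the distinction above concrete, here is a toy Python model of flush versus commit. It is only a sketch under stated assumptions: the class ToyIndex and all of its attributes and methods are hypothetical names invented for illustration, not Lucene or Solr APIs, and the transaction log is reduced to a plain list. It treats each flushed segment as an immutable tuple, records the commit point as the list of segments that "segments_n" would name, and on restart keeps only those segments before replaying the logged documents.

```python
class ToyIndex:
    """Toy model of flush vs. commit. Hypothetical; not the Lucene/Solr API."""

    def __init__(self):
        self.buffer = []            # documents held in memory, not yet flushed
        self.segments_on_disk = []  # flushed segments (each an immutable tuple)
        self.commit_point = []      # segments listed in the last "segments_n"
        self.tlog = []              # documents received since the last commit

    def add(self, doc):
        # New documents go to the in-memory buffer, never into an old segment.
        self.buffer.append(doc)
        self.tlog.append(doc)

    def flush(self):
        # A flush writes the buffer out as a new segment; it does not commit.
        if self.buffer:
            self.segments_on_disk.append(tuple(self.buffer))
            self.buffer = []

    def commit(self):
        # A hard commit flushes, then records every segment in the commit point.
        self.flush()
        self.commit_point = list(self.segments_on_disk)
        self.tlog = []  # simplification: the log restarts at each commit

    def crash_and_restart(self):
        # Only segments named by the last commit point survive a restart;
        # documents added since then are replayed from the log.
        pending = list(self.tlog)
        self.buffer = []
        self.segments_on_disk = list(self.commit_point)
        self.tlog = []
        for doc in pending:
            self.add(doc)


index = ToyIndex()
index.add("doc1")
index.commit()             # doc1 is now part of the commit point
index.add("doc2")
index.flush()              # doc2 is on disk in a segment, but not committed
index.add("doc3")          # doc3 is still only in memory
index.crash_and_restart()  # flushed-but-uncommitted segment is dropped
```

In this model, after the restart the commit point still lists only the segment holding doc1, and doc2 and doc3 are back in the in-memory buffer because they were replayed from the log. A later document never changes an existing segment tuple; it can only end up in a new one.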

(In reply to Nawab Zada Asad Iqbal's message of Fri, Feb 9, 2018 at 8:07 PM.)

Re: Are flushed (but not committed yet) segments mutable?

Nawab Zada Asad Iqbal
Thanks Erick!

(In reply to Erick Erickson's message of Sat, Feb 10, 2018 at 11:37 PM.)