Some questions about index hard commit and intellij dev setup


Huaxiang Sun
Hi Developers,

    I am new to the Solr/Lucene project and have some questions about index hard commits. Excuse me if these have been asked before.

    1. When a hard commit happens, will it drain the entries in the index queue?

    2. How exactly are index files written? I.e., are they written to a tmp dir and moved to the index dir when complete, or are they written to the index dir directly? In the latter case, anyone reading the index dir could read incomplete index files.

    3. A similar question about index merges. Does the merge process create the merged file in a tmp dir and move it to the index dir after the merge completes? When are the files that were merged deleted? Are they moved to some archive dir and cleaned up later, or deleted right after the merge?

    The final question is about the IntelliJ setup for the lucene/solr project. I followed the steps in the doc, and code browsing/build does not seem to work well for me. I just want to check that these are the steps I need to follow.

   Thanks

    Huaxiang Sun

Re: Some questions about index hard commit and intellij dev setup

Huaxiang Sun

Thanks Erick for the quick response, I really appreciate it.

Sorry, for some reason I did not get the response by email; I just searched the web and found your response. :)


Huaxiang



1> No. There's really no "index queue". Current batches being processed are finished, then the commit happens.

HX: Got it.


2,3> The key is the "segments" file in your index, there should only be one of those. At time T, the segments file "points" to, say, segments 1, 2, 3. Now segments 4, 5 and 6 are written (but no commit). If you killed Solr with, say, a kill -9 or Solr crashed, you'd only see segments 1, 2, 3.

Contrast that with a commit. The last act after closing all current segments is to rewrite the segments_n file to include segments 1, 2, 3, 4, 5, 6.

Or say you had segments 1, 2, 3 as above and segments 2 and 3 were merged into segment 4. After the merge was completed, the segments file would point to segments 1 and 4, and after it was safely written segments 2 and 3 would be deleted.
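
The role of the segments file described above can be sketched with a toy model. This is not Lucene code: the class `CommitPointSketch`, its methods, and the segment names `_1` to `_6` are hypothetical, and the map simply stands in for "segments_N file name, and the segments it points to". The one Lucene-specific detail assumed here is that the generation suffix is written in base 36.

```java
import java.util.List;
import java.util.Map;

// Toy model (not Lucene's implementation): only segments referenced by the
// newest segments_N file are visible, whatever else has been written so far.
class CommitPointSketch {
    static final String PREFIX = "segments_";

    // Assumption: the suffix after "segments_" is the generation in base 36.
    static long generationOf(String fileName) {
        return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
    }

    // Keys are segments_N file names; values are the segments each one points to.
    static List<String> visibleSegments(Map<String, List<String>> segmentsFiles) {
        String newest = null;
        for (String name : segmentsFiles.keySet()) {
            if (newest == null || generationOf(name) > generationOf(newest)) {
                newest = name;
            }
        }
        return newest == null ? List.of() : segmentsFiles.get(newest);
    }

    public static void main(String[] args) {
        // Segments _4, _5, _6 may already be written, but no commit has happened:
        // the only segments file still points to _1, _2, _3.
        Map<String, List<String>> beforeCommit =
                Map.of("segments_8", List.of("_1", "_2", "_3"));
        System.out.println(visibleSegments(beforeCommit));

        // After the commit a newer segments file exists and lists all six.
        Map<String, List<String>> afterCommit = Map.of(
                "segments_8", List.of("_1", "_2", "_3"),
                "segments_9", List.of("_1", "_2", "_3", "_4", "_5", "_6"));
        System.out.println(visibleSegments(afterCommit));
    }
}
```

In this model a crash before the commit leaves `segments_8` as the newest file, so only the first three segments are seen, which matches the kill -9 case above.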


HX: So the magic is the segments file. Well explained, thanks.

Final question: How is it "not working"? Once you execute the "ant idea" command successfully, you should just be able to open the project from IntelliJ. Occasionally I have to run an "ant clean-idea idea" to start over, but I don't do this very often.

In general, when I debug Solr I don't try to run Solr from within IntelliJ. I build it externally (execute 'ant server' in the Solr directory then start it up). Then I debug Solr remotely. If I'm trying to debug something specific, it's often easiest to do from a unit test. I don't try to build the entire project from within IntelliJ although I think others do. Mostly I just don't want to have one extra variable in the equation...

HX: Thanks for sharing. I tried to load lucene-solr in IntelliJ for source code reading, so that I can use IntelliJ to find all usages of a method and all implementations of a class.

Due to unresolved variables/methods, this is hard. Let me try it again.



On Thu, Feb 7, 2019 at 12:34 PM Huaxiang Sun <[hidden email]> wrote:

Hi Devs,

I am new to the Solr/Lucene project and have some questions about index
hard commits. Excuse me if these have been asked before.

1. When a hard commit happens, will it drain the entries in the index queue?

2. How exactly are index files written? I.e., are they written to a tmp dir
and moved to the index dir when complete, or are they written to the index
dir directly? In the latter case, anyone reading the index dir could read
incomplete index files.

3. A similar question about index merges. Does the merge process create the
merged file in a tmp dir and move it to the index dir after the merge
completes? When are the files that were merged deleted? Are they moved to
some archive dir and cleaned up later, or deleted right after the merge?

The final question is about the IntelliJ setup for the lucene/solr project. I
followed the steps in the doc, and code browsing/build does not seem to work
well for me. I just want to check that these are the steps I need to follow.

Thanks


Fwd: Some questions about index hard commit and intellij dev setup

Huaxiang Sun

I forgot to reply all, so I am forwarding the messages again.

---------- Forwarded message ---------
From: Huaxiang Sun <[hidden email]>
Date: Mon, Feb 11, 2019 at 10:58 PM
Subject: Re: Some questions about index hard commit and intellij dev setup
To: Erick Erickson <[hidden email]>


Thanks Erick. I will try as you suggested. For the segments part, I found that when a new commit or merge happens, the segments file is created with a new generation number.
Say the old segments file is segments_8; the new one will be segments_9. I just want to get a quick confirmation from you.

Thanks again!
Huaxiang


On Tue, Feb 12, 2019 at 8:09 AM Erick Erickson <[hidden email]> wrote:
Yep, the segments file gets a new version number. Think of it this
way: Lucene takes great pains to keep the index from being messed up.
If the segments file were overwritten, then there'd be a chance (tiny,
but there) that the file write would fail. So by writing a new
segments_n file, Lucene can ensure that the operation completed
successfully. Only after

1> all segments have been closed

2> a new segments_n file has been written.

can Lucene be sure that all the write operations have been successful,
do any integrity checks and the like and remove any old files. At any
point prior to <2>, if anything goes wrong at least the index before
the commit will still be there and accessible and consistent.
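
The ordering described above (write a new segments_n file first, remove old files only afterwards) can be sketched with plain java.nio.file calls. This is an illustration under stated assumptions, not Lucene's actual code: the class `NewGenerationCommitSketch`, the method `commit`, and the one-segment-name-per-line file contents are all hypothetical, and details such as forcing data to disk and integrity checks are left out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Sketch only: never overwrite the current segments file. Write the next
// generation as a new file, and delete the previous one only afterwards.
class NewGenerationCommitSketch {

    static Path segmentsFile(Path indexDir, long generation) {
        // Assumption: the generation suffix is written in base 36.
        return indexDir.resolve("segments_" + Long.toString(generation, Character.MAX_RADIX));
    }

    static Path commit(Path indexDir, long currentGen, List<String> liveSegments)
            throws IOException {
        Path next = segmentsFile(indexDir, currentGen + 1);
        byte[] body = String.join("\n", liveSegments).getBytes(StandardCharsets.UTF_8);
        // CREATE_NEW fails if the file already exists, so an existing commit
        // is never overwritten. If this write throws, the old file is untouched.
        Files.write(next, body, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        // Reached only after the new file was written without an exception.
        Files.deleteIfExists(segmentsFile(indexDir, currentGen));
        return next;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index-sketch");
        Files.write(segmentsFile(dir, 8), "_1\n_2\n_3".getBytes(StandardCharsets.UTF_8));
        // Segments _2 and _3 were merged into _4, so the new commit lists _1 and _4.
        Path current = commit(dir, 8, List.of("_1", "_4"));
        System.out.println(current.getFileName());
    }
}
```

If the write of the new file fails, the sketch never reaches the delete, so the previous generation stays in place, which is the property described in the message.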

Oh, I didn't notice that you'd sent the last couple of e-mails
directly to me. Please only reply to the users' list in future; that
way everyone else gets to see the answers. No big deal.

Best,
Erick

On Mon, Feb 11, 2019 at 11:11 PM Huaxiang Sun <[hidden email]> wrote:
>
> "ant clean-idea idea" did the trick, it worked for me, really appreciate your help. I checked the wiki page for idea again, ant clean-idea is mentioned there, not sure if I did the same cleanup before.