solr optimize command

solr optimize command

weiwang19
Hi,

I use the following HTTP request to start Solr index optimization:

http://localhost:8983/solr/<core>/update?skipError=true -F stream.body='<optimize />'

The request returns status code 200 almost immediately, but when I look at the
Solr instance, the optimization has clearly not completed, because the index
still has more than one segment. Is the optimize command asynchronous? What is
the best way to verify that the optimize has truly completed?
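
One way to check from the outside whether the index is down to a single segment is to ask the core for its segment count. The sketch below assumes the Luke request handler is reachable at /admin/luke and reports index.segmentCount in its JSON output; the helper names, the base URL, and the sample payload in the usage note are illustrative only and should be checked against the Solr version in use.

```python
import json
import urllib.request


def luke_url(base_url, core):
    """URL of the Luke handler for one core (assumed path: /admin/luke)."""
    return f"{base_url.rstrip('/')}/solr/{core}/admin/luke?numTerms=0&wt=json"


def segment_count(luke_json_text):
    """Extract index.segmentCount from a Luke handler JSON response."""
    return json.loads(luke_json_text)["index"]["segmentCount"]


def is_fully_merged(luke_json_text, max_segments=1):
    """True when the index has no more than max_segments segments."""
    return segment_count(luke_json_text) <= max_segments


def fetch_segment_count(base_url, core, timeout_seconds=30):
    """Query a running Solr instance and return its current segment count."""
    with urllib.request.urlopen(luke_url(base_url, core), timeout=timeout_seconds) as resp:
        return segment_count(resp.read().decode("utf-8"))
```

For example, is_fully_merged('{"index": {"segmentCount": 3}}') is False, which matches the situation described above: the request has returned but the merge has not finished (or never ran).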


Thanks,

Wei

Re: solr optimize command

Zheng Lin Edwin Yeo
Hi,

How big is your index, and do you have enough free disk space to do
the optimization? You need disk space of at least twice the index size for
the optimization to succeed, and even more if you are still
indexing during the optimization.
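
A rough pre-flight check along these lines could look like the sketch below. It reads the "twice the disk space" rule as "the volume must have free space at least equal to the current index size", which is an assumption about what is meant here; the function names and the default factor are hypothetical, and concurrent indexing would call for a larger factor.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all files under path (e.g. a core's index directory)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def has_headroom(index_bytes, free_bytes, factor=2.0):
    """True if free space covers (factor - 1) extra copies of the index.

    factor=2.0 means the disk must be able to hold the old index plus one
    fully rewritten copy at the same time.
    """
    return free_bytes >= (factor - 1.0) * index_bytes


def can_optimize(index_dir, factor=2.0):
    """Compare the index size with the free space on the same volume."""
    free_bytes = shutil.disk_usage(index_dir).free
    return has_headroom(directory_size_bytes(index_dir), free_bytes, factor)
```

Calling can_optimize on the core's index directory before sending the optimize request gives a quick yes/no answer; it does not account for other processes writing to the same volume.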

Also, which Solr version are you using?

Regards,
Edwin


Re: solr optimize command

Walter Underwood
Why do you think you need to optimize? Most configurations don’t need that.

And no, there is no synchronous optimize request.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)



Re: solr optimize command

Christopher Schultz
In reply to this post by weiwang19

Wei,

On 11/28/18 20:22, Wei wrote:

> Hi,
>
> I use the following http request to start solr index optimization:
>
> http://localhost:8983/solr/<core>/update?skipError=true -F
> stream.body=' <optimize />'
>
>
> The request returns status code 200 shortly, but when looking at
> the solr instance I noticed that actual optimization has not
> completed yet as there are more than 1 segments. Is the optimize
> command async? What is the best approach to validate that optimize
> is truly completed?

Try this instead:

http://localhost:8983/solr/<core>/update?optimize=true&wait=true

This will wait until the operation has completed. Note that your
client (e.g. curl) may time out after a while, so you'll want to
raise that timeout to make sure the client doesn't give up before the
optimization has completed.
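
As a minimal sketch of the same request issued from a script with a generous client-side timeout: the URL and its optimize/wait parameters are taken as-is from this message, while the function names and the one-hour default timeout are assumptions for illustration.

```python
import urllib.parse
import urllib.request


def build_optimize_url(base_url, core, wait=True):
    """Build the update URL suggested in this thread for a blocking optimize."""
    params = {"optimize": "true", "wait": "true" if wait else "false"}
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/solr/{urllib.parse.quote(core)}/update?{query}"


def run_optimize(url, timeout_seconds=3600):
    """Send the request and block until Solr answers or the timeout expires.

    If this raises a timeout error, the client gave up waiting; that says
    nothing about whether the optimize itself has finished.
    """
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.status, response.read().decode("utf-8")
```

With curl, the equivalent knob would be its --max-time option.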

As others have said, perhaps you don't actually need to optimize anything.

-chris

Re: solr optimize command

Shawn Heisey-2
In reply to this post by weiwang19
On 11/28/2018 6:22 PM, Wei wrote:
> I use the following http request to start solr index optimization:
>
> http://localhost:8983/solr/<core>/update?skipError=true -F stream.body='
> <optimize />'
>
> The request returns status code 200 shortly, but when looking at the solr
> instance I noticed that actual optimization has not completed yet as there
> are more than 1 segments. Is the optimize command async? What is the best
> approach to validate that optimize is truly completed?

I do not know how that request can return a 200 before the optimize job
completes.  The "wait" parameters (one of which Christopher mentioned)
should all default to true, and I don't see them on your request.  As
far as I know, the operation is NOT asynchronous.  Are you absolutely
sure that it returned a 200? I'd like to see the actual response to verify.

I hate to assume you're wrong, but I think it's probably more likely
that your HTTP request timed out because of overly aggressive timeout
settings, probably a socket timeout.  If you have definitive proof that
you received the 200 and a normal-looking response, then we'll need to
look deeper.  Do you have the entry in solr.log for the optimize request?

Thanks,
Shawn


Re: solr optimize command

Christopher Schultz

Shawn,

On 11/29/18 17:56, Shawn Heisey wrote:

> On 11/28/2018 6:22 PM, Wei wrote:
>> I use the following http request to start solr index
>> optimization:
>>
>> http://localhost:8983/solr/<core>/update?skipError=true -F
>> stream.body=' <optimize />'
>>
>> The request returns status code 200 shortly, but when looking at
>> the solr instance I noticed that actual optimization has not
>> completed yet as there are more than 1 segments. Is the optimize
>> command async? What is the best approach to validate that
>> optimize is truly completed?
>
> I do not know how that request can return a 200 before the optimize
> job completes.  The "wait" parameters (one of which Christopher
> mentioned) should all default to true, and I don't see them on your
> request.  As far as I know, the operation is NOT asynchronous.  Are
> you absolutely sure that it returned a 200? I'd like to see the
> actual response to verify.
>
> I hate to assume you're wrong, but I think it's probably more
> likely that your HTTP request timed out because of overly
> aggressive timeout settings, probably a socket timeout.  If you
> have definitive proof that you received the 200 and a
> normal-looking response, then we'll need to look deeper.  Do you
> have the entry in solr.log for the optimize request?

When mine returned (with wait=true as a request parameter), I got a
JSON response telling me how long it took.
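
For anyone scripting this, the elapsed time can be read from the response body. The sketch below assumes the usual Solr JSON layout with a responseHeader object carrying status and QTime (milliseconds); the function name and the sample body in the usage note are made up for illustration, not captured from a real server.

```python
import json


def summarize_update_response(body):
    """Pull the status and elapsed time out of a Solr JSON update response."""
    header = json.loads(body).get("responseHeader", {})
    status = header.get("status")
    return {
        "ok": status == 0,                 # 0 is Solr's "no error" status
        "status": status,
        "qtime_ms": header.get("QTime"),   # server-side time for the request
    }
```

Given a body such as '{"responseHeader": {"status": 0, "QTime": 48213}}', this reports ok as True and qtime_ms as 48213. A QTime of only a few milliseconds on a large index would suggest that no real merge happened during the request.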

-chris

Re: solr optimize command

Shawn Heisey-2
On 11/29/2018 4:41 PM, Christopher Schultz wrote:
> When mine returned (with wait=true as a request parameter), I got a
> JSON response telling me how long it took.

That's what I would expect.

If you have to explicitly include parameters like "wait" or
"waitSearcher" to make it block until the optimize is done, then in my
mind, that's a bug.  That should be the default setting.  In the 7.5
reference guide, I only see "waitSearcher", and it says the default is true.
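
To take the default out of the equation while testing, the optimize message can carry waitSearcher explicitly. This is only a sketch: it assumes the XML update format accepts waitSearcher as an attribute on the optimize element, and the function names and base URL are placeholders.

```python
import urllib.request
import xml.etree.ElementTree as ET


def optimize_message(wait_searcher=True):
    """Serialize an <optimize/> update message with an explicit waitSearcher."""
    value = "true" if wait_searcher else "false"
    return ET.tostring(ET.Element("optimize", {"waitSearcher": value}), encoding="unicode")


def build_request(base_url, core, wait_searcher=True):
    """Prepare (but do not send) a POST of the message to the update handler."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/solr/{core}/update",
        data=optimize_message(wait_searcher).encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )
```

The prepared request can be sent with urllib.request.urlopen(request, timeout=...); comparing the response time with and without the explicit attribute would show whether the default really blocks.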

Thanks,
Shawn


Re: solr optimize command

Erick Erickson
Here's the scoop on optimize:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/

Note the link to how Solr 7.5 is different.

Best,
Erick

Re: solr optimize command

Christopher Schultz
In reply to this post by Shawn Heisey-2

Shawn,

On 11/29/18 18:53, Shawn Heisey wrote:

> On 11/29/2018 4:41 PM, Christopher Schultz wrote:
>> When mine returned (with wait=true as a request parameter), I got
>> a JSON response telling me how long it took.
>
> That's what I would expect.
>
> If you have to explicitly include parameters like "wait" or
> "waitSearcher" to make it block until the optimize is done, then in
> my mind, that's a bug.  That should be the default setting.  In the
> 7.5 reference guide, I only see "waitSearcher", and it says the
> default is true.

I didn't test it without that parameter. I used it because it was
suggested to me earlier this week on this list. It may in fact be
optional. I was using Solr 7.4.

-chris