Solr document missing or not getting indexed though we get 200 ok status from server


mganeshs
Hi,
We keep sending documents to Solr from our app server: a single document per request, with about 10 requests hitting SolrCloud in parallel each second.

We can see our POST (update) requests reaching our Solr 5.4 in the localhost_access logs, each with a 200 OK response. Our app servers also receive an HTTP 200 OK response for the requests we send to SolrCloud.

But a few documents are not getting indexed. Out of 2000 documents we sent, 10 went missing, even though no error was reported.

We use an autoSoftCommit of 2 secs and an autoCommit (hard commit) of 30 secs.

Why are those 10 documents not getting indexed, and why is no error returned if the server is not able to index them?

Regards,




Re: Solr document missing or not getting indexed though we get 200 ok status from server

Nitin Kumar
Please check the documents' unique key (id). All keys should be unique;
otherwise a document with the same id as an earlier one will replace it.
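One quick way to rule this out on the client side is to count the ids that were actually sent. The following is a minimal sketch, assuming the ids of a batch are available as a list of strings; the function name and the sample ids are hypothetical and not taken from the original application.

```python
from collections import Counter


def find_duplicate_ids(doc_ids):
    """Return a dict of ids that occur more than once, mapped to their counts."""
    counts = Counter(doc_ids)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Hypothetical ids; in practice these would be the unique keys sent to Solr.
sent_ids = ["doc-1", "doc-2", "doc-3", "doc-2"]
print(find_duplicate_ids(sent_ids))
```

An empty result means every id in the batch was distinct, so overwriting by id would not explain the missing documents.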

On 04-Sep-2016 12:13 PM, "Ganesh M" <[hidden email]> wrote:

> Hi,
> we are keep sending documents to Solr from our app server. Single document
> per request, but in parallel of 10 request hits solr cloud in a second.
>
> We could see our post request ( update request ) hitting our solr 5.4 in
> localhost_access logs, and it's response as 200 Ok response. And also we
> get HTTP 200 OK response to our app servers as well for out HTTP request we
> fired to SOLR Cloud.
>
> But few documents are not getting indexed. Out of 2000 documents we sent
> 10 documents are getting missed. Thought there is not error, few documents
> are getting missed.
>
> We use autoSoftcommit as 2 secs and autohardcommit as 30 secs.
>
> Why is that 10 documents not getting indexed and also no error getting
> thrown back if server is not able to index it ?
>
> Regards,

Re: Solr document missing or not getting indexed though we get 200 ok status from server

mganeshs
Nitin, thanks for the reply. Each of our documents has a unique id, which is its HBase row key, so it is always unique. There is no chance of duplicate ids being sent.



On Sun 4 Sep, 2016 12:41 pm Nitin Kumar, <[hidden email]> wrote:
Please check doc's unique key(Id). All keys shd be unique. Else docs having
same id will be replaced.


Re: Solr document missing or not getting indexed though we get 200 ok status from server

mganeshs
In reply to this post by Nitin Kumar
Some more information on this. Most documents get indexed properly; a few do not.

All document POSTs appear in the localhost_access log with a 200 OK response. But in catalina.out the log entries differ. For documents that are indexed properly, the log looks like this:

FINE: PRE_UPDATE add
{,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog <init>
FINE: New TransactionLog file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/tlog.0000000000000220856, exists=false, size=0, openExisting=false
Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor submit
FINE: sending update to http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0 add{version=1544254202941800448,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001} params:update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
FINE: starting runner: org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
FINE: PRE_UPDATE FINISH params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
FINE: finished: org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
{crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
{add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001 (1544254202941800448)]}
Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
FINE: Closing out SolrRequest: params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
-------------------------------------------------

For a document that is not getting indexed, we see only the following in catalina.out. We are not sure whether it is being added to Solr at all.


Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
FINE: PRE_UPDATE FINISH params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
{crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002}
{} 0 1
Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
FINE: Closing out SolrRequest: params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)

----------------------

As you can see, for the missing (unindexed) document the catalina log has no "PRE_UPDATE add" entry. Is that the reason the document is not getting indexed?

We have set autoSoftCommit to 1 second and autoCommit (hard commit) to 30 seconds.

We are not getting any errors or exceptions in the log.

This issue is becoming critical for us because it is a question of reliability. Although we get a 200 OK response from Solr for the update POST request, nothing happens on the Solr side. If Solr is not able to process a request, shouldn't it return an error instead of 200 OK?

Has anybody faced this sort of issue? Any help would be very much appreciated.
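To find every affected request rather than spotting them by eye, the catalina log can be scanned for update entries whose result block is empty. This is only a sketch: it assumes the LogUpdateProcessor entries have the shape shown in the excerpts above (a `crid` request parameter followed by either `{add=[...]}` or `{}`), and that an empty `{}` means no add command was recorded for that request. The ids `DOC_A` and `DOC_B` are shortened placeholders.

```python
import re

# Matches "path=/update params={crid=...}{...}" with optional line breaks
# between the parts, as seen in the log excerpts.
UPDATE_ENTRY = re.compile(
    r"path=/update params=\s*\{crid=([^}]*)\}\s*\{([^}]*)\}"
)


def find_empty_updates(log_text):
    """Return the crid of every update request whose result block is empty."""
    return [
        crid
        for crid, result in UPDATE_ENTRY.findall(log_text)
        if not result.strip()
    ]


sample_log = """\
INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
{crid=DOC_A}
{add=[DOC_A (1544254202941800448)]}
INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
{crid=DOC_B}
{} 0 1
"""
print(find_empty_updates(sample_log))
```

The resulting list can then be compared against the ids the application believes it has indexed.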




On Sun, Sep 4, 2016 at 12:59 PM Ganesh M <[hidden email]> wrote:
Nitin, Thanks for reply. Our each document has unique id and its hbase rowkey id. So it will be unique only. So there is no chance of duplicates id being send.




Re: Solr document missing or not getting indexed though we get 200 ok status from server

Alexandre Rafalovitch
Can you identify the specific documents that 'fail'? What happens if
you post them manually? Try posting them manually but with one field
super-distinct to see whether it made it in. What happens if you post
it to an empty index (copy the definition and try)?

Also, what do your request handler's parameters look like? Perhaps you
have a signature processor, in which case it may be triggering
duplicate avoidance with a different calculation than just the id.

My guess is still that it is some sort of duplicate issue.

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 4 September 2016 at 23:10, Ganesh M <[hidden email]> wrote:

> Some more information on this... Most of documents get indexed properly. Few documents are not getting indexed.
>
> All documents POST are seen in the localhost_access and 200 OK response is seen in local host access file. But in catalina, there are some difference in the logs for which are indexing properly, following is the logs.
>
> FINE: PRE_UPDATE add
> {,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog <init>
> FINE: New TransactionLog file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/tlog.0000000000000220856, exists=false, size=0, openExisting=false
> Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor submit
> FINE: sending update to http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0 add{version=1544254202941800448,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001} params:update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
> Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> FINE: starting runner: org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
> Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> FINE: PRE_UPDATE FINISH params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> FINE: finished: org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
> Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> {crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> {add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001 (1544254202941800448)]}
> Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
> FINE: Closing out SolrRequest: params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> -------------------------------------------------
>
> For the one which document is not getting indexed, we could see only following log in catalina.out. Not sure whether it's getting added to SOLR.
>
>
> Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> FINE: PRE_UPDATE FINISH params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
> Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> {crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002}
> {} 0 1
> Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
> FINE: Closing out SolrRequest: params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
>
> ----------------------
>
> You can see that in above log for missing documents ( which is not indexed), in catalina log, we are not seeing "PRE UPDATE ADD". Is that causing / reason for document not getting indexed ?
>
> We have set autosoftcommit to 1 seconds and autohardcommit to 30 seconds.
>
> We are not getting any errors or exceptions in the log.
>
> This issue is becoming very critical and sort of reliable factor. Though we get 200 OK response from SOLR for update HTTP POST request, nothing happens on the SOLR side. If SOLR is not able to process, isn't it we get error from SOLR instead of giving 200 OK response.
>
> Anybody has faced this sort of issue or any sort of help would be very much appreciated.

Re: Solr document missing or not getting indexed though we get 200 ok status from server

Ganesh M-3
Hi Alex,
We tried posting the same document manually from the Solr Admin Documents UI,
and it was indexed successfully. We are sure that it is not a duplicate
issue. We are using the default update handler and have not configured a
custom one. We send the index request as a direct HTTP request in
<add> <doc> XML format. We get a 200 OK response, but the document is not
indexed.

This is the request we sent and got 200 for, but it was not indexed. When
the same request is sent via the Solr Admin Documents UI, it is indexed
successfully.
<add>
<doc>
<CT_iscof>false</CT_iscof>
<CT_ui116_s>55788327</CT_ui116_s>
<CT_iscod>false</CT_iscod>
<CT_ui114_s>Factuur _PERF29161663_Voor _Va Bene.pdf</CT_ui114_s>
<CT_ui68_s>55788327-PERF29161663</CT_ui68_s>
<CT_ui75_f>3.00</CT_ui75_f>
<CT_ui48_s>2916847</CT_ui48_s>
<CT_stsid>STCUA0000021500000011472808279078</CT_stsid>
<CT_ui6_s>EUR</CT_ui6_s>
<CT_ui74_f>50.00</CT_ui74_f>
<CT_ui28_s>VAT</CT_ui28_s>
<CT_ui82_f>50.00</CT_ui82_f>
<CT_lsti>UA000002150000001:VB1
VB1:A000002150:vbgroupnft+1:1472808278137</CT_lsti>
<CT_pdfid>RA000002150AT009428</CT_pdfid>
<CT__s_RU_I_UA000002150000001>100000,false</CT__s_RU_I_UA000002150000001>
<CT_ui30_s>62440101</CT_ui30_s>
<CT_ui152_s>UNKNOWN</CT_ui152_s>
<CT_content> RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
RA000002150AT009425#pdf.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f
1472808279002
CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
PERF2020916145437 LEA0000021509223370564294752110EXCC2000001 Va Bene VA
Beheer B.V. LEA0000021509223370564294689844EXCC1000001 VA Beheer B.V. VA
Beheer B.V.null null null  2.1null  urn:www.cenbii.eu:
transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.xnull
 urn:www.cenbii.eu:profile:bii04:ver2.0null  PERF20209161454372  null
 1472754600000null  3806 UNCL1001 null  EUR6 ISO 4217 Alpha null null
 29168472  null null  pdf.pdf2  null null  RA000002150AT009425#pdf.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#fpdf.pdf
application/pdf null null  Factuur _PERF29161663_Voor _Va Bene.pdf2  null
 PrimaryImagenull null  RA000002150AT009424#Factuur _PERF29161663_Voor _Va
Bene.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#fFactuur
_PERF29161663_Voor _Va Bene.pdf application/pdf null null null  62440101ZZZ
NL:KVK null null  2916847ZZZ NL:VAT null null  VA Beheer B.V.null null
 Schurinkstraatnull  23null  Ommennull  7731GCnull null  NL6
ISO3166-1:Alpha2 null null  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153
null null  62440101ZZZ NL:KVK null null null  55788327ZZZ NL:KVK null null
 55788327ZZZ NL:KVK null null  Va Benenull null  Voorstraatnull  26null
 Voorschotennull  2251BNnull null  NL6 ISO3166-1:Alpha2 null null
 2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153 null null  55788327ZZZ
NL:KVK null null  1475173800000null null null null  NL6 ISO3166-1:Alpha2
null null  316 UNCL4461 null  1475087400000null  55788327-PERF29161663null
null  29168472 IBAN null  UNKNOWNBIC null  Betaling?binnen?14?dagen op
bankrekening?2916847?onder vermelding van?55788327/PERF29161663null null
 3.00EUR null null  50.00EUR null  3.00EUR null null  S6 UNCL5305 null
 6.00null null  VAT6 UN/ECE 5153 null null  50.00EUR null  50.00EUR null
 53.00EUR null  53.00EUR null null  102  null  5.00BX null  50.00EUR null
null  PERF2020916145437null  PERF2020916145437null null  12  null null  S6
UNCL5305 null  6.00null null  VAT6 UN/ECE 5153 null null  10.00EUR null
 RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
DM001 XCNIN199751 NL:KVK:62440101 false false false false 10
UA000002150000001:VB1 VB1:A000002150:vbgroupnft+1:1472808278137 Ontvangen
1472808279002 Factuur GLDT9223370666504283001RA000000006DTP2000001 VB1 VB1
UA000002150000001 RA000002150AT009428 vbgroupnft+1 A000002150 Group
55788327 Va Bene XCNL034435 Va Bene
LEA0000021509223370564294752110EXCC2000001 vbgroupnft+1 A000002150
PERF2020916145437 Group 62440101 VA Beheer B.V. XCNL034436 VA Beheer B.V.
LEA0000021509223370564294689844EXCC1000001
STCUA0000021500000011472808279078 VB1 VB1 VB1 VB1 UA000002150000001 true
Factuur GLDT9223370666504283001RA000000006DTP2000001 EM0001
NL:KVK:55788327</CT_content>
<CT_ranm>vbgroupnft+1</CT_ranm>
<CT_lstc>10</CT_lstc>
<CT_rgnm>Va Bene</CT_rgnm>
<CT_tdr>true</CT_tdr>
<CT_ui83_f>50.00</CT_ui83_f>
<CT_ui64_s>NL</CT_ui64_s>
<CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>100000,false</CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>
<CT_scxkvk>62440101</CT_scxkvk>
<CT_sgexid>XCNL034436</CT_sgexid>
<CT_ui67_l>1475087400000</CT_ui67_l>
<CT_mtnm>Factuur</CT_mtnm>
<CT_ui8_s>2916847</CT_ui8_s>
<CT_sunm>VB1 VB1</CT_sunm>
<CT_ui66_s>31</CT_ui66_s>
<CT_ui46_s>NL</CT_ui46_s>
<CT_ui84_f>53.00</CT_ui84_f>
<CT_lsts>Ontvangen</CT_lsts>
<CT_ui42_s>26</CT_ui42_s>
<rowkey>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001</rowkey>
<CT_rgexid>XCNL034435</CT_rgexid>
<CT_ui80_s>VAT</CT_ui80_s>
<CT_sgexnm>VA Beheer B.V.</CT_sgexnm>
<CT_ui16_s>VA Beheer B.V.</CT_ui16_s>
<CT_ui44_s>2251BN</CT_ui44_s>
<CT_ui38_s>Va Bene</CT_ui38_s>
<CT_iscvd>false</CT_iscvd>
<CT_munm>VB1 VB1</CT_munm>
<CT_ui52_s>55788327</CT_ui52_s>
<CT_ui1_s>2.1</CT_ui1_s>
<CT_ui104_s>PERF2020916145437</CT_ui104_s>
<CT_ui56_l>1475173800000</CT_ui56_l>
<CT_tmsg>EM0001</CT_tmsg>
<CT_sbj>PERF2020916145437</CT_sbj>
<CT_ui4_s>PERF2020916145437</CT_ui4_s>
<CT_ui3_s>urn:www.cenbii.eu:profile:bii04:ver2.0</CT_ui3_s>
<CT_ui98_s>Betaling?binnen?14?dagen op bankrekening?2916847?onder
vermelding van?55788327/PERF29161663</CT_ui98_s>
<CT_ui5_l>1472754600000</CT_ui5_l>
<CT_ui2_s>urn:www.cenbii.eu:
transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:
si:si-ubl:ver1.1.x</CT_ui2_s>
<CT_ui88_f>5.00</CT_ui88_f>
<CT_muid>UA000002150000001</CT_muid>
<CT_ui36_s>55788327</CT_ui36_s>
<CT_sby>Group</CT_sby>
<CT_toid>NL:KVK:55788327</CT_toid>
<CT_crid>LEA0000021509223370564294752110EXCC2000001</CT_crid>
<CT_csid>LEA0000021509223370564294689844EXCC1000001</CT_csid>
<CT_cid>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001</CT_cid>
<CT_fmid>NL:KVK:62440101</CT_fmid>
<CT_sgnm>VA Beheer B.V.</CT_sgnm>
<CT_mdt>1472808279002</CT_mdt>
<CT_ui113_f>10.00</CT_ui113_f>
<CT_tnm>Factuur</CT_tnm>
<CT_said>A000002150</CT_said>
<CT_ui115_s>62440101</CT_ui115_s>
<CT_suid>UA000002150000001</CT_suid>
<CT_raid>A000002150</CT_raid>
<CT_mtid>GLDT9223370666504283001RA000000006DTP2000001</CT_mtid>
<CT_dmtd>DM001</CT_dmtd>
<CT_rcxkvk>55788327</CT_rcxkvk>
<CT_ui111_s>VAT</CT_ui111_s>
<CT_ui106_s>1</CT_ui106_s>
<CT_ui50_s>VAT</CT_ui50_s>
<CT_ui14_s>2916847</CT_ui14_s>
<CT_exid>XCNIN199751</CT_exid>
<CT_sdur>VB1 VB1</CT_sdur>
<CT_ui153_s>PERF2020916145437</CT_ui153_s>
<CT_ui100_t1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
</CT_ui100_t1>
<CT_ui21_s>Ommen</CT_ui21_s>
<CT_ui109_s>6.00</CT_ui109_s>
<CT_csnm>VA Beheer B.V.</CT_csnm>
<CT_ui85_f>53.00</CT_ui85_f>
<CT_rby>Group</CT_rby>
<CT_tid>GLDT9223370666504283001RA000000006DTP2000001</CT_tid>
<CT_ui108_s>S</CT_ui108_s>
<CT_crnm>Va Bene</CT_crnm>
<CT_ui26_s>2916847</CT_ui26_s>
<CT_ui20_s>23</CT_ui20_s>
<CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>100000,false</CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>
<CT_ui101_s>PrimaryImage</CT_ui101_s>
<CT_ui24_s>NL</CT_ui24_s>
<CT_ui22_s>7731GC</CT_ui22_s>
<CT_uctx_UA000002150000001_s1>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
false</CT_uctx_UA000002150000001_s1>
<CT_ui43_s>Voorschoten</CT_ui43_s>
<CT__s_dxat_2>RA000002150AT009425#pdf.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f
</CT__s_dxat_2>
<CT__s_dxat_1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
</CT__s_dxat_1>
<CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
false</CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>
<CT_ui70_s>2916847</CT_ui70_s>
<CT_cdt>1472808279002</CT_cdt>
<CT_ui19_s>Schurinkstraat</CT_ui19_s>
<CT_sgid>LEA0000021509223370564294689844EXCC1000001</CT_sgid>
<CT_rgexnm>Va Bene</CT_rgexnm>
<CT_ui72_f>3.00</CT_ui72_f>
<CT_ui87_s>10</CT_ui87_s>
<CT__s_RU_O_UA000002150000001>100000,false</CT__s_RU_O_UA000002150000001>
<CT_ui77_s>S</CT_ui77_s>
<CT_cnm>PERF2020916145437</CT_cnm>
<CT_sanm>vbgroupnft+1</CT_sanm>
<CT_ird>false</CT_ird>
<CT_ui146_s>380</CT_ui146_s>
<CT_ui89_f>50.00</CT_ui89_f>
<CT_ui41_s>Voorstraat</CT_ui41_s>
<CT_daf>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
</CT_daf>
<CT_ui78_s>6.00</CT_ui78_s>
<CT_rgid>LEA0000021509223370564294752110EXCC2000001</CT_rgid>
</doc>
</add>


The only difference is that when we post manually via the Solr Admin UI,
there is no concurrency. In the original run there would be around 50
threads firing update POST requests, plus a few threads firing GET requests
at different collections.

A little more information about the setup: we have around 5 collections, and
each collection has 2 shards (one shard on each node, one shard for the index
and the other for the replica), on 2 nodes in total with a master-master
setup.

We get this problem only when around 50 threads are firing POST requests at
various collections at the same time.

The strange thing is that Solr does not return an error when it is not able
to index a document. If Solr had returned an error, we could have retried
indexing the document. Is there any way to make Solr return an error instead
of 200 when it fails to index?
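Until the cause is found, one possible client-side workaround is to confirm each document after the update returns and retry when it is absent. The sketch below assumes Solr's real-time get handler is reachable at `/get` on the collection and that it answers with a JSON object whose `doc` entry is null for an unknown id. The base URL, the collection name `IOB` and both function names are illustrative assumptions, not part of the original setup.

```python
import json
from urllib.parse import urlencode


def realtime_get_url(base_url, collection, doc_id):
    """Build a real-time get URL for one document id (assumed /get handler)."""
    query = urlencode({"id": doc_id, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/get?{query}"


def doc_exists(response_body):
    """Return True if a real-time get JSON response contains a document."""
    return json.loads(response_body).get("doc") is not None


# Hypothetical host and id; the response bodies are hand-written examples.
print(realtime_get_url("http://localhost:8983/solr", "IOB", "DOC_B"))
print(doc_exists('{"doc": null}'))
print(doc_exists('{"doc": {"id": "DOC_B"}}'))
```

An application could fetch the URL after each POST and re-send the document whenever `doc_exists` reports False. This only detects the loss; it does not explain why the 200 OK was returned.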

Regards,
Ganesh

On Sun, Sep 4, 2016 at 10:11 PM Alexandre Rafalovitch <[hidden email]>
wrote:

> Can you identify the specific documents that 'fail'? What happens if
> you post them manually? Try posting them manually but with one field
> super-distinct to see whether it made it in. What happens if you post
> it to an empty index (copy definition and try).
>
> Also, what's your request handler's parameters look like. Perhaps you
> have a signature processor, in which case it may be triggering
> duplicates avoidance with different calculation from just an id.
>
> My guess is still that it is some sort of duplicate issue.
>
> Regards,
>    Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 4 September 2016 at 23:10, Ganesh M <[hidden email]> wrote:
> > Some more information on this... Most of documents get indexed properly.
> Few documents are not getting indexed.
> >
> > All documents POST are seen in the localhost_access and 200 OK response
> is seen in local host access file. But in catalina, there are some
> difference in the logs for which are indexing properly, following is the
> logs.
> >
> > FINE: PRE_UPDATE add
> >
> {,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> >
> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog <init>
> > FINE: New TransactionLog
> file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/tlog.0000000000000220856,
> exists=false, size=0, openExisting=false
> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor submit
> > FINE: sending update to
> http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0
> add{version=1544254202941800448,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> params:update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
> > Sep 01, 2016 7:39:31 AM
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> > FINE: starting runner:
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
> > Sep 01, 2016 7:39:31 AM
> org.apache.solr.update.processor.LogUpdateProcessor finish
> > FINE: PRE_UPDATE FINISH
> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> > Sep 01, 2016 7:39:31 AM
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> > FINE: finished:
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
> > Sep 01, 2016 7:39:31 AM
> org.apache.solr.update.processor.LogUpdateProcessor finish
> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> >
> {crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> >
> {add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001
> (1544254202941800448)]}
> > Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter
> doFilter
> > FINE: Closing out SolrRequest:
> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> > -------------------------------------------------
> >
> > For the one which document is not getting indexed, we could see only
> following log in catalina.out. Not sure whether it's getting added to SOLR.
> >
> >
> > Sep 01, 2016 7:39:56 AM
> org.apache.solr.update.processor.LogUpdateProcessor finish
> > FINE: PRE_UPDATE FINISH
> params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
> > Sep 01, 2016 7:39:56 AM
> org.apache.solr.update.processor.LogUpdateProcessor finish
> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> >
> {crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002}
> > {} 0 1
> > Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter
> doFilter
> > FINE: Closing out SolrRequest:
> params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
> >
> > ----------------------
> >
> > You can see that in above log for missing documents ( which is not
> indexed), in catalina log, we are not seeing "PRE UPDATE ADD". Is that
> causing / reason for document not getting indexed ?
> >
> > We have set autosoftcommit to 1 seconds and autohardcommit to 30 seconds.
> >
> > We are not getting any errors or exceptions in the log.
> >
> > This issue is becoming very critical and sort of reliable factor. Though
> we get 200 OK response from SOLR for update HTTP POST request, nothing
> happens on the SOLR side. If SOLR is not able to process, isn't it we get
> error from SOLR instead of giving 200 OK response.
> >
> > Anybody has faced this sort of issue or any sort of help would be very
> much appreciated.
> >
> >
> >
> >
> > On Sun, Sep 4, 2016 at 12:59 PM Ganesh M <[hidden email]> wrote:
> > Nitin, thanks for the reply. Each of our documents has a unique id, which
> > is its HBase rowkey, so it is always unique. There is no chance of
> > duplicate ids being sent.
> >
> >
> >
> > On Sun 4 Sep, 2016 12:41 pm Nitin Kumar, <[hidden email]> wrote:
> > Please check the documents' unique key (id). All keys should be unique;
> > otherwise documents with the same id will be replaced.
> >
> > On 04-Sep-2016 12:13 PM, "Ganesh M" <[hidden email]> wrote:
> >
> >> Hi,
> >> we keep sending documents to Solr from our app server, a single document
> >> per request, with about 10 parallel requests hitting SolrCloud per second.
> >>
> >> We can see our POST (update) requests hitting our Solr 5.4 in the
> >> localhost_access logs with a 200 OK response, and our app servers also
> >> receive HTTP 200 OK for the HTTP requests we fired at SolrCloud.
> >>
> >> But a few documents are not getting indexed. Out of 2000 documents we
> >> sent, 10 went missing, even though there is no error.
> >>
> >> We use autoSoftCommit of 2 secs and a hard autoCommit of 30 secs.
> >>
> >> Why are those 10 documents not getting indexed, and why is no error
> >> returned if the server is not able to index them?
> >>
> >> Regards,
> >>
> >>
> >>
> >>
>

Re: Solr document missing or not getting indexed though we get 200 ok status from server

Dheerendra Kulkarni
Can you try this:

1. Add the document.
2. Follow up with an optimize from the core admin UI.

If the document shows up after that, you may need to check your commit settings.
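
The same check can also be scripted. The following is only a minimal Python sketch, assuming Solr's XML update handler at /update and its commit=true and optimize=true request parameters. The helper names, and the host, collection and field values shown in the usage note, are placeholders and not taken from this thread. It uses the standard <field name="..."> form of the update XML.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(fields):
    """Render one document as <add><doc> XML using <field name="..."> elements."""
    parts = ["<add><doc>"]
    for name, value in fields.items():
        # quoteattr() adds the surrounding quotes; escape() protects &, < and >.
        parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(str(value))))
    parts.append("</doc></add>")
    return "".join(parts)


def update_url(base_url, collection, **params):
    """Build the /update URL, optionally with commit=true or optimize=true."""
    url = "%s/%s/update" % (base_url.rstrip("/"), collection)
    return url + ("?" + urlencode(params) if params else "")


def post_xml(url, xml_body, timeout=10):
    """POST the XML body and return (HTTP status, response text)."""
    req = Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

For example, post_xml(update_url("http://localhost:8983/solr", "mycollection", commit="true"), build_add_xml({"rowkey": "TEST-DOC-1"})) sends one test document with an explicit commit. If the document only becomes searchable with the explicit commit or after an optimize, the commit configuration is the first thing to look at.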

Regards,
Dheerendra


Re: Solr document missing or not getting indexed though we get 200 ok status from server

Alexandre Rafalovitch
In reply to this post by Ganesh M-3
I can't tell anything from the document provided. So, here would be my thoughts:

If what you are seeing is a concurrency issue, it is unlikely that exactly
the same documents would be missed/dropped each time. So, if you see
the same documents dropped, it is much more likely to be something to
do with the documents, the handler end-points, the sharding, etc.
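
One way to test this is to replay the same batch a few times and compare which ids go missing in each run. A minimal Python sketch, assuming you can collect the ids you sent and the ids that are actually searchable after each run (the function names are hypothetical):

```python
def missing_ids(sent_ids, indexed_ids):
    """Ids that were sent (and acknowledged) but are absent from the index."""
    return set(sent_ids) - set(indexed_ids)


def classify_drops(runs):
    """Compare the dropped ids across several replays of the same batch.

    runs is a list of (sent_ids, indexed_ids) pairs, one pair per replay.
    Returns (always_missing, sometimes_missing).
    """
    dropped = [missing_ids(sent, indexed) for sent, indexed in runs]
    if not dropped:
        return set(), set()
    always = set.intersection(*dropped)
    ever = set.union(*dropped)
    return always, ever - always
```

Ids in always_missing point at the documents or the handler configuration. Ids that are missing only in some runs point more towards timing or concurrency.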

If this is easily reproducible, I would run a network analyzer such as
Wireshark and compare your Admin UI session with your client session
and verify that everything expected is absolutely identical.
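
If you save the two captured requests as raw text, the comparison itself can be automated. A rough Python sketch, assuming plain HTTP/1.1 captures with headers and body separated by a blank line (the function names are made up for illustration):

```python
def parse_http_request(raw):
    """Split a raw HTTP/1.1 request into (request_line, headers, body).

    Line endings are normalized to "\n" first, so body lengths are
    reported after that normalization.
    """
    head, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    lines = head.split("\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0].strip(), headers, body


def diff_requests(raw_a, raw_b):
    """Return a dict describing where two captured requests differ."""
    line_a, headers_a, body_a = parse_http_request(raw_a)
    line_b, headers_b, body_b = parse_http_request(raw_b)
    diff = {}
    if line_a != line_b:
        diff["request_line"] = (line_a, line_b)
    for name in sorted(set(headers_a) | set(headers_b)):
        if headers_a.get(name) != headers_b.get(name):
            diff["header:" + name] = (headers_a.get(name), headers_b.get(name))
    if body_a != body_b:
        diff["body_length"] = (len(body_a), len(body_b))
    return diff
```

An empty result means the two requests are identical at the HTTP level. A differing Content-Type, Content-Length or body length between the Admin UI request and the client request would be worth chasing first.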

You could also temporarily turn on Debug via Admin console (under
logs). You could turn individual elements to Trace to get low-level
information on what's happening.
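
With the extra logging on, the log can also be scanned for update requests that finished without adding anything. This is only a sketch in Python: it assumes each LogUpdateProcessor entry sits on a single line in the shape quoted earlier in this thread, "[core] webapp=/solr path=/update params={...} {...}", where the second pair of braces holds "add=[...]" for an indexed document and is empty otherwise. The regular expression and function name are my own, not part of Solr.

```python
import re

# Matches a one-line /update entry; "result" is the second {...} group.
UPDATE_LINE = re.compile(
    r"\[(?P<core>[^\]]+)\]\s+webapp=\S+\s+path=/update\s+"
    r"params=\{(?P<params>[^}]*)\}\s*\{(?P<result>[^}]*)\}"
)


def empty_updates(log_lines):
    """Yield (core, params) for /update requests whose result section is empty."""
    for line in log_lines:
        match = UPDATE_LINE.search(line)
        if match and not match.group("result").strip():
            yield match.group("core"), match.group("params")
```

Calling list(empty_updates(open("catalina.out"))) would list the requests that Solr answered without recording an add, which can then be matched against the client's send log.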

Finally, I am assuming this is all happening with the latest Solr? If not,
it may be worth trying that and/or checking Jira for bugs. Lots of
things related to multi-threaded, multi-server setups have been
fixed/improved in more recent Solr releases.

Regards,
   Alex.

----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 5 September 2016 at 00:17, Ganesh M <[hidden email]> wrote:

> Hi Alex,
> We posted the same document manually from the Solr Admin Documents UI, and
> it was indexed successfully, so we are sure it is not a duplicate issue. We
> use the default update handler and have not configured a custom one. We
> send the index request as a direct HTTP request in the <add> <doc> XML
> format. We get a 200 OK response, but the document is not indexed.
>
> This is the request we fired. It returned 200 but was not indexed. The same
> request fired via the Solr Admin Documents UI is indexed successfully.
> <add>
> <doc>
> <CT_iscof>false</CT_iscof>
> <CT_ui116_s>55788327</CT_ui116_s>
> <CT_iscod>false</CT_iscod>
> <CT_ui114_s>Factuur _PERF29161663_Voor _Va Bene.pdf</CT_ui114_s>
> <CT_ui68_s>55788327-PERF29161663</CT_ui68_s>
> <CT_ui75_f>3.00</CT_ui75_f>
> <CT_ui48_s>2916847</CT_ui48_s>
> <CT_stsid>STCUA0000021500000011472808279078</CT_stsid>
> <CT_ui6_s>EUR</CT_ui6_s>
> <CT_ui74_f>50.00</CT_ui74_f>
> <CT_ui28_s>VAT</CT_ui28_s>
> <CT_ui82_f>50.00</CT_ui82_f>
> <CT_lsti>UA000002150000001:VB1 VB1:A000002150:vbgroupnft+1:1472808278137</CT_lsti>
> <CT_pdfid>RA000002150AT009428</CT_pdfid>
> <CT__s_RU_I_UA000002150000001>100000,false</CT__s_RU_I_UA000002150000001>
> <CT_ui30_s>62440101</CT_ui30_s>
> <CT_ui152_s>UNKNOWN</CT_ui152_s>
> <CT_content> RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> RA000002150AT009425#pdf.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f
> 1472808279002
> CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
> PERF2020916145437 LEA0000021509223370564294752110EXCC2000001 Va Bene VA
> Beheer B.V. LEA0000021509223370564294689844EXCC1000001 VA Beheer B.V. VA
> Beheer B.V.null null null  2.1null  urn:www.cenbii.eu:
> transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
> bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.xnull
>  urn:www.cenbii.eu:profile:bii04:ver2.0null  PERF20209161454372  null
>  1472754600000null  3806 UNCL1001 null  EUR6 ISO 4217 Alpha null null
>  29168472  null null  pdf.pdf2  null null  RA000002150AT009425#pdf.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#fpdf.pdf
> application/pdf null null  Factuur _PERF29161663_Voor _Va Bene.pdf2  null
>  PrimaryImagenull null  RA000002150AT009424#Factuur _PERF29161663_Voor _Va
> Bene.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#fFactuur
> _PERF29161663_Voor _Va Bene.pdf application/pdf null null null  62440101ZZZ
> NL:KVK null null  2916847ZZZ NL:VAT null null  VA Beheer B.V.null null
>  Schurinkstraatnull  23null  Ommennull  7731GCnull null  NL6
> ISO3166-1:Alpha2 null null  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153
> null null  62440101ZZZ NL:KVK null null null  55788327ZZZ NL:KVK null null
>  55788327ZZZ NL:KVK null null  Va Benenull null  Voorstraatnull  26null
>  Voorschotennull  2251BNnull null  NL6 ISO3166-1:Alpha2 null null
>  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153 null null  55788327ZZZ
> NL:KVK null null  1475173800000null null null null  NL6 ISO3166-1:Alpha2
> null null  316 UNCL4461 null  1475087400000null  55788327-PERF29161663null
> null  29168472 IBAN null  UNKNOWNBIC null  Betaling?binnen?14?dagen op
> bankrekening?2916847?onder vermelding van?55788327/PERF29161663null null
>  3.00EUR null null  50.00EUR null  3.00EUR null null  S6 UNCL5305 null
>  6.00null null  VAT6 UN/ECE 5153 null null  50.00EUR null  50.00EUR null
>  53.00EUR null  53.00EUR null null  102  null  5.00BX null  50.00EUR null
> null  PERF2020916145437null  PERF2020916145437null null  12  null null  S6
> UNCL5305 null  6.00null null  VAT6 UN/ECE 5153 null null  10.00EUR null
>  RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> DM001 XCNIN199751 NL:KVK:62440101 false false false false 10
> UA000002150000001:VB1 VB1:A000002150:vbgroupnft+1:1472808278137 Ontvangen
> 1472808279002 Factuur GLDT9223370666504283001RA000000006DTP2000001 VB1 VB1
> UA000002150000001 RA000002150AT009428 vbgroupnft+1 A000002150 Group
> 55788327 Va Bene XCNL034435 Va Bene
> LEA0000021509223370564294752110EXCC2000001 vbgroupnft+1 A000002150
> PERF2020916145437 Group 62440101 VA Beheer B.V. XCNL034436 VA Beheer B.V.
> LEA0000021509223370564294689844EXCC1000001
> STCUA0000021500000011472808279078 VB1 VB1 VB1 VB1 UA000002150000001 true
> Factuur GLDT9223370666504283001RA000000006DTP2000001 EM0001
> NL:KVK:55788327</CT_content>
> <CT_ranm>vbgroupnft+1</CT_ranm>
> <CT_lstc>10</CT_lstc>
> <CT_rgnm>Va Bene</CT_rgnm>
> <CT_tdr>true</CT_tdr>
> <CT_ui83_f>50.00</CT_ui83_f>
> <CT_ui64_s>NL</CT_ui64_s>
> <CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>100000,false</CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>
> <CT_scxkvk>62440101</CT_scxkvk>
> <CT_sgexid>XCNL034436</CT_sgexid>
> <CT_ui67_l>1475087400000</CT_ui67_l>
> <CT_mtnm>Factuur</CT_mtnm>
> <CT_ui8_s>2916847</CT_ui8_s>
> <CT_sunm>VB1 VB1</CT_sunm>
> <CT_ui66_s>31</CT_ui66_s>
> <CT_ui46_s>NL</CT_ui46_s>
> <CT_ui84_f>53.00</CT_ui84_f>
> <CT_lsts>Ontvangen</CT_lsts>
> <CT_ui42_s>26</CT_ui42_s>
> <rowkey>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001</rowkey>
> <CT_rgexid>XCNL034435</CT_rgexid>
> <CT_ui80_s>VAT</CT_ui80_s>
> <CT_sgexnm>VA Beheer B.V.</CT_sgexnm>
> <CT_ui16_s>VA Beheer B.V.</CT_ui16_s>
> <CT_ui44_s>2251BN</CT_ui44_s>
> <CT_ui38_s>Va Bene</CT_ui38_s>
> <CT_iscvd>false</CT_iscvd>
> <CT_munm>VB1 VB1</CT_munm>
> <CT_ui52_s>55788327</CT_ui52_s>
> <CT_ui1_s>2.1</CT_ui1_s>
> <CT_ui104_s>PERF2020916145437</CT_ui104_s>
> <CT_ui56_l>1475173800000</CT_ui56_l>
> <CT_tmsg>EM0001</CT_tmsg>
> <CT_sbj>PERF2020916145437</CT_sbj>
> <CT_ui4_s>PERF2020916145437</CT_ui4_s>
> <CT_ui3_s>urn:www.cenbii.eu:profile:bii04:ver2.0</CT_ui3_s>
> <CT_ui98_s>Betaling?binnen?14?dagen op bankrekening?2916847?onder vermelding van?55788327/PERF29161663</CT_ui98_s>
> <CT_ui5_l>1472754600000</CT_ui5_l>
> <CT_ui2_s>urn:www.cenbii.eu:transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</CT_ui2_s>
> <CT_ui88_f>5.00</CT_ui88_f>
> <CT_muid>UA000002150000001</CT_muid>
> <CT_ui36_s>55788327</CT_ui36_s>
> <CT_sby>Group</CT_sby>
> <CT_toid>NL:KVK:55788327</CT_toid>
> <CT_crid>LEA0000021509223370564294752110EXCC2000001</CT_crid>
> <CT_csid>LEA0000021509223370564294689844EXCC1000001</CT_csid>
> <CT_cid>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001</CT_cid>
> <CT_fmid>NL:KVK:62440101</CT_fmid>
> <CT_sgnm>VA Beheer B.V.</CT_sgnm>
> <CT_mdt>1472808279002</CT_mdt>
> <CT_ui113_f>10.00</CT_ui113_f>
> <CT_tnm>Factuur</CT_tnm>
> <CT_said>A000002150</CT_said>
> <CT_ui115_s>62440101</CT_ui115_s>
> <CT_suid>UA000002150000001</CT_suid>
> <CT_raid>A000002150</CT_raid>
> <CT_mtid>GLDT9223370666504283001RA000000006DTP2000001</CT_mtid>
> <CT_dmtd>DM001</CT_dmtd>
> <CT_rcxkvk>55788327</CT_rcxkvk>
> <CT_ui111_s>VAT</CT_ui111_s>
> <CT_ui106_s>1</CT_ui106_s>
> <CT_ui50_s>VAT</CT_ui50_s>
> <CT_ui14_s>2916847</CT_ui14_s>
> <CT_exid>XCNIN199751</CT_exid>
> <CT_sdur>VB1 VB1</CT_sdur>
> <CT_ui153_s>PERF2020916145437</CT_ui153_s>
> <CT_ui100_t1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> </CT_ui100_t1>
> <CT_ui21_s>Ommen</CT_ui21_s>
> <CT_ui109_s>6.00</CT_ui109_s>
> <CT_csnm>VA Beheer B.V.</CT_csnm>
> <CT_ui85_f>53.00</CT_ui85_f>
> <CT_rby>Group</CT_rby>
> <CT_tid>GLDT9223370666504283001RA000000006DTP2000001</CT_tid>
> <CT_ui108_s>S</CT_ui108_s>
> <CT_crnm>Va Bene</CT_crnm>
> <CT_ui26_s>2916847</CT_ui26_s>
> <CT_ui20_s>23</CT_ui20_s>
> <CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>100000,false</CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>
> <CT_ui101_s>PrimaryImage</CT_ui101_s>
> <CT_ui24_s>NL</CT_ui24_s>
> <CT_ui22_s>7731GC</CT_ui22_s>
> <CT_uctx_UA000002150000001_s1>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
> false</CT_uctx_UA000002150000001_s1>
> <CT_ui43_s>Voorschoten</CT_ui43_s>
> <CT__s_dxat_2>RA000002150AT009425#pdf.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f
> </CT__s_dxat_2>
> <CT__s_dxat_1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> </CT__s_dxat_1>
> <CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
> false</CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>
> <CT_ui70_s>2916847</CT_ui70_s>
> <CT_cdt>1472808279002</CT_cdt>
> <CT_ui19_s>Schurinkstraat</CT_ui19_s>
> <CT_sgid>LEA0000021509223370564294689844EXCC1000001</CT_sgid>
> <CT_rgexnm>Va Bene</CT_rgexnm>
> <CT_ui72_f>3.00</CT_ui72_f>
> <CT_ui87_s>10</CT_ui87_s>
> <CT__s_RU_O_UA000002150000001>100000,false</CT__s_RU_O_UA000002150000001>
> <CT_ui77_s>S</CT_ui77_s>
> <CT_cnm>PERF2020916145437</CT_cnm>
> <CT_sanm>vbgroupnft+1</CT_sanm>
> <CT_ird>false</CT_ird>
> <CT_ui146_s>380</CT_ui146_s>
> <CT_ui89_f>50.00</CT_ui89_f>
> <CT_ui41_s>Voorstraat</CT_ui41_s>
> <CT_daf>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> </CT_daf>
> <CT_ui78_s>6.00</CT_ui78_s>
> <CT_rgid>LEA0000021509223370564294752110EXCC2000001</CT_rgid>
> </doc>
> </add>
>
>
> The only difference is that when we post manually via Solr Admin, the
> request is fired with no concurrency. In the original run there were around
> 50 threads firing update POST requests, plus a few threads firing GET
> requests to different collections.
> A little more information about the setup:
> We have around 5 collections, and each collection has 2 shards (one shard on
> each node, one for the index and the other for the replica), on 2 nodes in
> total with a master-master setup.
>
> We see this problem only when around 50 threads are firing POST requests
> to various collections at the same time.
>
> The strange thing is that Solr does not return an error when it is unable
> to index a document. If Solr had returned an error, we could have retried
> the indexing. Is there any way to make Solr return an error instead of 200
> when it fails to index?
>
> Regards,
> Ganesh
>
> On Sun, Sep 4, 2016 at 10:11 PM Alexandre Rafalovitch <[hidden email]>
> wrote:
>
>> Can you identify the specific documents that 'fail'? What happens if
>> you post them manually? Try posting them manually but with one field
>> super-distinct to see whether it made it in. What happens if you post
>> it to an empty index (copy the definition and try)?
>>
>> Also, what do your request handler's parameters look like? Perhaps you
>> have a signature processor, in which case it may be triggering
>> duplicate avoidance with a different calculation than just an id.
>>
>> My guess is still that it is some sort of duplicate issue.
>>
>> Regards,
>>    Alex.
>> ----
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 4 September 2016 at 23:10, Ganesh M <[hidden email]> wrote:
>> > Some more information on this: most documents get indexed properly, but
>> > a few do not.
>> >
>> > All document POSTs appear in the localhost_access log with a 200 OK
>> > response. In catalina.out, however, the log entries differ. For a
>> > document that is indexed properly, the log looks like this:
>> >
>> > FINE: PRE_UPDATE add
>> >
>> {,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
>> >
>> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog <init>
>> > FINE: New TransactionLog
>> file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/tlog.0000000000000220856,
>> exists=false, size=0, openExisting=false
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor submit
>> > FINE: sending update to
>> http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0
>> add{version=1544254202941800448,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
>> params:update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
>> > Sep 01, 2016 7:39:31 AM
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
>> > FINE: starting runner:
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
>> > Sep 01, 2016 7:39:31 AM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> > FINE: PRE_UPDATE FINISH
>> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>> > Sep 01, 2016 7:39:31 AM
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
>> > FINE: finished:
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
>> > Sep 01, 2016 7:39:31 AM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
>> >
>> {crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
>> >
>> {add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001
>> (1544254202941800448)]}
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter
>> doFilter
>> > FINE: Closing out SolrRequest:
>> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>> > -------------------------------------------------
>> >
>> > For the one which document is not getting indexed, we could see only
>> following log in catalina.out. Not sure whether it's getting added to SOLR.
>> >
>> >
>> > Sep 01, 2016 7:39:56 AM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> > FINE: PRE_UPDATE FINISH
>> params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
>> > Sep 01, 2016 7:39:56 AM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
>> >
>> {crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002}
>> > {} 0 1
>> > Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter
>> doFilter
>> > FINE: Closing out SolrRequest:
>> params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
>> >
>> > ----------------------
>> >
>> > You can see that in above log for missing documents ( which is not
>> indexed), in catalina log, we are not seeing "PRE UPDATE ADD". Is that
>> causing / reason for document not getting indexed ?
>> >
>> > We have set autosoftcommit to 1 seconds and autohardcommit to 30 seconds.
>> >
>> > We are not getting any errors or exceptions in the log.
>> >
>> > This issue is becoming very critical and sort of reliable factor. Though
>> we get 200 OK response from SOLR for update HTTP POST request, nothing
>> happens on the SOLR side. If SOLR is not able to process, isn't it we get
>> error from SOLR instead of giving 200 OK response.
>> >
>> > Anybody has faced this sort of issue or any sort of help would be very
>> much appreciated.
>> >
>> >
>> >
>> >
>> > On Sun, Sep 4, 2016 at 12:59 PM Ganesh M <[hidden email]<mailto:
>> [hidden email]>> wrote:
>> > Nitin, Thanks for reply. Our each document has unique id and its hbase
>> rowkey id. So it will be unique only. So there is no chance of duplicates
>> id being send.
>> >
>> >
>> >
>> > On Sun 4 Sep, 2016 12:41 pm Nitin Kumar, <[hidden email]
>> <mailto:[hidden email]>> wrote:
>> > Please check doc's unique key(Id). All keys shd be unique. Else docs
>> having
>> > same id will be replaced.
>> >
>> > On 04-Sep-2016 12:13 PM, "Ganesh M" <[hidden email]<mailto:
>> [hidden email]>> wrote:
>> >
>> >> Hi,
>> >> we are keep sending documents to Solr from our app server. Single
>> document
>> >> per request, but in parallel of 10 request hits solr cloud in a second.
>> >>
>> >> We could see our post request ( update request ) hitting our solr 5.4 in
>> >> localhost_access logs, and it's response as 200 Ok response. And also we
>> >> get HTTP 200 OK response to our app servers as well for out HTTP
>> request we
>> >> fired to SOLR Cloud.
>> >>
>> >> But few documents are not getting indexed. Out of 2000 documents we sent
>> >> 10 documents are getting missed. Thought there is not error, few
>> documents
>> >> are getting missed.
>> >>
>> >> We use autoSoftcommit as 2 secs and autohardcommit as 30 secs.
>> >>
>> >> Why is that 10 documents not getting indexed and also no error getting
>> >> thrown back if server is not able to index it ?
>> >>
>> >> Regards,
>> >>
>> >>
>> >>
>> >>
>>

Re: Solr document missing or not getting indexed though we get 200 ok status from server

Ganesh M-3
In reply to this post by Dheerendra Kulkarni
Hi Dheerendra,

This doesn't always happen. When we add a single document, there is no
issue; it gets added. But when we add documents in parallel with 50
concurrent threads, 10 out of 2000 documents are missed (not indexed).
When this happens, we also tried a manual hard commit and an optimize
from the Admin screen, but the documents still do not get indexed.
As I mentioned, we are using autoSoftCommit of 1 sec and auto hard commit
of 30 seconds.

Regards,
Ganesh
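
The catalina.out excerpts quoted in this thread suggest one way to narrow
this down: an update that indexed a document logs `{add=[<id> (<version>)]}`
after the request params, while the missing documents log an empty `{}`.
Below is a minimal Python sketch that lists such requests. It assumes each
LogUpdateProcessor INFO entry sits on a single line in the real log (the mail
client wrapped them here). The function name and the sample lines are
illustrative only, not taken from Solr.

```python
import re

# Matches the LogUpdateProcessor summary for /update requests, e.g.
#   INFO: [core] webapp=/solr path=/update params={crid=X} {add=[X (123)]}
# Assumption: params and the command summary are on one line in the real log.
UPDATE_LINE = re.compile(
    r"\[(?P<core>[^\]]+)\]\s+webapp=\S+\s+path=/update\s+"
    r"params=\{(?P<params>[^}]*)\}\s*\{(?P<commands>[^}]*)\}"
)


def find_empty_updates(log_text):
    """Return (core, params) for /update requests that carried no commands."""
    empty = []
    for line in log_text.splitlines():
        match = UPDATE_LINE.search(line)
        if match and not match.group("commands").strip():
            empty.append((match.group("core"), match.group("params")))
    return empty


# Illustrative sample lines (ids shortened); not copied from a real log.
sample_log = "\n".join([
    "INFO: [IOB_shard1_replica1] webapp=/solr path=/update "
    "params={crid=DOC-A} {add=[DOC-A (1544254202941800448)]}",
    "INFO: [IOB_shard1_replica1] webapp=/solr path=/update "
    "params={crid=DOC-B} {} 0 1",
])

for core, params in find_empty_updates(sample_log):
    print(core, params)
```

An empty command summary would suggest that Solr handled the request but
found no <add> in it, which points at the request body (empty or truncated
under load) rather than at commits. That is an inference from the shape of
the log, not a confirmed diagnosis.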

On Mon, Sep 5, 2016 at 1:47 AM Dheerendra Kulkarni <[hidden email]>
wrote:

> Can you try this:
>
> 1. Add the document
> 2. Follow up by optimize in the core admin ui,
>
> If above works then you may need to check your commit.
>
> Regards,
> Dheerendra
>
> On Sun, Sep 4, 2016 at 10:47 PM, Ganesh M <[hidden email]>
> wrote:
>
> > Hi Alex,
> > We tried to post the same manually from SOLR ADMIN / documents UI. It got
> > indexed successfully.  We are sure that it's not duplicate issue. We are
> > using default update handler and doesn't configure for custom one. We
> fire
> > the request to index using direct HTTP request using <add> <doc> XML
> > format. We are getting 200 OK response. But not getting indexed.
> >
> > This is the request we fired and got 200. But not getting indexed. Same
> > request fired via SOLR ADMIN / Document UI, it's getting indexed
> > successfully.
> > <add>
> > <doc>
> > <CT_iscof>false</CT_iscof>
> > <CT_ui116_s>55788327</CT_ui116_s>
> > <CT_iscod>false</CT_iscod>
> > <CT_ui114_s>Factuur _PERF29161663_Voor _Va Bene.pdf</CT_ui114_s>
> > <CT_ui68_s>55788327-PERF29161663</CT_ui68_s>
> > <CT_ui75_f>3.00</CT_ui75_f>
> > <CT_ui48_s>2916847</CT_ui48_s>
> > <CT_stsid>STCUA0000021500000011472808279078</CT_stsid>
> > <CT_ui6_s>EUR</CT_ui6_s>
> > <CT_ui74_f>50.00</CT_ui74_f>
> > <CT_ui28_s>VAT</CT_ui28_s>
> > <CT_ui82_f>50.00</CT_ui82_f>
> > <CT_lsti>UA000002150000001:VB1
> > VB1:A000002150:vbgroupnft+1:1472808278137</CT_lsti>
> > <CT_pdfid>RA000002150AT009428</CT_pdfid>
> > <CT__s_RU_I_UA000002150000001>100000,false</CT__s_RU_I_UA000002150000001>
> > <CT_ui30_s>62440101</CT_ui30_s>
> > <CT_ui152_s>UNKNOWN</CT_ui152_s>
> > <CT_content> RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278632.png#f
> > RA000002150AT009425#pdf.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278843.png#f
> > 1472808279002
> > CLEA0000021509223370564294689844EXCC100000192233705640464967
> > 93C1LEA0000021509223370564294752110EXCC2000001
> > PERF2020916145437 LEA0000021509223370564294752110EXCC2000001 Va Bene VA
> > Beheer B.V. LEA0000021509223370564294689844EXCC1000001 VA Beheer B.V. VA
> > Beheer B.V.null null null  2.1null  urn:www.cenbii.eu:
> > transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
> > bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.xnull
> >  urn:www.cenbii.eu:profile:bii04:ver2.0null  PERF20209161454372  null
> >  1472754600000null  3806 UNCL1001 null  EUR6 ISO 4217 Alpha null null
> >  29168472  null null  pdf.pdf2  null null  RA000002150AT009425#pdf.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278843.png#fpdf.pdf
> > application/pdf null null  Factuur _PERF29161663_Voor _Va Bene.pdf2  null
> >  PrimaryImagenull null  RA000002150AT009424#Factuur _PERF29161663_Voor
> _Va
> > Bene.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278632.png#fFactuur
> > _PERF29161663_Voor _Va Bene.pdf application/pdf null null null
> > 62440101ZZZ
> > NL:KVK null null  2916847ZZZ NL:VAT null null  VA Beheer B.V.null null
> >  Schurinkstraatnull  23null  Ommennull  7731GCnull null  NL6
> > ISO3166-1:Alpha2 null null  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153
> > null null  62440101ZZZ NL:KVK null null null  55788327ZZZ NL:KVK null
> null
> >  55788327ZZZ NL:KVK null null  Va Benenull null  Voorstraatnull  26null
> >  Voorschotennull  2251BNnull null  NL6 ISO3166-1:Alpha2 null null
> >  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153 null null  55788327ZZZ
> > NL:KVK null null  1475173800000null null null null  NL6 ISO3166-1:Alpha2
> > null null  316 UNCL4461 null  1475087400000null
> 55788327-PERF29161663null
> > null  29168472 IBAN null  UNKNOWNBIC null  Betaling?binnen?14?dagen op
> > bankrekening?2916847?onder vermelding van?55788327/PERF29161663null null
> >  3.00EUR null null  50.00EUR null  3.00EUR null null  S6 UNCL5305 null
> >  6.00null null  VAT6 UN/ECE 5153 null null  50.00EUR null  50.00EUR null
> >  53.00EUR null  53.00EUR null null  102  null  5.00BX null  50.00EUR null
> > null  PERF2020916145437null  PERF2020916145437null null  12  null null
> S6
> > UNCL5305 null  6.00null null  VAT6 UN/ECE 5153 null null  10.00EUR null
> >  RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278632.png#f
> > DM001 XCNIN199751 NL:KVK:62440101 false false false false 10
> > UA000002150000001:VB1 VB1:A000002150:vbgroupnft+1:1472808278137 Ontvangen
> > 1472808279002 Factuur GLDT9223370666504283001RA000000006DTP2000001 VB1
> VB1
> > UA000002150000001 RA000002150AT009428 vbgroupnft+1 A000002150 Group
> > 55788327 Va Bene XCNL034435 Va Bene
> > LEA0000021509223370564294752110EXCC2000001 vbgroupnft+1 A000002150
> > PERF2020916145437 Group 62440101 VA Beheer B.V. XCNL034436 VA Beheer B.V.
> > LEA0000021509223370564294689844EXCC1000001
> > STCUA0000021500000011472808279078 VB1 VB1 VB1 VB1 UA000002150000001 true
> > Factuur GLDT9223370666504283001RA000000006DTP2000001 EM0001
> > NL:KVK:55788327</CT_content>
> > <CT_ranm>vbgroupnft+1</CT_ranm>
> > <CT_lstc>10</CT_lstc>
> > <CT_rgnm>Va Bene</CT_rgnm>
> > <CT_tdr>true</CT_tdr>
> > <CT_ui83_f>50.00</CT_ui83_f>
> > <CT_ui64_s>NL</CT_ui64_s>
> > <CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>100000,false</CT_
> > _s_RU_O_LEA0000021509223370564294689844EXCC1000001>
> > <CT_scxkvk>62440101</CT_scxkvk>
> > <CT_sgexid>XCNL034436</CT_sgexid>
> > <CT_ui67_l>1475087400000</CT_ui67_l>
> > <CT_mtnm>Factuur</CT_mtnm>
> > <CT_ui8_s>2916847</CT_ui8_s>
> > <CT_sunm>VB1 VB1</CT_sunm>
> > <CT_ui66_s>31</CT_ui66_s>
> > <CT_ui46_s>NL</CT_ui46_s>
> > <CT_ui84_f>53.00</CT_ui84_f>
> > <CT_lsts>Ontvangen</CT_lsts>
> > <CT_ui42_s>26</CT_ui42_s>
> > <rowkey>CLEA0000021509223370564294689844EXCC100000192233705640464967
> > 93C1LEA0000021509223370564294752110EXCC2000001</rowkey>
> > <CT_rgexid>XCNL034435</CT_rgexid>
> > <CT_ui80_s>VAT</CT_ui80_s>
> > <CT_sgexnm>VA Beheer B.V.</CT_sgexnm>
> > <CT_ui16_s>VA Beheer B.V.</CT_ui16_s>
> > <CT_ui44_s>2251BN</CT_ui44_s>
> > <CT_ui38_s>Va Bene</CT_ui38_s>
> > <CT_iscvd>false</CT_iscvd>
> > <CT_munm>VB1 VB1</CT_munm>
> > <CT_ui52_s>55788327</CT_ui52_s>
> > <CT_ui1_s>2.1</CT_ui1_s>
> > <CT_ui104_s>PERF2020916145437</CT_ui104_s>
> > <CT_ui56_l>1475173800000</CT_ui56_l>
> > <CT_tmsg>EM0001</CT_tmsg>
> > <CT_sbj>PERF2020916145437</CT_sbj>
> > <CT_ui4_s>PERF2020916145437</CT_ui4_s>
> > <CT_ui3_s>urn:www.cenbii.eu:profile:bii04:ver2.0</CT_ui3_s>
> > <CT_ui98_s>Betaling?binnen?14?dagen op bankrekening?2916847?onder
> > vermelding van?55788327/PERF29161663</CT_ui98_s>
> > <CT_ui5_l>1472754600000</CT_ui5_l>
> > <CT_ui2_s>urn:www.cenbii.eu:
> > transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
> > bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:
> > si:si-ubl:ver1.1.x</CT_ui2_s>
> > <CT_ui88_f>5.00</CT_ui88_f>
> > <CT_muid>UA000002150000001</CT_muid>
> > <CT_ui36_s>55788327</CT_ui36_s>
> > <CT_sby>Group</CT_sby>
> > <CT_toid>NL:KVK:55788327</CT_toid>
> > <CT_crid>LEA0000021509223370564294752110EXCC2000001</CT_crid>
> > <CT_csid>LEA0000021509223370564294689844EXCC1000001</CT_csid>
> > <CT_cid>CLEA0000021509223370564294689844EXCC100000192233705640464967
> > 93C1LEA0000021509223370564294752110EXCC2000001</CT_cid>
> > <CT_fmid>NL:KVK:62440101</CT_fmid>
> > <CT_sgnm>VA Beheer B.V.</CT_sgnm>
> > <CT_mdt>1472808279002</CT_mdt>
> > <CT_ui113_f>10.00</CT_ui113_f>
> > <CT_tnm>Factuur</CT_tnm>
> > <CT_said>A000002150</CT_said>
> > <CT_ui115_s>62440101</CT_ui115_s>
> > <CT_suid>UA000002150000001</CT_suid>
> > <CT_raid>A000002150</CT_raid>
> > <CT_mtid>GLDT9223370666504283001RA000000006DTP2000001</CT_mtid>
> > <CT_dmtd>DM001</CT_dmtd>
> > <CT_rcxkvk>55788327</CT_rcxkvk>
> > <CT_ui111_s>VAT</CT_ui111_s>
> > <CT_ui106_s>1</CT_ui106_s>
> > <CT_ui50_s>VAT</CT_ui50_s>
> > <CT_ui14_s>2916847</CT_ui14_s>
> > <CT_exid>XCNIN199751</CT_exid>
> > <CT_sdur>VB1 VB1</CT_sdur>
> > <CT_ui153_s>PERF2020916145437</CT_ui153_s>
> > <CT_ui100_t1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278632.png#f
> > </CT_ui100_t1>
> > <CT_ui21_s>Ommen</CT_ui21_s>
> > <CT_ui109_s>6.00</CT_ui109_s>
> > <CT_csnm>VA Beheer B.V.</CT_csnm>
> > <CT_ui85_f>53.00</CT_ui85_f>
> > <CT_rby>Group</CT_rby>
> > <CT_tid>GLDT9223370666504283001RA000000006DTP2000001</CT_tid>
> > <CT_ui108_s>S</CT_ui108_s>
> > <CT_crnm>Va Bene</CT_crnm>
> > <CT_ui26_s>2916847</CT_ui26_s>
> > <CT_ui20_s>23</CT_ui20_s>
> > <CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>100000,false</CT_
> > _s_RU_I_LEA0000021509223370564294752110EXCC2000001>
> > <CT_ui101_s>PrimaryImage</CT_ui101_s>
> > <CT_ui24_s>NL</CT_ui24_s>
> > <CT_ui22_s>7731GC</CT_ui22_s>
> > <CT_uctx_UA000002150000001_s1>CLEA00000215092233705642946898
> > 44EXCC10000019223370564046496793C1LEA00000215092233705642947
> > 52110EXCC2000001
> > false</CT_uctx_UA000002150000001_s1>
> > <CT_ui43_s>Voorschoten</CT_ui43_s>
> > <CT__s_dxat_2>RA000002150AT009425#pdf.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278843.png#f
> > </CT__s_dxat_2>
> > <CT__s_dxat_1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va
> Bene.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278632.png#f
> > </CT__s_dxat_1>
> > <CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>
> > CLEA0000021509223370564294689844EXCC100000192233705640464967
> > 93C1LEA0000021509223370564294752110EXCC2000001
> > false</CT_uctx_UA000002150000001_LEA000002150922337056429468984
> > 4EXCC1000001_s1>
> > <CT_ui70_s>2916847</CT_ui70_s>
> > <CT_cdt>1472808279002</CT_cdt>
> > <CT_ui19_s>Schurinkstraat</CT_ui19_s>
> > <CT_sgid>LEA0000021509223370564294689844EXCC1000001</CT_sgid>
> > <CT_rgexnm>Va Bene</CT_rgexnm>
> > <CT_ui72_f>3.00</CT_ui72_f>
> > <CT_ui87_s>10</CT_ui87_s>
> > <CT__s_RU_O_UA000002150000001>100000,false</CT__s_RU_O_UA000002150000001>
> > <CT_ui77_s>S</CT_ui77_s>
> > <CT_cnm>PERF2020916145437</CT_cnm>
> > <CT_sanm>vbgroupnft+1</CT_sanm>
> > <CT_ird>false</CT_ird>
> > <CT_ui146_s>380</CT_ui146_s>
> > <CT_ui89_f>50.00</CT_ui89_f>
> > <CT_ui41_s>Voorstraat</CT_ui41_s>
> > <CT_daf>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/
> > UA000002150000001/1472808278632.png#f
> > </CT_daf>
> > <CT_ui78_s>6.00</CT_ui78_s>
> > <CT_rgid>LEA0000021509223370564294752110EXCC2000001</CT_rgid>
> > </doc>
> > </add>
> >
> >
> > Only difference is when we post via manually via SOLR ADMIN, it's fired
> > when there is no concurrency. But initially there would be around 50
> > threads firing update POST request and also few threads fire's GET
> request
> > to different collections.
> > Little more information about the setup....
> > We have around 5 Collection and each collection has 2 shards ( one shard
> in
> > each node, one shard for index and other for replica), totally 2 nodes
> with
> > master master setup.
> >
> > We are getting this error only when there is concurrency of of around 50
> > threads firing POST request to various collections same time.
> >
> > Strange thing is why SOLR not returning error when it's not able to index
> > it. If SOLR has returned error, we could have retry the document
> indexing.
> > Is there any way we can make SOLR to return error instead of 200 when
> they
> > fail to index ?
> >
> > Regards,
> > Ganesh
> >
> > On Sun, Sep 4, 2016 at 10:11 PM Alexandre Rafalovitch <
> [hidden email]>
> > wrote:
> >
> > > Can you identify the specific documents that 'fail'? What happens if
> > > you post them manually? Try posting them manually but with one field
> > > super-distinct to see whether it made it in. What happens if you post
> > > it to an empty index (copy definition and try).
> > >
> > > Also, what's your request handler's parameters look like. Perhaps you
> > > have a signature processor, in which case it may be triggering
> > > duplicates avoidance with different calculation from just an id.
> > >
> > > My guess is still that it is some sort of duplicate issue.
> > >
> > > Regards,
> > >    Alex.
> > > ----
> > > Newsletter and resources for Solr beginners and intermediates:
> > > http://www.solr-start.com/
> > >
> > >
> > > On 4 September 2016 at 23:10, Ganesh M <[hidden email]> wrote:
> > > > Some more information on this... Most of documents get indexed
> > properly.
> > > Few documents are not getting indexed.
> > > >
> > > > All documents POST are seen in the localhost_access and 200 OK
> response
> > > is seen in local host access file. But in catalina, there are some
> > > difference in the logs for which are indexing properly, following is
> the
> > > logs.
> > > >
> > > > FINE: PRE_UPDATE add
> > > >
> > > {,id=CUA0000004390000019223370564139207241C3LEA000002076922337056
> > 7404392838EXCC3000001}
> > > >
> > >
> params(crid=CUA0000004390000019223370564139207241C3LEA000002076922337056
> > 7404392838EXCC3000001),defaults(wt=xml)
> > > > Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog <init>
> > > > FINE: New TransactionLog
> > > file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/
> > tlog.0000000000000220856,
> > > exists=false, size=0, openExisting=false
> > > > Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor
> > submit
> > > > FINE: sending update to
> > > http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0
> > > add{version=1544254202941800448,id=CUA000000439000001922337056413
> > 9207241C3LEA0000020769223370567404392838EXCC3000001}
> > > params:update.distrib=FROMLEADER&distrib.from=http%
> > 3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
> > > > Sep 01, 2016 7:39:31 AM
> > > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> > > > FINE: starting runner:
> > > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$
> > Runner@3fb794b2
> > > > Sep 01, 2016 7:39:31 AM
> > > org.apache.solr.update.processor.LogUpdateProcessor finish
> > > > FINE: PRE_UPDATE FINISH
> > >
> params(crid=CUA0000004390000019223370564139207241C3LEA000002076922337056
> > 7404392838EXCC3000001),defaults(wt=xml)
> > > > Sep 01, 2016 7:39:31 AM
> > > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> > > > FINE: finished:
> > > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$
> > Runner@3fb794b2
> > > > Sep 01, 2016 7:39:31 AM
> > > org.apache.solr.update.processor.LogUpdateProcessor finish
> > > > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> > > >
> > > {crid=CUA0000004390000019223370564139207241C3LEA000002076922337056
> > 7404392838EXCC3000001}
> > > >
> > > {add=[CUA0000004390000019223370564139207241C3LEA000002076922337056
> > 7404392838EXCC3000001
> > > (1544254202941800448)]}
> > > > Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter
> > > doFilter
> > > > FINE: Closing out SolrRequest:
> > >
> params(crid=CUA0000004390000019223370564139207241C3LEA000002076922337056
> > 7404392838EXCC3000001),defaults(wt=xml)
> > > > -------------------------------------------------
> > > >
> > > > For the one which document is not getting indexed, we could see only
> > > following log in catalina.out. Not sure whether it's getting added to
> > SOLR.
> > > >
> > > >
> > > > Sep 01, 2016 7:39:56 AM
> > > org.apache.solr.update.processor.LogUpdateProcessor finish
> > > > FINE: PRE_UPDATE FINISH
> > >
> params(crid=CUA0000004390000019223370564139182810C3LEA000002017922337056
> > 7061972057EXCC1000002),defaults(wt=xml)
> > > > Sep 01, 2016 7:39:56 AM
> > > org.apache.solr.update.processor.LogUpdateProcessor finish
> > > > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> > > >
> > > {crid=CUA0000004390000019223370564139182810C3LEA000002017922337056
> > 7061972057EXCC1000002}
> > > > {} 0 1
> > > > Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter
> > > doFilter
> > > > FINE: Closing out SolrRequest:
> > >
> params(crid=CUA0000004390000019223370564139182810C3LEA000002017922337056
> > 7061972057EXCC1000002),defaults(wt=xml)
> > > >
> > > > ----------------------
> > > >
> > > > You can see that in above log for missing documents ( which is not
> > > indexed), in catalina log, we are not seeing "PRE UPDATE ADD". Is that
> > > causing / reason for document not getting indexed ?
> > > >
> > > > We have set autosoftcommit to 1 seconds and autohardcommit to 30
> > seconds.
> > > >
> > > > We are not getting any errors or exceptions in the log.
> > > >
> > > > This issue is becoming very critical and sort of reliable factor.
> > Though
> > > we get 200 OK response from SOLR for update HTTP POST request, nothing
> > > happens on the SOLR side. If SOLR is not able to process, isn't it we
> get
> > > error from SOLR instead of giving 200 OK response.
> > > >
> > > > Anybody has faced this sort of issue or any sort of help would be
> very
> > > much appreciated.
> > > >
> > > >
> > > >
> > > >
> > > > On Sun, Sep 4, 2016 at 12:59 PM Ganesh M <[hidden email]<mailto:
> > > [hidden email]>> wrote:
> > > > Nitin, Thanks for reply. Our each document has unique id and its
> hbase
> > > rowkey id. So it will be unique only. So there is no chance of
> duplicates
> > > id being send.
> > > >
> > > >
> > > >
> > > > On Sun 4 Sep, 2016 12:41 pm Nitin Kumar, <[hidden email]
> > > <mailto:[hidden email]>> wrote:
> > > > Please check doc's unique key(Id). All keys shd be unique. Else docs
> > > having
> > > > same id will be replaced.
> > > >
> > > > On 04-Sep-2016 12:13 PM, "Ganesh M" <[hidden email]<mailto:
> > > [hidden email]>> wrote:
> > > >
> > > >> Hi,
> > > >> we are keep sending documents to Solr from our app server. Single
> > > document
> > > >> per request, but in parallel of 10 request hits solr cloud in a
> > second.
> > > >>
> > > >> We could see our post request ( update request ) hitting our solr
> 5.4
> > in
> > > >> localhost_access logs, and it's response as 200 Ok response. And
> also
> > we
> > > >> get HTTP 200 OK response to our app servers as well for out HTTP
> > > request we
> > > >> fired to SOLR Cloud.
> > > >>
> > > >> But few documents are not getting indexed. Out of 2000 documents we
> > sent
> > > >> 10 documents are getting missed. Thought there is not error, few
> > > documents
> > > >> are getting missed.
> > > >>
> > > >> We use autoSoftcommit as 2 secs and autohardcommit as 30 secs.
> > > >>
> > > >> Why is that 10 documents not getting indexed and also no error
> getting
> > > >> thrown back if server is not able to index it ?
> > > >>
> > > >> Regards,
> > > >>
> > > >>
> > > >>
> > > >>
> > >
> >
>
>
>
> --
> Regards,
> Dheerendra
>

Re: Solr document missing or not getting indexed though we get 200 ok status from server

Ganesh M-3
In reply to this post by Alexandre Rafalovitch
Hi Alex,

We have captured all HTTP POST traffic going out from the app server to
SOLR. That particular document with that id (in our case it's the rowkey)
goes out to SOLR only once. Also, on the SOLR side we have enabled
localhost_access logs, and we could see that the document with that unique
ID arrived only once, and a 200 OK response was logged for it. So we are
sure that identical documents are not going to SOLR.
We were using 4.10.2; as we faced this issue, we migrated to 5.4, and we
see the same issue in SOLR 5.4 too.
My big question is why SOLR doesn't throw an error when it is not able to
handle the request, whether due to concurrency or some other reason. Maybe
we are not using it right, but we couldn't nail down the problem. We are
losing confidence in SOLR's reliability due to this, though SOLR is really
not unreliable.
Is there any limit on the number of threads / concurrency beyond which SOLR
behaves strangely like this? Are there any settings or configurations to
control this?

Regards,
Ganesh
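
Since a 200 OK alone is not proving to be enough here, one client-side
workaround is to verify after posting and re-send whatever is missing. Below
is a minimal Python sketch under stated assumptions: `post_document` and
`lookup_ids` are hypothetical callables supplied by the application (for
example, `lookup_ids` could query Solr's real-time get handler with the
rowkey values). Nothing here is part of Solr itself, and the fake server
below only stands in for the behaviour described in this thread.

```python
def reconcile(documents, post_document, lookup_ids, max_rounds=3):
    """Post documents, verify by id, and re-post any that are missing.

    documents:     dict mapping unique id (the rowkey) -> request body
    post_document: hypothetical callable(body) that sends one update
    lookup_ids:    hypothetical callable(ids) -> set of ids found in the index
    Returns the set of ids still missing after max_rounds attempts.
    """
    pending = set(documents)
    for _ in range(max_rounds):
        if not pending:
            break
        for doc_id in sorted(pending):
            post_document(documents[doc_id])
        pending -= lookup_ids(pending)
    return pending


# Stand-in for a server that silently drops the first attempt for "k2".
# For brevity the request body is just the id itself.
index, dropped_once = set(), {"k2"}


def fake_post(body):
    if body in dropped_once:
        dropped_once.discard(body)  # dropped now, accepted on the retry
        return
    index.add(body)


def fake_lookup(ids):
    return index & set(ids)


docs = {"k1": "k1", "k2": "k2", "k3": "k3"}
still_missing = reconcile(docs, fake_post, fake_lookup)
print(sorted(still_missing))
```

If the lookup is done with a normal search instead of real-time get, it only
sees documents after a soft commit, so the check would need to wait at least
the autoSoftCommit interval before treating a document as missing.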

On Mon, Sep 5, 2016 at 8:13 AM Alexandre Rafalovitch <[hidden email]>
wrote:

> I can't tell anything from the document provided. So, here would be my
> thoughts:
>
> If what you see is some sort of concurrency issues, the documents
> missed/dropped would unlikely be exactly the same ones. So, if you see
> the same documents dropped, it is much more likely to be something to
> do with documents, with handler end-points, with sharding, etc.
>
> If this is easily reproducible, I would run a network analyzer such as
> Wireshark and compare your Admin UI session with your client session
> and verify that everything expected is absolutely identical.
>
> You could also temporarily turn on Debug via Admin console (under
> logs). You could turn individual elements to Trace to get low-level
> information on what's happening.
>
> Finally, I am assuming this is all happening with latest Solr? If not,
> it may be worth trying that and/or checking Jira for bugs. Lots of
> things have been fixed/improved in more recent Solr related to
> multi-threaded, multi-server setups.
>
> Regards,
>    Alex.
>
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 5 September 2016 at 00:17, Ganesh M <[hidden email]>
> wrote:
> > Hi Alex,
> > We tried to post the same manually from SOLR ADMIN / documents UI. It got
> > indexed successfully.  We are sure that it's not duplicate issue. We are
> > using default update handler and doesn't configure for custom one. We
> fire
> > the request to index using direct HTTP request using <add> <doc> XML
> > format. We are getting 200 OK response. But not getting indexed.
> >
> > This is the request we fired and got 200. But not getting indexed. Same
> > request fired via SOLR ADMIN / Document UI, it's getting indexed
> > successfully.
> > <add>
> > <doc>
> > <CT_iscof>false</CT_iscof>
> > <CT_ui116_s>55788327</CT_ui116_s>
> > <CT_iscod>false</CT_iscod>
> > <CT_ui114_s>Factuur _PERF29161663_Voor _Va Bene.pdf</CT_ui114_s>
> > <CT_ui68_s>55788327-PERF29161663</CT_ui68_s>
> > <CT_ui75_f>3.00</CT_ui75_f>
> > <CT_ui48_s>2916847</CT_ui48_s>
> > <CT_stsid>STCUA0000021500000011472808279078</CT_stsid>
> > <CT_ui6_s>EUR</CT_ui6_s>
> > <CT_ui74_f>50.00</CT_ui74_f>
> > <CT_ui28_s>VAT</CT_ui28_s>
> > <CT_ui82_f>50.00</CT_ui82_f>
> > <CT_lsti>UA000002150000001:VB1
> > VB1:A000002150:vbgroupnft+1:1472808278137</CT_lsti>
> > <CT_pdfid>RA000002150AT009428</CT_pdfid>
> > <CT__s_RU_I_UA000002150000001>100000,false</CT__s_RU_I_UA000002150000001>
> > <CT_ui30_s>62440101</CT_ui30_s>
> > <CT_ui152_s>UNKNOWN</CT_ui152_s>
> > <CT_content> RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> > RA000002150AT009425#pdf.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f
> > 1472808279002
> >
> CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
> > PERF2020916145437 LEA0000021509223370564294752110EXCC2000001 Va Bene VA
> > Beheer B.V. LEA0000021509223370564294689844EXCC1000001 VA Beheer B.V. VA
> > Beheer B.V.null null null  2.1null  urn:www.cenbii.eu:
> > transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
> > bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.xnull
> >  urn:www.cenbii.eu:profile:bii04:ver2.0null  PERF20209161454372  null
> >  1472754600000null  3806 UNCL1001 null  EUR6 ISO 4217 Alpha null null
> >  29168472  null null  pdf.pdf2  null null  RA000002150AT009425#pdf.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#fpdf.pdf
> > application/pdf null null  Factuur _PERF29161663_Voor _Va Bene.pdf2  null
> >  PrimaryImagenull null  RA000002150AT009424#Factuur _PERF29161663_Voor
> _Va
> > Bene.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#fFactuur
> > _PERF29161663_Voor _Va Bene.pdf application/pdf null null null
> 62440101ZZZ
> > NL:KVK null null  2916847ZZZ NL:VAT null null  VA Beheer B.V.null null
> >  Schurinkstraatnull  23null  Ommennull  7731GCnull null  NL6
> > ISO3166-1:Alpha2 null null  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153
> > null null  62440101ZZZ NL:KVK null null null  55788327ZZZ NL:KVK null
> null
> >  55788327ZZZ NL:KVK null null  Va Benenull null  Voorstraatnull  26null
> >  Voorschotennull  2251BNnull null  NL6 ISO3166-1:Alpha2 null null
> >  2916847ZZZ NL:VAT null null  VAT6 UN/ECE 5153 null null  55788327ZZZ
> > NL:KVK null null  1475173800000null null null null  NL6 ISO3166-1:Alpha2
> > null null  316 UNCL4461 null  1475087400000null
> 55788327-PERF29161663null
> > null  29168472 IBAN null  UNKNOWNBIC null  Betaling?binnen?14?dagen op
> > bankrekening?2916847?onder vermelding van?55788327/PERF29161663null null
> >  3.00EUR null null  50.00EUR null  3.00EUR null null  S6 UNCL5305 null
> >  6.00null null  VAT6 UN/ECE 5153 null null  50.00EUR null  50.00EUR null
> >  53.00EUR null  53.00EUR null null  102  null  5.00BX null  50.00EUR null
> > null  PERF2020916145437null  PERF2020916145437null null  12  null null
> S6
> > UNCL5305 null  6.00null null  VAT6 UN/ECE 5153 null null  10.00EUR null
> >  RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> > DM001 XCNIN199751 NL:KVK:62440101 false false false false 10
> > UA000002150000001:VB1 VB1:A000002150:vbgroupnft+1:1472808278137 Ontvangen
> > 1472808279002 Factuur GLDT9223370666504283001RA000000006DTP2000001 VB1
> VB1
> > UA000002150000001 RA000002150AT009428 vbgroupnft+1 A000002150 Group
> > 55788327 Va Bene XCNL034435 Va Bene
> > LEA0000021509223370564294752110EXCC2000001 vbgroupnft+1 A000002150
> > PERF2020916145437 Group 62440101 VA Beheer B.V. XCNL034436 VA Beheer B.V.
> > LEA0000021509223370564294689844EXCC1000001
> > STCUA0000021500000011472808279078 VB1 VB1 VB1 VB1 UA000002150000001 true
> > Factuur GLDT9223370666504283001RA000000006DTP2000001 EM0001
> > NL:KVK:55788327</CT_content>
> > <CT_ranm>vbgroupnft+1</CT_ranm>
> > <CT_lstc>10</CT_lstc>
> > <CT_rgnm>Va Bene</CT_rgnm>
> > <CT_tdr>true</CT_tdr>
> > <CT_ui83_f>50.00</CT_ui83_f>
> > <CT_ui64_s>NL</CT_ui64_s>
> >
> <CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>100000,false</CT__s_RU_O_LEA0000021509223370564294689844EXCC1000001>
> > <CT_scxkvk>62440101</CT_scxkvk>
> > <CT_sgexid>XCNL034436</CT_sgexid>
> > <CT_ui67_l>1475087400000</CT_ui67_l>
> > <CT_mtnm>Factuur</CT_mtnm>
> > <CT_ui8_s>2916847</CT_ui8_s>
> > <CT_sunm>VB1 VB1</CT_sunm>
> > <CT_ui66_s>31</CT_ui66_s>
> > <CT_ui46_s>NL</CT_ui46_s>
> > <CT_ui84_f>53.00</CT_ui84_f>
> > <CT_lsts>Ontvangen</CT_lsts>
> > <CT_ui42_s>26</CT_ui42_s>
> >
> <rowkey>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001</rowkey>
> > <CT_rgexid>XCNL034435</CT_rgexid>
> > <CT_ui80_s>VAT</CT_ui80_s>
> > <CT_sgexnm>VA Beheer B.V.</CT_sgexnm>
> > <CT_ui16_s>VA Beheer B.V.</CT_ui16_s>
> > <CT_ui44_s>2251BN</CT_ui44_s>
> > <CT_ui38_s>Va Bene</CT_ui38_s>
> > <CT_iscvd>false</CT_iscvd>
> > <CT_munm>VB1 VB1</CT_munm>
> > <CT_ui52_s>55788327</CT_ui52_s>
> > <CT_ui1_s>2.1</CT_ui1_s>
> > <CT_ui104_s>PERF2020916145437</CT_ui104_s>
> > <CT_ui56_l>1475173800000</CT_ui56_l>
> > <CT_tmsg>EM0001</CT_tmsg>
> > <CT_sbj>PERF2020916145437</CT_sbj>
> > <CT_ui4_s>PERF2020916145437</CT_ui4_s>
> > <CT_ui3_s>urn:www.cenbii.eu:profile:bii04:ver2.0</CT_ui3_s>
> > <CT_ui98_s>Betaling?binnen?14?dagen op bankrekening?2916847?onder
> > vermelding van?55788327/PERF29161663</CT_ui98_s>
> > <CT_ui5_l>1472754600000</CT_ui5_l>
> > <CT_ui2_s>urn:www.cenbii.eu:
> > transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu:
> > bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:
> > si:si-ubl:ver1.1.x</CT_ui2_s>
> > <CT_ui88_f>5.00</CT_ui88_f>
> > <CT_muid>UA000002150000001</CT_muid>
> > <CT_ui36_s>55788327</CT_ui36_s>
> > <CT_sby>Group</CT_sby>
> > <CT_toid>NL:KVK:55788327</CT_toid>
> > <CT_crid>LEA0000021509223370564294752110EXCC2000001</CT_crid>
> > <CT_csid>LEA0000021509223370564294689844EXCC1000001</CT_csid>
> >
> <CT_cid>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001</CT_cid>
> > <CT_fmid>NL:KVK:62440101</CT_fmid>
> > <CT_sgnm>VA Beheer B.V.</CT_sgnm>
> > <CT_mdt>1472808279002</CT_mdt>
> > <CT_ui113_f>10.00</CT_ui113_f>
> > <CT_tnm>Factuur</CT_tnm>
> > <CT_said>A000002150</CT_said>
> > <CT_ui115_s>62440101</CT_ui115_s>
> > <CT_suid>UA000002150000001</CT_suid>
> > <CT_raid>A000002150</CT_raid>
> > <CT_mtid>GLDT9223370666504283001RA000000006DTP2000001</CT_mtid>
> > <CT_dmtd>DM001</CT_dmtd>
> > <CT_rcxkvk>55788327</CT_rcxkvk>
> > <CT_ui111_s>VAT</CT_ui111_s>
> > <CT_ui106_s>1</CT_ui106_s>
> > <CT_ui50_s>VAT</CT_ui50_s>
> > <CT_ui14_s>2916847</CT_ui14_s>
> > <CT_exid>XCNIN199751</CT_exid>
> > <CT_sdur>VB1 VB1</CT_sdur>
> > <CT_ui153_s>PERF2020916145437</CT_ui153_s>
> > <CT_ui100_t1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> > </CT_ui100_t1>
> > <CT_ui21_s>Ommen</CT_ui21_s>
> > <CT_ui109_s>6.00</CT_ui109_s>
> > <CT_csnm>VA Beheer B.V.</CT_csnm>
> > <CT_ui85_f>53.00</CT_ui85_f>
> > <CT_rby>Group</CT_rby>
> > <CT_tid>GLDT9223370666504283001RA000000006DTP2000001</CT_tid>
> > <CT_ui108_s>S</CT_ui108_s>
> > <CT_crnm>Va Bene</CT_crnm>
> > <CT_ui26_s>2916847</CT_ui26_s>
> > <CT_ui20_s>23</CT_ui20_s>
> >
> <CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>100000,false</CT__s_RU_I_LEA0000021509223370564294752110EXCC2000001>
> > <CT_ui101_s>PrimaryImage</CT_ui101_s>
> > <CT_ui24_s>NL</CT_ui24_s>
> > <CT_ui22_s>7731GC</CT_ui22_s>
> >
> <CT_uctx_UA000002150000001_s1>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
> > false</CT_uctx_UA000002150000001_s1>
> > <CT_ui43_s>Voorschoten</CT_ui43_s>
> > <CT__s_dxat_2>RA000002150AT009425#pdf.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f
> > </CT__s_dxat_2>
> > <CT__s_dxat_1>RA000002150AT009424#Factuur _PERF29161663_Voor _Va
> Bene.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> > </CT__s_dxat_1>
> >
> <CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001
> >
> false</CT_uctx_UA000002150000001_LEA0000021509223370564294689844EXCC1000001_s1>
> > <CT_ui70_s>2916847</CT_ui70_s>
> > <CT_cdt>1472808279002</CT_cdt>
> > <CT_ui19_s>Schurinkstraat</CT_ui19_s>
> > <CT_sgid>LEA0000021509223370564294689844EXCC1000001</CT_sgid>
> > <CT_rgexnm>Va Bene</CT_rgexnm>
> > <CT_ui72_f>3.00</CT_ui72_f>
> > <CT_ui87_s>10</CT_ui87_s>
> > <CT__s_RU_O_UA000002150000001>100000,false</CT__s_RU_O_UA000002150000001>
> > <CT_ui77_s>S</CT_ui77_s>
> > <CT_cnm>PERF2020916145437</CT_cnm>
> > <CT_sanm>vbgroupnft+1</CT_sanm>
> > <CT_ird>false</CT_ird>
> > <CT_ui146_s>380</CT_ui146_s>
> > <CT_ui89_f>50.00</CT_ui89_f>
> > <CT_ui41_s>Voorstraat</CT_ui41_s>
> > <CT_daf>RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf#
> >
> http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f
> > </CT_daf>
> > <CT_ui78_s>6.00</CT_ui78_s>
> > <CT_rgid>LEA0000021509223370564294752110EXCC2000001</CT_rgid>
> > </doc>
> > </add>
> >
> >
> > The only difference is that when we post manually via SOLR ADMIN, it's
> > fired when there is no concurrency. But initially there would be around 50
> > threads firing update POST requests, and also a few threads firing GET
> > requests to different collections.
> > A little more information about the setup....
> > We have around 5 collections and each collection has 2 shards ( one shard
> > in each node, one shard for index and other for replica), totally 2 nodes
> > with master master setup.
> >
> > We are getting this error only when there is a concurrency of around 50
> > threads firing POST requests to various collections at the same time.
> >
> > The strange thing is why SOLR is not returning an error when it's not able
> > to index it. If SOLR had returned an error, we could have retried the
> > document indexing. Is there any way we can make SOLR return an error
> > instead of 200 when it fails to index ?
> >
> > Regards,
> > Ganesh
> >
> > On Sun, Sep 4, 2016 at 10:11 PM Alexandre Rafalovitch <
> [hidden email]>
> > wrote:
> >
> >> Can you identify the specific documents that 'fail'? What happens if
> >> you post them manually? Try posting them manually but with one field
> >> super-distinct to see whether it made it in. What happens if you post
> >> it to an empty index (copy definition and try).
> >>
> >> Also, what's your request handler's parameters look like. Perhaps you
> >> have a signature processor, in which case it may be triggering
> >> duplicates avoidance with different calculation from just an id.
> >>
> >> My guess is still that it is some sort of duplicate issue.
> >>
> >> Regards,
> >>    Alex.
> >> ----
> >> Newsletter and resources for Solr beginners and intermediates:
> >> http://www.solr-start.com/
> >>
> >>
> >> On 4 September 2016 at 23:10, Ganesh M <[hidden email]> wrote:
> >> > Some more information on this... Most of the documents get indexed
> >> > properly. A few documents are not getting indexed.
> >> >
> >> > All document POSTs are seen in localhost_access, and a 200 OK response
> >> > is seen in the localhost access file. But in catalina, there are some
> >> > differences in the logs. For the documents that index properly, the
> >> > logs are as follows.
> >> >
> >> > FINE: PRE_UPDATE add
> >> > {,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> >> > params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog <init>
> >> > FINE: New TransactionLog
> >> > file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/tlog.0000000000000220856,
> >> > exists=false, size=0, openExisting=false
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor submit
> >> > FINE: sending update to
> >> > http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0
> >> > add{version=1544254202941800448,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> >> > params:update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> >> > FINE: starting runner:
> >> > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> >> > FINE: PRE_UPDATE FINISH
> >> > params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
> >> > FINE: finished:
> >> > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> >> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> >> > {crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> >> > {add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001
> >> > (1544254202941800448)]}
> >> > Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
> >> > FINE: Closing out SolrRequest:
> >> > params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
> >> > -------------------------------------------------
> >> >
> >> > For the document which is not getting indexed, we could see only the
> >> > following log in catalina.out. Not sure whether it's getting added to
> >> > SOLR.
> >> >
> >> >
> >> > Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> >> > FINE: PRE_UPDATE FINISH
> >> > params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
> >> > Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> >> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> >> > {crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002}
> >> > {} 0 1
> >> > Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
> >> > FINE: Closing out SolrRequest:
> >> > params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
> >> >
> >> > ----------------------
> >> >
> >> > You can see that in the above log for the missing document ( which is
> >> > not indexed), in the catalina log, we are not seeing "PRE_UPDATE add".
> >> > Is that the reason for the document not getting indexed ?
> >> >
> >> > We have set autosoftcommit to 1 second and autohardcommit to 30 seconds.
> >> >
> >> > We are not getting any errors or exceptions in the log.
> >> >
> >> > This issue is becoming very critical and a question of reliability.
> >> > Though we get a 200 OK response from SOLR for the update HTTP POST
> >> > request, nothing happens on the SOLR side. If SOLR is not able to process
> >> > it, shouldn't we get an error from SOLR instead of a 200 OK response?
> >> >
> >> > Has anybody faced this sort of issue? Any sort of help would be very
> >> > much appreciated.
> >> >
> >> >
> >> >
> >> >
> >> > On Sun, Sep 4, 2016 at 12:59 PM Ganesh M <[hidden email]<mailto:
> >> [hidden email]>> wrote:
> >> > Nitin, thanks for the reply. Each of our documents has a unique id,
> >> > which is its hbase rowkey id, so it will always be unique. There is no
> >> > chance of duplicate ids being sent.
> >> >
> >> >
> >> >
> >> > On Sun 4 Sep, 2016 12:41 pm Nitin Kumar, <[hidden email]> wrote:
> >> > Please check doc's unique key(Id). All keys should be unique. Else docs
> >> > having same id will be replaced.
> >> >
> >> > On 04-Sep-2016 12:13 PM, "Ganesh M" <[hidden email]<mailto:
> >> [hidden email]>> wrote:
> >> >
> >> >> Hi,
> >> >> we keep sending documents to Solr from our app server: a single
> >> >> document per request, but about 10 requests hit solr cloud in parallel
> >> >> per second.
> >> >>
> >> >> We could see our post request ( update request ) hitting our solr 5.4
> >> >> in the localhost_access logs, with a 200 OK response. We also get an
> >> >> HTTP 200 OK response at our app servers for the HTTP requests we
> >> >> fired to SOLR Cloud.
> >> >>
> >> >> But a few documents are not getting indexed. Out of 2000 documents we
> >> >> sent, 10 documents are getting missed. Though there is no error, a few
> >> >> documents are getting missed.
> >> >>
> >> >> We use autoSoftcommit as 2 secs and autohardcommit as 30 secs.
> >> >>
> >> >> Why are those 10 documents not getting indexed, and why is no error
> >> >> thrown back if the server is not able to index them ?
> >> >>
> >> >> Regards,
> >> >>
> >> >>
> >> >>
> >> >>
> >>
>

Re: Solr document missing or not getting indexed though we get 200 ok status from server

Alexandre Rafalovitch
On 5 September 2016 at 11:02, Ganesh M <[hidden email]> wrote:
> My big question is why is that SOLR can't throw the error when it's not
> able to handle the request due to concurrency or for other reason.


Solr SHOULD throw an error if there is an issue. The problem is that
concurrency is a HARD problem with a lot of moving parts. So far,
we can't even figure out in which subsystem you are seeing the unusual
behavior, let alone whether it is a misconfiguration, a
misunderstanding, a bug, or something completely different.

However, if you are repeating exactly the same test with the same
documents multiple times and getting different documents missing in
the end, that does sound like a bug. The question is whether it is a
known and fixed bug or something completely new.

Any chance you could try this against 6.2? Even in test
environment/VM, if you cannot put Java 8 anywhere near production.

Regards,
   Alex.


----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/

Re: Solr document missing or not getting indexed though we get 200 ok status from server

Shawn Heisey-2
In reply to this post by Ganesh M-3
On 9/4/2016 10:02 PM, Ganesh M wrote:
> We have captured all traffic of HTTP POST request going out from app

I'm the one you've interacted with on IRC for this issue.

If this index has multiple shards, one thing that might be a problem
here is the ShardHandler that's internal to Solr.  This is the internal
HttpClient that distributes requests between Solr nodes.  You may need
to bump up the maxConnectionsPerHost value from its default of 20 to
something larger, like 200 or 300.  This goes in a shardHandlerFactory
section of solr.xml.  If you do not have solr.xml in zookeeper, you'll
need to make this change on every Solr node.  All Solr nodes will need
to be restarted.

https://cwiki.apache.org/confluence/display/solr/Format+of+solr.xml

I hope this helps, but I cannot be certain that this is the problem.  If
it does fix your issue, then we might have a bug.

Thanks,
Shawn


Re: Solr document missing or not getting indexed though we get 200 ok status from server

Ganesh M-3
Hi Shawn,

Good to know about this configuration in shardHandler. We will try this
setting and keep you posted on the status. Hopefully it resolves the issue.

Regards,
Ganesh


Re: Solr document missing or not getting indexed though we get 200 ok status from server

mganeshs
Hi Shawn,

Good to know about this configuration in shardHandler. We will try this
setting and keep you posted on the status. Hopefully the change will resolve the issue.

Regards,
Ganesh


Re: Solr document missing or not getting indexed though we get 200 ok status from server

Chris Hostetter-3
In reply to this post by Ganesh M-3

: We tried to post the same manually from SOLR ADMIN / documents UI. It got
: indexed successfully.  We are sure that it's not duplicate issue. We are
: using default update handler and doesn't configure for custom one. We fire
: the request to index using direct HTTP request using <add> <doc> XML
: format. We are getting 200 OK response. But not getting indexed.
:
: This is the request we fired and got 200. But not getting indexed. Same
: request fired via SOLR ADMIN / Document UI, it's getting indexed
: successfully.
: <add>
: <doc>
: <CT_iscof>false</CT_iscof>
: <CT_ui116_s>55788327</CT_ui116_s>
: ...

... hold on, let's back up here -- the XML you've provided is not in any
format that solr understands natively at all -- what are these "CT_foo"
xml tags?  are these intended to be the final field names?

If so, and if this type of XML document is working for you *sometimes*,
then you definitely have something non standard in your document pipeline
that's handling the XML parsing and then handing the documents to Solr --
either a proxy in between you and solr, or perhaps a custom update
processor or a custom RequestHandler .... nothing in SOLR would be able to
parse that XML in a sane way.
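
For comparison, Solr's native XML update format wraps every value in a
<field name="..."> element inside <add><doc>, rather than using the field
name as the tag itself. Below is a minimal Python sketch of producing that
shape, assuming the CT_* names really are the intended field names.
build_add_xml is a hypothetical helper, and the values are placeholders, not
data from this thread.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Serialize a dict of field name -> value into <add><doc><field .../></doc></add>.

    Hypothetical helper for illustration; not part of Solr or SolrJ.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # A multi-valued field becomes repeated <field> elements with the same name.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(v)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; the field names are borrowed from the post above.
    print(build_add_xml({
        "rowkey": "EXAMPLE-ROWKEY-1",
        "CT_sby": "Group",
        "CT_ui84_f": "53.00",
    }))
```

The resulting string is what would go in the body of a POST to /update with
an XML content type. ElementTree escapes characters such as & and < in the
values, which is one less thing to get wrong than hand-built strings.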

So please tell us more about your system, and how you index documents, and
all of your configs, etc...

        https://wiki.apache.org/solr/UsingMailingLists

...in particular, please show us your solrconfig.xml and a sample "curl"
command for indexing a document, and the output of that curl command when
using --dump-header.


With that said, let's back up some more and look at those logs you posted...

> All documents POST are seen in the localhost_access and 200 OK response
> is seen in local host access file. But in catalina, there are some
> difference in the logs for which are indexing properly, following is the logs.
>
> FINE: PRE_UPDATE add
> {,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>  ...
> org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: [IOB_shard1_replica1] webapp=/solr path=/update params=
> {crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001}
> {add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001
> (1544254202941800448)]}
>  ...

...your copy paste cut off the important bit at the beginning about what
method was logging that PRE_UPDATE, but it's not really critical for
analysis: what matters is that the LogUpdateProcessor reports that it's
processing an "add" of "id=CUA000000..." which has a "crid=CUA00000"
request param (which is not something solr cares about, but is something
your software is sending which contains the same value as the uniqueKey of
the document) and then later it logs that it finished processing the
request which included one "add" ... if the request had included multiple
update operations (ie: multiple adds, or adds mixed with deletes, or a
commit) then LogUpdateProcessor would have recorded a PRE_UPDATE for each
of them, but only one "finish" summarising all of them.

Now let's look at your next log snippet...

> For the one which document is not getting indexed, we could see only
> following log in catalina.out. Not sure whether it's getting added to SOLR.
>
> Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> FINE: PRE_UPDATE FINISH
> params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
> Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: [IOB_shard1_replica1] webapp=/solr path=/update
> params={crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002}
> {} 0 1
>  ...

If that's really everything in your log for the request that doesn't
work, then what that tells us is that a request was sent to the /update
handler which did not contain any update commands -- ie: no documents to
add, no deletes, no commit, etc...  that's why there are no "PRE_UPDATE
add" log messages, and no ids listed in the "finish" log message.
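
To see how many requests arrived empty like this, one option is to scan the
log for /update summary lines whose operations block is just "{}". Below is
a rough Python sketch, assuming the line layout seen in the excerpts in this
thread (path=/update params={...} {operations} status qtime). The regex and
the function name is_empty_update are illustrative and not part of Solr, and
other Solr versions or log formatters may lay the line out differently.

```python
import re

# Assumed layout, taken from the excerpts quoted in this thread:
#   INFO: [core] webapp=/solr path=/update params={...} {operations} status qtime
FINISH_RE = re.compile(
    r"path=/update\s+params=\s*\{(?P<params>[^}]*)\}"
    r"\s*\{(?P<ops>.*?)\}(?:\s+\d+\s+\d+)?\s*$"
)


def is_empty_update(log_line):
    """Classify one log line.

    Returns True if it is an /update summary with no operations,
    False if it lists operations, and None if it is not a summary line.
    """
    match = FINISH_RE.search(log_line)
    if match is None:
        return None
    return match.group("ops").strip() == ""


if __name__ == "__main__":
    # Shortened ids; real lines carry the full crid value.
    samples = [
        "INFO: [IOB_shard1_replica1] webapp=/solr path=/update params={crid=X1} {} 0 1",
        "INFO: [IOB_shard1_replica1] webapp=/solr path=/update params={crid=X2} "
        "{add=[X2 (1544254202941800448)]} 0 5",
        "FINE: PRE_UPDATE FINISH params(crid=X1),defaults(wt=xml)",
    ]
    for line in samples:
        print(is_empty_update(line), "<-", line[:60])
```

Pairing the crid values from the empty lines with the requests captured on
the app server side would show whether the body was already missing when the
request left the client.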

Those are the logs I would expect to see from something like this curl
request -- which triggers the update handler with a request to do
nothing...

curl --dump-header - -H 'Content-Type: application/json' http://localhost:8983/solr/techproducts/update --data-binary '[]'
HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Content-Length: 42

{"responseHeader":{"status":0,"QTime":0}}



-Hoss
http://www.lucidworks.com/