Dataimport problem

Dataimport problem

Srinivas Kashyap-2
Hello,

We are trying to run Solr (on Tomcat) on an Azure instance, with Postgres as the DB. When I run a full import (my core has 18 SQL queries), the requests get as far as the 9th and then the import hangs indefinitely.

The same setup, Solr (on Tomcat) with a Postgres database, works fine on AWS hosting.

Am I missing some configuration? Please let me know.

Thanks and Regards,
Srinivas Kashyap
________________________________
DISCLAIMER:
E-mails and attachments from Bamboo Rose, LLC are confidential.
If you are not the intended recipient, please notify the sender immediately by replying to the e-mail, and then delete it without making copies or using it in any way.
No representation is made that this email or any attachments are free of viruses. Virus scanning is recommended and is the responsibility of the recipient.

Re: Dataimport problem

Alexandre Rafalovitch
A couple of things:
1) Solr on Tomcat has not been an option for quite a while. So, you
must be running an old version of Solr. Which one?
2) Check that you have the same Solr config in both places. The Admin
UI lists all the O/S variables passed to the Java runtime; I would
compare them side by side
3) You can enable DataImportHandler (DIH) debug mode in the Admin UI,
so perhaps you can run a subset (just one?) of the queries and see the
difference
4) Worst case, you may want to trace the traffic between Solr and the
DB with a network analyzer (e.g. Wireshark). That may show you the
actual queries, timing, connection issues, etc.
5) DIH is not actually recommended for production, more for
exploration; you may want to consider moving to a stronger
architecture given the complexity of your needs
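
The side-by-side comparison in point 2 can be scripted. The following is only a minimal sketch: it assumes the properties of each environment have already been saved as plain `key=value` lines (a hypothetical export format, not something Solr is known to produce in this form), and the function names and sample values are made up for illustration.

```python
def parse_properties(text):
    """Parse 'key=value' lines into a dict, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first '=' only
        props[key.strip()] = value.strip()
    return props


def diff_properties(left, right):
    """Return {key: (left_value, right_value)} for keys that differ.

    A key present on only one side is reported with None on the other.
    """
    diffs = {}
    for key in sorted(set(left) | set(right)):
        a, b = left.get(key), right.get(key)
        if a != b:
            diffs[key] = (a, b)
    return diffs


if __name__ == "__main__":
    # Illustrative values only; real input would come from each server.
    aws = parse_properties("file.encoding=UTF-8\nuser.timezone=UTC\n")
    azure = parse_properties("file.encoding=Cp1252\nuser.timezone=UTC\n")
    for key, (a, b) in diff_properties(aws, azure).items():
        print(f"{key}: AWS={a!r} Azure={b!r}")
```

Only the keys that differ are printed, which keeps the output short when the two environments are nearly identical.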

Regards,
   Alex.

On Wed, 31 Jul 2019 at 10:04, Srinivas Kashyap <[hidden email]> wrote:


RE: Dataimport problem

Srinivas Kashyap-2
Hi,

> 1) Solr on Tomcat has not been an option for quite a while. So, you must be running an old version of Solr. Which one?

We are using Solr 5.2.1 (so it is a WAR-based deployment).

> 5) DIH is not actually recommended for production, more for exploration; you may want to consider moving to a stronger architecture given the complexity of your needs

Could you please give us some pointers on what to look into? We are using DIH in production and are facing a few issues, so we need to start phasing it out.


Thanks and Regards,
Srinivas Kashyap
           
-----Original Message-----
From: Alexandre Rafalovitch <[hidden email]>
Sent: 31 July 2019 07:41 PM
To: solr-user <[hidden email]>
Subject: Re: Dataimport problem


Re: Dataimport problem

Erick Erickson
This code is a little old, but should give you a place to start:

https://lucidworks.com/post/indexing-with-solrj/

As for DIH, my guess is that when you moved to Azure, your connectivity to the DB changed (possibly the driver Solr uses, etc.), and your SQL query in step 9 went from, say, batching rows to returning the entire result set, or some similar weirdness. Have you tried running _just_ your SQL queries to see how long they take to respond and whether they return the full result set or batches?
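
As a rough illustration of that check, the sketch below runs each query on its own, pulls the rows in fixed-size batches, and reports row count, batch count, and elapsed time. It is written against Python's generic DB-API cursor interface and uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; the table, the queries, and the name `time_queries` are all hypothetical. Against Postgres you would pass a connection from a Postgres driver instead. Note that whether rows are really streamed depends on the driver: with the PostgreSQL JDBC driver, as far as I know, a fetch size is only honoured when autocommit is off, otherwise the whole result set is read into memory.

```python
import sqlite3
import time


def time_queries(conn, queries, batch_size=500):
    """Run each query alone; return a list of (sql, rows, batches, seconds)."""
    report = []
    for sql in queries:
        started = time.perf_counter()
        cur = conn.cursor()
        cur.execute(sql)
        rows = batches = 0
        while True:
            chunk = cur.fetchmany(batch_size)  # pull rows in fixed-size chunks
            if not chunk:
                break
            rows += len(chunk)
            batches += 1
        cur.close()
        report.append((sql, rows, batches, time.perf_counter() - started))
    return report


if __name__ == "__main__":
    # Stand-in database; replace with a real connection to the import source.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO item (name) VALUES (?)",
        [(f"n{i}",) for i in range(1200)],
    )
    queries = ["SELECT * FROM item", "SELECT * FROM item WHERE id <= 2"]
    for sql, rows, batches, secs in time_queries(conn, queries):
        print(f"{rows:6d} rows in {batches} batch(es), {secs:.3f}s  {sql}")
```

Running the same script from both the AWS and the Azure host against their respective databases would show whether any single query, rather than the import as a whole, behaves differently.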

Best,
Erick

> On Jul 31, 2019, at 10:18 AM, Srinivas Kashyap <[hidden email]> wrote:

RE: Dataimport problem

Srinivas Kashyap-2
Hi,

> Have you tried running _just_ your SQL queries to see how long they take to respond and whether they return the full result set or batches?

The 9th request returns only 2 rows. This behaviour occurs on all the cores that have more than 8 SQL requests, but the same setup works fine on AWS hosting. Really baffled.

Thanks and Regards,
Srinivas Kashyap

-----Original Message-----
From: Erick Erickson <[hidden email]>
Sent: 31 July 2019 08:00 PM
To: [hidden email]
Subject: Re: Dataimport problem


Re: Dataimport problem

Alexandre Rafalovitch
I wonder if you have some sort of JDBC pool enabled and/or the number
of worker threads is configured differently. Compare the Tomcat-level
configuration and/or take a thread dump of the Java runtime while the
import is stuck.

Or maybe something similar on the Postgres side.
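
To make the pool hypothesis concrete: a capped pool whose connections are never returned stalls at exactly "cap plus one". As far as I know, Apache Commons DBCP (which Tomcat repackages for its default JNDI DataSource) defaults to a maximum of 8 active connections and waits indefinitely for a free one, which would match a hang at the ninth request. Whether such a pool is actually in use here is an assumption; nothing in the thread confirms it. The toy model below is not a JDBC pool, and `BoundedPool` and `first_starved_request` are hypothetical names used only for illustration.

```python
import threading


class BoundedPool:
    """Toy stand-in for a connection pool with a hard cap (not a real JDBC pool)."""

    def __init__(self, max_active):
        self._slots = threading.Semaphore(max_active)

    def borrow(self, timeout):
        # True if a connection was handed out, False if the wait timed out.
        return self._slots.acquire(timeout=timeout)

    def give_back(self):
        self._slots.release()


def first_starved_request(pool, n_requests, timeout=0.05):
    """Borrow one connection per request and never return it.

    Returns the 1-based index of the first request that cannot get a
    connection, or None if every request is served.
    """
    for index in range(1, n_requests + 1):
        if not pool.borrow(timeout):
            return index
    return None


if __name__ == "__main__":
    print(first_starved_request(BoundedPool(8), 18))    # cap of 8, 18 queries
    print(first_starved_request(BoundedPool(100), 18))  # generous cap
```

With a cap of 8 the ninth borrow is the first to fail, while with a cap of 100 all 18 succeed. A real pool with an unlimited wait would block instead of timing out, and a thread dump taken during the hang (for example with `jstack <pid>`) should then show a thread parked inside the pool's borrow call.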

Regards,
   Alex.

On Wed, 31 Jul 2019 at 10:36, Srinivas Kashyap <[hidden email]> wrote:
