InterruptedException handling between solr->zk interactions


Varun Thacker-4
Is there a general strategy for dealing with InterruptedException while issuing a ZooKeeper call from Solr?

Here's a more concrete example; I am unsure whether it does the right thing:


This code simply catches Exception. So if an InterruptedException is thrown, we just log an ERROR and move on.

Excerpt logs from a local failed test run: https://gist.github.com/vthacker/5dcb8978ba177d8725e98c5d433ee6c2

Re: InterruptedException handling between solr->zk interactions

Mikhail Khludnev-2
Hello, Varun.

If you are bothered by
--- Thousands of "Session expired for /autoscaling.json" messages before I had to manually kill the test run
it should be resolved by


--
Sincerely yours
Mikhail Khludnev
Re: InterruptedException handling between solr->zk interactions

Varun Thacker-4
Hi Mikhail,

My checkout already had that commit when I ran into this issue. I'll reply on SOLR-7736 with some more details.


Re: InterruptedException handling between solr->zk interactions

Tomás Fernández Löbbe
In reply to this post by Mikhail Khludnev-2
Yes, I've seen these issues too. The right thing to do is to close all resources (in some cases finishing anything that can't be left in a bad state) and exit. In this particular case I think the InterruptedException is swallowed unintentionally by the catch (Exception). I suspect that for the OverseerTaskProcessor the right thing to do is to close and exit? We should at the very least restore the interrupted flag (so that Mikhail's fix would make the thread exit immediately).
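
To illustrate the point about restoring the interrupted flag, here is a minimal sketch of the pattern. It is not the actual OverseerTaskProcessor code: the class InterruptAwareLoop, the method runUntilInterrupted, and the zkCall parameter are hypothetical names standing in for a loop that issues a blocking ZooKeeper call. The idea is to catch InterruptedException before the broad catch (Exception), re-set the flag, and leave the loop.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for a worker loop that issues blocking ZooKeeper calls.
class InterruptAwareLoop {

  /**
   * Invokes zkCall repeatedly until the thread is interrupted or maxIterations
   * is reached. Returns how many non-interrupt failures were seen.
   */
  static int runUntilInterrupted(Callable<?> zkCall, int maxIterations) {
    int failures = 0;
    for (int i = 0; i < maxIterations && !Thread.currentThread().isInterrupted(); i++) {
      try {
        zkCall.call();
      } catch (InterruptedException e) {
        // Catching the exception clears the flag, so set it again to let
        // callers (and the loop condition) see the interruption.
        Thread.currentThread().interrupt();
        break; // stop working instead of logging and moving on
      } catch (Exception e) {
        // Other errors are counted here; a real loop would log them.
        failures++;
      }
    }
    return failures;
  }
}
```

Because InterruptedException is handled in its own catch block, the broad catch (Exception) no longer swallows it, and any code further up the stack that checks Thread.currentThread().isInterrupted() still sees the interruption.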

Re: InterruptedException handling between solr->zk interactions

Mikhail Khludnev-2
In reply to this post by Varun Thacker-4
OK. I've found my fix for the expired /autoscaling.json spin in OverseerTriggerThread (OTT):

diff --git a/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java b/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java
index ece4c4c..6fe2057 100644
--- a/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java
+++ b/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java
@@ -142,8 +142,14 @@
         Thread.currentThread().interrupt();
         log.warn("Interrupted", e);
         break;
-      } catch (IOException | KeeperException e) {
+      }
+      catch (IOException | KeeperException e) {
         log.error("A ZK error has occurred", e);
+        if (e.getCause()!=null && e.getCause() instanceof KeeperException.SessionExpiredException) {
+          log.warn("Solr cannot talk to ZK, exiting " + 
+              getClass().getSimpleName() + " main queue loop", e);
+          return;
+        }
       }
     }
I'll include it as part of SOLR-12200.

