7.3.1 creates thousands of threads after start up

7.3.1 creates thousands of threads after start up

Markus Jelsma-2
Hello,

Our local test environment mini cluster goes nuts right after start up. It is a two node/shard/replica collection that starts up normally if only one node starts up. But as soon as the second node attempts to join the cluster, both nodes go crazy, creating thousands of threads with identical stack traces.

"qtp1458849419-4738" - Thread t@4738
   java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <6ee32168> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:600)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:49)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:663)
        at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
        - None

It does not always happen, but most of the time I am unable to boot the cluster normally. Sometimes, apparently right now for the first time, the GUI is still accessible.

Is this a known issue?

Thanks,
Markus
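When a node gets into this state, the repeating stack traces can be tallied from a thread dump instead of being read one by one. A minimal sketch, assuming the dump was captured with `jstack <pid> > dump.txt`; the sample dump built inline here is fabricated so the pipeline is self-contained:

```shell
# Fabricated sample dump for illustration; in practice dump.txt
# would come from: jstack <solr-pid> > dump.txt
cat > dump.txt <<'EOF'
"qtp1458849419-4738" - Thread t@4738
   java.lang.Thread.State: TIMED_WAITING
"qtp1458849419-4739" - Thread t@4739
   java.lang.Thread.State: TIMED_WAITING
"main" - Thread t@1
   java.lang.Thread.State: RUNNABLE
EOF

# Count Jetty QueuedThreadPool ("qtp") threads
grep -c '^"qtp' dump.txt

# Tally all threads by state to spot an explosion at a glance,
# e.g. "1 RUNNABLE" and "2 TIMED_WAITING" for the sample above
grep 'java.lang.Thread.State' dump.txt | awk '{print $2}' | sort | uniq -c
```

Against a real dump from an affected node, the qtp count is the number that would climb into the thousands.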

Re: 7.3.1 creates thousands of threads after start up

Shawn Heisey-2
On 6/8/2018 8:17 AM, Markus Jelsma wrote:

> Our local test environment mini cluster goes nuts right after start up. It is a two node/shard/replica collection that starts up normally if only one node starts up. But as soon as the second node attempts to join the cluster, both nodes go crazy, creating thousands of threads with identical stack traces.
>
> "qtp1458849419-4738" - Thread t@4738
<snip>
>
> It does not always happen, but most of the time I am unable to boot the cluster normally. Sometimes, apparently right now for the first time, the GUI is still accessible.
>
> Is this a known issue?

It's not a problem that I've heard of.  There are no Solr classes in the
stacktrace, only Jetty and Java classes.  I won't try to tell you that a
bug in Solr can't be the root cause, because it definitely can.  The
threads appear to be created by Jetty, but the supplied info doesn't
indicate WHY it's happening.

Presumably there's a previous version you've used where this problem did
NOT happen.  What version would that be?

Can you share the solr.log file from both nodes when this happens? 
There might be a clue there.

It sounds like you probably have a small number of collections in the
dev cluster.  Can you confirm that?

Thanks,
Shawn


RE: 7.3.1 creates thousands of threads after start up

Markus Jelsma-2
In reply to this post by Markus Jelsma-2
Hello Shawn,

The logs appear useless; they are littered with these:

2018-06-08 14:02:47.382 ERROR (qtp1458849419-1263) [   ] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://idx2:8983/solr/search/admin/ping
        at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
..
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.eclipse.jetty.io.EofException
        at org.eclipse.jetty.server.HttpConnection$SendCallback.reset(HttpConnection.java:704)
..
        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:509)

Regarding the versions, it is a bit hard to recall, but I do not think I have seen this on 7.2, and most certainly not on 7.1.

We operate three distinct types of Solr collections; they only share the same ZooKeeper quorum. The other two collections do not seem to have this problem, but I don't restart those as often as I restart this collection, as I am STILL trying to REPRODUCE the dreaded memory leak I reported having on 7.3 about two weeks ago. Sorry, but it drives me nuts!

Thanks,
Markus

-----Original message-----

> From:Shawn Heisey <[hidden email]>
> Sent: Friday 8th June 2018 16:47
> To: [hidden email]
> Subject: Re: 7.3.1 creates thousands of threads after start up
<snip>

Re: 7.3.1 creates thousands of threads after start up

Deepak Goel
Do these machines have a firewall in-between?

On Fri, 8 Jun 2018, 20:29 Markus Jelsma, <[hidden email]> wrote:
<snip>

RE: 7.3.1 creates thousands of threads after start up

Markus Jelsma-2
In reply to this post by Markus Jelsma-2
Hello Deepak,

Not in-between.

Thanks,
Markus

-----Original message-----

> From:Deepak Goel <[hidden email]>
> Sent: Friday 8th June 2018 17:14
> To: [hidden email]
> Subject: Re: 7.3.1 creates thousands of threads after start up
<snip>

Re: 7.3.1 creates thousands of threads after start up

Shawn Heisey-2
In reply to this post by Markus Jelsma-2
On 6/8/2018 8:59 AM, Markus Jelsma wrote:
> 2018-06-08 14:02:47.382 ERROR (qtp1458849419-1263) [   ] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://idx2:8983/solr/search/admin/ping
<snip>
> Caused by: org.eclipse.jetty.io.EofException

If you haven't tweaked the shard handler config to drastically reduce
the socket timeout, that is weird.  The only thing that comes to mind is
extreme GC pauses that cause the socket timeout to be exceeded.

> We operate three distinct types of Solr collections; they only share the same ZooKeeper quorum. The other two collections do not seem to have this problem, but I don't restart those as often as I restart this collection, as I am STILL trying to REPRODUCE the dreaded memory leak I reported having on 7.3 about two weeks ago. Sorry, but it drives me nuts!

I've reviewed the list messages about the leak.  As you might imagine,
my immediate thought is that the custom plugins you're running are
probably the cause, because we are not getting OOME reports like I would
expect if there were a leak in Solr itself.  It would not be unheard of
for a custom plugin to experience no leaks with one Solr version but
leak when Solr is upgraded, requiring a change in the plugin to properly
close resources.  I do not know if that's what's happening.

A leak could lead to GC pause problems, but it does seem really odd for
that to happen on a Solr node that's just been started.  You could try
bumping the heap size by 25 to 50 percent and see if the behavior
changes at all.  Honestly I don't expect it to change, and if it
doesn't, then I do not know what the next troubleshooting step should
be.  I could review your solr.log, though I can't be sure I would see
something you didn't.

Thanks,
Shawn
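The 25 to 50 percent heap bump Shawn suggests is normally made in Solr's include script. A sketch, assuming the stock `solr.in.sh` location and a hypothetical current heap of 4g:

```shell
# In bin/solr.in.sh (or /etc/default/solr.in.sh on package installs),
# raise the heap from a hypothetical 4g to 6g, i.e. a 50% bump:
SOLR_HEAP="6g"
# Expert alternative with explicit min/max sizes:
# SOLR_JAVA_MEM="-Xms6g -Xmx6g"
echo "$SOLR_HEAP"
```

A restart of the node is needed for the new heap size to take effect.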


RE: 7.3.1 creates thousands of threads after start up

Markus Jelsma-2
In reply to this post by Markus Jelsma-2
Hello Shawn,

You mentioned shard handler tweaks, thanks. I see we have an incorrect setting there for maximumPoolSize, way too high, but that doesn't account for the number of threads created. After reducing the number, oddly, twice as many threads are created and the node dies.

For a short time there were two identical collections (just for different tests) on the nodes. I have removed one of them, but the number of threads created doesn't change one bit. So it appears the shard handler config has nothing to do with it, or does it?

Regarding memory leaks, of course, the first thing that came to mind is that I made an error which only causes trouble on 7.3, but it is unreproducible so far, even if I fully replicate production in a test environment. Since it only leaks on commits, the first suspects were the URPs, and the URPs are the only things I can disable in production without affecting customers. Needless to say, it wasn't the URPs.

But thanks anyway. Whenever I have the courage to test it again, I'll enable INFO logging, which is currently disabled. Maybe it will reveal something.

If anyone has even the weirdest unconventional suggestion on how to reproduce my production memory leak in a controlled test environment, let me know.

Thanks,
Markus
 
-----Original message-----

> From:Shawn Heisey <[hidden email]>
> Sent: Sunday 10th June 2018 22:42
> To: [hidden email]
> Subject: Re: 7.3.1 creates thousands of threads after start up
<snip>

Re: 7.3.1 creates thousands of threads after start up

Shawn Heisey-2
On 6/13/2018 4:04 AM, Markus Jelsma wrote:
> You mentioned shard handler tweaks, thanks. I see we have an incorrect setting there for maximumPoolSize, way too high, but that doesn't account for the number of threads created. After reducing the number, oddly, twice as many threads are created and the node dies.

The specific config in the shard handler I was thinking of was socket
timeouts.

The EofException in Jetty almost always indicates that the client
disconnected before the server responded.  A low socket timeout will
cause a client to disconnect if the server takes longer than the timeout
to finish its work.  When the server finally does finish and try to
respond, it throws the EofException because the connection it was
expecting to use is no longer there.

I believe that the default socket timeout in Solr's shard handler is 60
seconds.  Which is a relative eternity in most situations.  It would
take a particularly nasty GC pause problem to exceed that timeout.
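For reference, the socket timeout being described lives in the shard handler configuration, which can be overridden inside a search handler in `solrconfig.xml`. A hedged sketch; the millisecond values shown are illustrative, not recommendations:

```xml
<!-- Sketch of a shard handler override inside a search handler;
     values are illustrative, not recommendations. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="socketTimeout">60000</int> <!-- read timeout, ms -->
    <int name="connTimeout">15000</int>   <!-- connect timeout, ms -->
  </shardHandlerFactory>
</requestHandler>
```

Cutting socketTimeout far below actual inter-node query latency would produce exactly the client-side disconnects and server-side EofExceptions described above.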

> Regarding memory leaks, of course, the first thing that came to mind is that I made an error which only causes trouble on 7.3, but it is unreproducible so far, even if I fully replicate production in a test environment. Since it only leaks on commits, the first suspects were the URPs, and the URPs are the only things I can disable in production without affecting customers. Needless to say, it wasn't the URPs.

Custom update processors could cause leaks, but it is not something that
I would expect from a typical URP implementation. If you've disabled
them and it's still happening, then that's probably not it.  It's
plugins for queries that have the most potential for not closing
resources correctly even when written by experienced programmers.  I'm
not sure what the potential for leaks is on index-time plugins, but I
suspect that it's less likely than problems with query plugins.

Thanks,
Shawn