Doppleganger threads after ingestion completed

karl.wright
Folks,
 
I ran 20,000,000 records into Solr via the extractingUpdateRequestHandler under Jetty.  The previous resource problems have apparently been resolved by using HTTP/1.1 with keep-alive, rather than creating and destroying 20,000,000 sockets. ;-)  However, after the client terminates, I still find the Solr process chewing away CPU – indeed, there were 5 threads doing this.
 
A thread dump yields the following partial trace for all 5 threads:
 
"btpool0-13" prio=10 tid=0x0000000041391000 nid=0xe7c runnable [0x00007f4a8c789000]
   java.lang.Thread.State: RUNNABLE
        at org.mortbay.jetty.HttpParser$Input.blockForContent(HttpParser.java:925)
        at org.mortbay.jetty.HttpParser$Input.read(HttpParser.java:897)
        at org.apache.commons.fileupload.MultipartStream$ItemInputStream.makeAvailable(MultipartStream.java:977)
        at org.apache.commons.fileupload.MultipartStream$ItemInputStream.close(MultipartStream.java:924)
        at org.apache.commons.fileupload.MultipartStream$ItemInputStream.close(MultipartStream.java:904)
        at org.apache.commons.fileupload.util.Streams.copy(Streams.java:119)
        at org.apache.commons.fileupload.util.Streams.copy(Streams.java:64)
        at org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)
        at org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:126)
        at org.apache.solr.servlet.MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:343)
        at org.apache.solr.servlet.StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:396)
        at org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:114)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
 
I could be wrong, but it looks to me like either Jetty or fileupload may have a problem here.  I have not looked at the Jetty source code, but threads that spin indefinitely even after the socket has been abandoned do not seem reasonable to me.  Thoughts?
 
Karl
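
To tie a busy OS thread back to a trace like the one above: assuming a HotSpot JVM on Linux, the nid field in the thread-dump header is the native thread id in hexadecimal, while tools such as top -H list thread ids in decimal. A minimal sketch of that conversion follows; the class and method names are illustrative only and are not part of Solr or Jetty.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper: extracts the native thread id ("nid") from a thread-dump header line. */
final class NidDecoder {
    // Matches the hex value in a field such as "nid=0xe7c".
    private static final Pattern NID = Pattern.compile("\\bnid=0x([0-9a-fA-F]+)");

    /** Returns the nid as a decimal number, or empty if the line has no nid field. */
    static OptionalLong parseNid(String headerLine) {
        Matcher m = NID.matcher(headerLine);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1), 16))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        String header = "\"btpool0-13\" prio=10 tid=0x0000000041391000 "
                + "nid=0xe7c runnable [0x00007f4a8c789000]";
        // 0xe7c is 3708 in decimal; that is the id to look for in per-thread CPU listings.
        System.out.println(parseNid(header).getAsLong());
    }
}
```

Matching the decimal id against the threads that show high CPU confirms whether the dumped threads are really the ones consuming it.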
 
 
Re: Doppleganger threads after ingestion completed

Lance Norskog-2
Is it "chewing up CPU" or "blocked"? The top frame of the stack trace is blockForContent, which suggests the threads are blocked.

The sockets are abandoned by the program, yes, but TCP/IP itself has a
complex sequence for shutting down sockets that takes a few minutes.
If these sockets stay around for hours, then there's a real problem.
(In fact, there is a bug in the TCP/IP specification, 40 years old,
that causes zombie sockets that never shut down.)

The HTTP solr server really needs a socket close() method.

On Thu, Jun 17, 2010 at 6:08 AM,  <[hidden email]> wrote:

RE: Doppleganger threads after ingestion completed

karl.wright
So far they've stayed around for 72 hours and counting.
Also, I don't care what the stack trace says - CPU usage is showing 500%.  So the threads may be momentarily "blocked", but then they must loop.

Karl
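
One way to settle the "blocked or looping" question is to measure per-thread CPU time instead of relying on the reported thread state, since a thread shown as RUNNABLE may simply be waiting inside a native I/O call. The sketch below uses the standard ThreadMXBean API on two demo threads in the same JVM; the class name, the demo threads and the 300 ms window are assumptions for illustration. Against a running Solr instance the same bean would have to be reached over JMX, which is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Illustrative helper: measures how much CPU time a thread consumes over a short window. */
final class CpuSampler {
    /**
     * Returns the CPU nanoseconds the thread used during roughly windowMillis,
     * or -1 if the JVM cannot measure it, the thread is dead, or we were interrupted.
     */
    static long cpuNanosOver(Thread t, long windowMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            return -1;
        }
        if (!mx.isThreadCpuTimeEnabled()) {
            mx.setThreadCpuTimeEnabled(true);
        }
        long before = mx.getThreadCpuTime(t.getId());
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        long after = mx.getThreadCpuTime(t.getId());
        return (before < 0 || after < 0) ? -1 : after - before;
    }

    public static void main(String[] args) {
        // A thread that busy-loops and a thread that only sleeps.
        Thread spinner = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) { /* burn CPU */ }
        }, "spinner");
        Thread sleeper = new Thread(() -> {
            try { Thread.sleep(60_000); } catch (InterruptedException e) { /* exit */ }
        }, "sleeper");
        spinner.setDaemon(true);
        sleeper.setDaemon(true);
        spinner.start();
        sleeper.start();

        System.out.println("spinner CPU ns: " + cpuNanosOver(spinner, 300));
        System.out.println("sleeper CPU ns: " + cpuNanosOver(sleeper, 300));
        spinner.interrupt();
        sleeper.interrupt();
    }
}
```

A thread that is genuinely blocked accrues almost no CPU time over the window, while a looping one accrues close to the full window.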
________________________________________
From: ext Lance Norskog [[hidden email]]
Sent: Saturday, June 19, 2010 8:51 PM
To: [hidden email]
Subject: Re: Doppleganger threads after ingestion completed

Re: Doppleganger threads after ingestion completed

Lance Norskog-2
Does 'netstat -an' show incoming sockets for these threads?

What Solr release is this?

Is this one long upload of 20m documents without committing? Are you
doing periodic commits, or automatic commits (in solrconfig.xml)?

How large are the documents?

Could this be a Jetty bug? Have you tried this on Tomcat?

Lance

On Sun, Jun 20, 2010 at 4:50 PM,  <[hidden email]> wrote:

RE: Doppleganger threads after ingestion completed

karl.wright
Some answers below.

(1) netstat -an shows no sockets at all.  Remember, the client process is gone, dead, shut down.
(2) This is Solr 1.5 from approximately mid-March.
(3) Autocommit was on, using the standard configuration present in the example.

This could well be a Jetty bug and, no, I have not tried Tomcat yet.

Karl
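
The trace in the first message shows a multipart item's close() reading the rest of the request body. Purely as an illustration of how such a drain could spin, and not as a diagnosis of Jetty or Commons FileUpload, the sketch below assumes an underlying stream whose read keeps returning without data and never reports end of stream; nothing in this thread establishes that this is what happened. All names (BoundedDrain, StuckStream, maxStalls) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Illustrative only: a drain loop that refuses to spin forever on a stream making no progress. */
final class BoundedDrain {
    /** Hypothetical misbehaving stream: never returns data and never signals end of stream. */
    static final class StuckStream extends InputStream {
        @Override public int read() { return -1; }
        @Override public int read(byte[] b, int off, int len) { return 0; }
    }

    /**
     * Reads and discards the rest of the stream and returns the number of bytes discarded.
     * Fails after maxStalls consecutive zero-byte reads instead of looping indefinitely.
     */
    static long drain(InputStream in, int maxStalls) {
        byte[] buf = new byte[8192];
        long total = 0;
        int stalls = 0;
        try {
            while (true) {
                int n = in.read(buf);
                if (n < 0) {
                    return total;              // end of stream: normal exit
                }
                if (n > 0) {
                    total += n;                // progress: reset the stall counter
                    stalls = 0;
                } else if (++stalls >= maxStalls) {
                    throw new IOException("no progress after " + stalls + " empty reads");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(new ByteArrayInputStream(new byte[10_000]), 3));
        try {
            drain(new StuckStream(), 3);
        } catch (UncheckedIOException e) {
            System.out.println("gave up: " + e.getCause().getMessage());
        }
    }
}
```

Without the stall counter, the same loop on the stuck stream would never exit and would hold a CPU at full load.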

________________________________________
From: ext Lance Norskog [[hidden email]]
Sent: Sunday, June 20, 2010 10:47 PM
To: [hidden email]
Subject: Re: Doppleganger threads after ingestion completed

Re: Doppleganger threads after ingestion completed

Lance Norskog-2
I can't think of anything else.

On 6/21/10, [hidden email] <[hidden email]> wrote:
