Doc add limit

38 messages
Doc add limit

sangraal
Hey there... I'm having an issue with large doc updates on my Solr
installation. I'm adding docs in batches of between 2-20,000 at a time, and I've
noticed Solr seems to hang at 6,144 docs every time. Breaking the adds into
smaller batches works just fine, but I was wondering if anyone knew why this
would happen. I've tried doubling memory as well as tweaking various config
options, but nothing seems to let me break the 6,144 barrier.

This is the output from Solr admin. Any help would be greatly appreciated.


name: updateHandler
class: org.apache.solr.update.DirectUpdateHandler2
version: 1.0
description: Update handler that efficiently directly updates the on-disk main lucene index
stats:
  commits : 0
  optimizes : 0
  docsPending : 6144
  deletesPending : 6144
  adds : 6144
  deletesById : 0
  deletesByQuery : 0
  errors : 0
  cumulative_adds : 6144
  cumulative_deletesById : 0
  cumulative_deletesByQuery : 0
  cumulative_errors : 0
  docsDeleted : 0
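The smaller-batches workaround mentioned above amounts to chunking the document list before posting each add. A minimal sketch of that chunking (the class and method names are hypothetical, not from the thread):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: split a large list of docs into smaller add batches. */
public class BatchSplitter {
    public static <T> List<List<T>> split(List<T> docs, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            // subList returns a view, so no documents are copied
            batches.add(docs.subList(i, Math.min(i + batchSize, docs.size())));
        }
        return batches;
    }
}
```

Each batch would then be posted to the update handler as its own request, so no single request grows past the point where the server stalls.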

Re: Doc add limit

Yonik Seeley
It's possible it's not hanging, but just takes a long time on a
specific add.  This is because Lucene will occasionally merge
segments.  When very large segments are merged, it can take a long
time.

In the log file, add commands are followed by the number of
milliseconds the operation took.  Next time Solr hangs, wait for a
number of minutes until you see the operation logged and note how long
it took.
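The timing Yonik describes shows up in lines like "INFO: add (id=110705) 0 36596" later in this thread. A small parser for pulling out those numbers (hypothetical helper; it assumes the trailing integer is the time figure the log reports):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for Solr add log lines like "INFO: add (id=110705) 0 36596". */
public class AddLogLine {
    // Assumes the trailing number is the reported operation time.
    private static final Pattern ADD = Pattern.compile("add \\(id=(\\d+)\\) \\d+ (\\d+)");

    public static long docId(String line) {
        return Long.parseLong(match(line).group(1));
    }

    public static long elapsedMillis(String line) {
        return Long.parseLong(match(line).group(2));
    }

    private static Matcher match(String line) {
        Matcher m = ADD.matcher(line);
        if (!m.find()) throw new IllegalArgumentException("not an add line: " + line);
        return m;
    }
}
```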

How many documents are in the index before you do a batch that causes
a hang?  Does it happen on the first batch?  If so, you might be
seeing some other bug.  What appserver are you using?  Do the admin
pages respond when you see this hang?  If so, what does a stack trace
look like?

-Yonik



Re: Doc add limit

sangraal
Thanks for your help, Yonik. I've responded to your questions below:

On 7/26/06, Yonik Seeley <[hidden email]> wrote:
>
> It's possible it's not hanging, but just takes a long time on a
> specific add.  This is because Lucene will occasionally merge
> segments.  When very large segments are merged, it can take a long
> time.


I've left it running (hung) for up to a half hour at a time and I've
verified that my cpu idles during the hang. I have witnessed much shorter
hangs on the ramp up to my 6,144 limit but they have been more like 2 - 10
seconds in length. Perhaps this is the Lucene merging you mentioned.

> In the log file, add commands are followed by the number of
> milliseconds the operation took.  Next time Solr hangs, wait for a
> number of minutes until you see the operation logged and note how long
> it took.


Here are the last 5 log entries before the hang; the last one is doc #6,144.
Also, it looks like Tomcat is trying to redeploy the webapp. Those last Tomcat
entries repeat indefinitely, every 10 seconds or so. Perhaps this is a Tomcat
problem?

Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110705) 0 36596
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110700) 0 36600
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110688) 0 36603
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110690) 0 36608
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110686) 0 36611
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] redeploy resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] redeploy resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/META-INF/context.xml
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/WEB-INF/web.xml
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/META-INF/context.xml
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/conf/context.xml

How many documents are in the index before you do a batch that causes
> a hang?  Does it happen on the first batch?  If so, you might be
> seeing some other bug.  What appserver are you using?  Do the admin
> pages respond when you see this hang?  If so, what does a stack trace
> look like?


I actually don't think I had the problem on the first batch; in fact, my
first batch contained very close to 6,144 documents, so perhaps there is a
relation there. Right now, I'm adding to an index with close to 90,000
documents in it.
I'm running Tomcat 5.5.17, and the admin pages respond just fine when it's
hung... I did a thread dump and this is the trace of my update:

"http-8080-Processor25" Id=33 in RUNNABLE (running in native) total cpu time=6330.7360ms user time=5769.5920ms
     at java.net.SocketOutputStream.socketWrite0(Native Method)
     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
     at java.io.PrintStream.write(PrintStream.java:412)
     at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:112)
     at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:533)
     at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:410)
     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:934)
     at com.gawker.solr.update.GanjaUpdate.doUpdate(GanjaUpdate.java:169)
     at com.gawker.solr.update.GanjaUpdate.update(GanjaUpdate.java:62)
     at org.apache.jsp.update_jsp._jspService(update_jsp.java:57)
     at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
     at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
     at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
     at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
     at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
     at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
     at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
     at java.lang.Thread.run(Thread.java:613)



Re: Doc add limit

Yonik Seeley
So it looks like your client is hanging trying to send something over
the socket to the server and blocking... probably because Tomcat isn't
reading anything from the socket, since it's busy trying to restart
the webapp.
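One way to keep a client from sitting forever in socketWrite0, as in the trace above, is to set explicit timeouts and stream the request body rather than buffering it. A sketch against the JDK's HttpURLConnection (the URL and the timeout/chunk values are illustrative assumptions, not from the thread):

```java
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch: configure the update connection so a stalled server can't block the client forever. */
public class UpdatePost {
    public static HttpURLConnection open(String updateUrl) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setConnectTimeout(10_000); // fail fast if the server won't accept the connection
        conn.setReadTimeout(60_000);    // don't wait forever for the response
        // Stream the body in chunks instead of buffering the whole batch in memory.
        // Note: these settings bound connect/read waits only; a write can still
        // block if the server stops reading, so smaller batches remain the safer fix.
        conn.setChunkedStreamingMode(8192);
        return conn;
    }
}
```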

What is the heap size of the server? Try increasing it... maybe Tomcat
detected low memory and tried to reload the webapp.

-Yonik


Re: Doc add limit

sangraal
Right now the heap is set to 512M, but I've increased it up to 2GB and it
still hangs at the same number, 6,144...

Here's something interesting... I pushed this code over to a different
server and tried an update. On that server it hangs at #5,267. Then
Tomcat seems to try to reload the webapp... indefinitely.

So I guess this is looking more like a Tomcat problem than a
Lucene/Solr problem, huh?
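Since the HostConfig checkResources log lines show Tomcat's background deployer repeatedly re-checking the ROOT context, one diagnostic step (an assumption of mine, not something suggested in the thread) would be to disable automatic redeploy/reload checks in Tomcat 5.5's configuration and see whether the hang disappears:

```xml
<!-- conf/server.xml (Tomcat 5.5) - hypothetical diagnostic: turn off the
     background redeploy/reload checks while testing large batch adds -->
<Host name="localhost" appBase="webapps"
      unpackWARs="true" autoDeploy="false">
  <!-- reloadable="false" stops monitoring of web.xml / context.xml changes -->
  <Context path="" docBase="ROOT" reloadable="false"/>
</Host>
```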

-Sangraal


Re: Doc add limit

Yonik Seeley
It could be a Tomcat problem, or a Solr problem that only manifests on your
platform, or a JVM or libc problem, or even a client update problem...
(possibly you might be exhausting the number of sockets on the server
by using persistent connections with a long timeout and never reusing
them?)

What is your OS/JVM?

-Yonik

On 7/26/06, sangraal aiken <[hidden email]> wrote:

> Right now the heap is set to 512M but I've increased it up to 2GB and yet it
> still hangs at the same number 6,144...
>
> Here's something interesting... I pushed this code over to a different
> server and tried an update. On that server it's hanging on #5,267. Then
> tomcat seems to try to reload the webapp... indefinitely.
>
> So I guess this is looking more like a tomcat problem more than a
> lucene/solr problem huh?
>
> -Sangraal
>
> On 7/26/06, Yonik Seeley <[hidden email]> wrote:
> >
> > So it looks like your client is hanging trying to send somethig over
> > the socket to the server and blocking... probably because Tomcat isn't
> > reading anything from the socket because it's busy trying to restart
> > the webapp.
> >
> > What is the heap size of the server? try increasing it... maybe tomcat
> > could have detected low memory and tried to reload the webapp.
> >
> > -Yonik
> >
> > On 7/26/06, sangraal aiken <[hidden email]> wrote:
> > > Thanks for you help Yonik, I've responded to your questions below:
> > >
> > > On 7/26/06, Yonik Seeley <[hidden email]> wrote:
> > > >
> > > > It's possible it's not hanging, but just takes a long time on a
> > > > specific add.  This is because Lucene will occasionally merge
> > > > segments.  When very large segments are merged, it can take a long
> > > > time.
> > >
> > >
> > > I've left it running (hung) for up to a half hour at a time and I've
> > > verified that my cpu idles during the hang. I have witnessed much
> > shorter
> > > hangs on the ramp up to my 6,144 limit but they have been more like 2 -
> > 10
> > > seconds in length. Perhaps this is the Lucene merging you mentioned.
> > >
> > > In the log file, add commands are followed by the number of
> > > > milliseconds the operation took.  Next time Solr hangs, wait for a
> > > > number of minutes until you see the operation logged and note how long
> > > > it took.
> > >
> > >
> > > Here are the last 5 log entries before the hang the last one is doc
> > #6,144.
> > > Also it looks like Tomcat is trying to redeploy the webapp those last
> > tomcat
> > > entries repeat indefinitely every 10 seconds or so. Perhaps this is a
> > Tomcat
> > > problem?
> > >
> > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > INFO: add (id=110705) 0 36596
> > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > INFO: add (id=110700) 0 36600
> > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > INFO: add (id=110688) 0 36603
> > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > INFO: add (id=110690) 0 36608
> > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > INFO: add (id=110686) 0 36611
> > > Jul 26, 2006 1:25:36 PM
> > org.apache.catalina.startup.HostConfigcheckResources
> > > FINE: Checking context[] redeploy resource /source/solr/apache-
> > tomcat-5.5.17
> > > /webapps/ROOT
> > > Jul 26, 2006 1:25:36 PM
> > org.apache.catalina.startup.HostConfigcheckResources
> > > FINE: Checking context[] redeploy resource /source/solr/apache-
> > tomcat-5.5.17
> > > /webapps/ROOT/META-INF/context.xml
> > > Jul 26, 2006 1:25:36 PM
> > org.apache.catalina.startup.HostConfigcheckResources
> > > FINE: Checking context[] reload resource /source/solr/apache-
> > tomcat-5.5.17
> > > /webapps/ROOT/WEB-INF/web.xml
> > > Jul 26, 2006 1:25:36 PM
> > org.apache.catalina.startup.HostConfigcheckResources
> > > FINE: Checking context[] reload resource /source/solr/apache-
> > tomcat-5.5.17
> > > /webapps/ROOT/META-INF/context.xml
> > > Jul 26, 2006 1:25:36 PM
> > org.apache.catalina.startup.HostConfigcheckResources
> > > FINE: Checking context[] reload resource /source/solr/apache-
> > tomcat-5.5.17
> > > /conf/context.xml
> > >
> > > How many documents are in the index before you do a batch that causes
> > > > a hang?  Does it happen on the first batch?  If so, you might be
> > > > seeing some other bug.  What appserver are you using?  Do the admin
> > > > pages respond when you see this hang?  If so, what does a stack trace
> > > > look like?
> > >
> > >
> > > I actually don't think I had the problem on the first batch, in fact my
> > > first batch contained very close to 6,144 documents so perhaps there is
> > a
> > > relation there. Right now, I'm adding to an index with close to 90,000
> > > documents in it.
> > > I'm running Tomcat 5.5.17 and the admin pages respond just fine when
> > it's
> > > hung... I did a thread dump and this is the trace of my update:
> > >
> > > "http-8080-Processor25" Id=33 in RUNNABLE (running in native) total cpu time=6330.7360ms user time=5769.5920ms
> > >      at java.net.SocketOutputStream.socketWrite0(Native Method)
> > >      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> > >      at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> > >      at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> > >      at java.io.PrintStream.write(PrintStream.java:412)
> > >      at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:112)
> > >      at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:533)
> > >      at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:410)
> > >      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:934)
> > >      at com.gawker.solr.update.GanjaUpdate.doUpdate(GanjaUpdate.java:169)
> > >      at com.gawker.solr.update.GanjaUpdate.update(GanjaUpdate.java:62)
> > >      at org.apache.jsp.update_jsp._jspService(update_jsp.java:57)
> > >      at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
> > >      at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> > >      at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
> > >      at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
> > >      at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
> > >      at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> > >      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
> > >      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
> > >      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
> > >      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
> > >      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
> > >      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
> > >      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
> > >      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
> > >      at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
> > >      at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
> > >      at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
> > >      at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
> > >      at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
> > >      at java.lang.Thread.run(Thread.java:613)
> > >
> > >
> > > -Yonik

Re: Doc add limit

sangraal
I see the problem on Mac OS X/JDK: 1.5.0_06 and Debian/JDK: 1.5.0_07.

I don't think it's a socket problem, because I can initiate additional
updates while the server is hung... weird, I know.

Thanks for all your help, I'll send a post if/when I find a solution.

-S
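
For readers hitting the same symptom: the client-side stack trace in this thread shows the update blocked inside HttpURLConnection's writeRequests, which by default buffers the entire POST body in a ByteArrayOutputStream before sending it. The sketch below shows one generic way to set up the update connection with chunked streaming and explicit timeouts instead; it is illustrative only, not the poster's GanjaUpdate code, and the URL, chunk size, and timeout values are assumptions.

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class UpdateConnection {
    /**
     * Prepare a POST to Solr's /update handler. Chunked streaming mode
     * avoids holding the whole request body in memory (the default
     * behavior, visible as ByteArrayOutputStream.writeTo in the trace),
     * and the timeouts bound the connect and response-read phases.
     */
    public static HttpURLConnection prepare(URL updateUrl) throws Exception {
        HttpURLConnection con = (HttpURLConnection) updateUrl.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        con.setChunkedStreamingMode(4096); // stream instead of buffering fully
        con.setConnectTimeout(10_000);
        con.setReadTimeout(60_000);
        return con;
    }
}
```

Note that read and connect timeouts do not cover a blocked request write, so a server that stops reading mid-request (as suspected here) can still stall the sender; streaming mainly keeps large batches from being buffered twice.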

On 7/26/06, Yonik Seeley <[hidden email]> wrote:

>
> Tomcat problem, or a Solr problem that is only manifesting on your
> platform, or a JVM or libc problem, or even a client update problem...
> (possibly you might be exhausting the number of sockets in the server
> by using persistent connections with a long timeout and never reusing
> them?)
>
> What is your OS/JVM?
>
> -Yonik

Re: Doc add limit

Yonik Seeley
If you narrow the docs down to just the "id" field, does it still
happen at the same place?

-Yonik
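
One way to run this test mechanically is to strip an existing add document down to its id fields before resending it. The sketch below uses the JDK's built-in DOM APIs; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IdOnlyFilter {
    /** Drop every <field> except name="id" from a Solr <add> command. */
    public static String stripToIds(String addXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(addXml)));
        NodeList fields = doc.getElementsByTagName("field");
        // Iterate backwards: removing a node shrinks the live NodeList.
        for (int i = fields.getLength() - 1; i >= 0; i--) {
            Element field = (Element) fields.item(i);
            if (!"id".equals(field.getAttribute("name"))) {
                field.getParentNode().removeChild(field);
            }
        }
        StringWriter out = new StringWriter();
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

If the id-only batch still hangs at the same document count, the content of the documents is ruled out and the count itself (or the request size path) is the variable, which is what this diagnostic is after.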

On 7/26/06, sangraal aiken <[hidden email]> wrote:

> I see the problem on Mac OS X/JDK: 1.5.0_06 and Debian/JDK: 1.5.0_07.
>
> I don't think it's a socket problem, because I can initiate additional
> updates while the server is hung... weird I know.
>
> Thanks for all your help, I'll send a post if/when I find a solution.
>
> -S

Re: Doc add limit

sangraal
I removed everything from the add XML so the docs looked like this:

<doc>
<field name="id">187880</field>
</doc>
<doc>
<field name="id">187852</field>
</doc>

and it still hung at 6,144...
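
A sketch of combining that minimal document shape with the smaller-batch workaround reported earlier in the thread: partition the ids and render each partition as its own <add> command. The class and method names are illustrative, not taken from the poster's code.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {
    /** Split ids into sub-batches of at most batchSize, preserving order. */
    public static List<List<String>> partition(List<String> ids, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            batches.add(new ArrayList<>(ids.subList(i, Math.min(i + batchSize, ids.size()))));
        }
        return batches;
    }

    /** Render one sub-batch as a Solr <add> command with id-only docs. */
    public static String toAddXml(List<String> ids) {
        StringBuilder sb = new StringBuilder("<add>");
        for (String id : ids) {
            sb.append("<doc><field name=\"id\">").append(id).append("</field></doc>");
        }
        return sb.append("</add>").toString();
    }
}
```

Each rendered batch would then be posted to /update separately, keeping every request well below the point where the hang sets in.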

-S


On 7/26/06, Yonik Seeley <[hidden email]> wrote:

>
> If you narrow the docs down to just the "id" field, does it still
> happen at the same place?
>
> -Yonik
>
> On 7/26/06, sangraal aiken <[hidden email]> wrote:
> > I see the problem on Mac OS X/JDK: 1.5.0_06 and Debian/JDK: 1.5.0_07.
> >
> > I don't think it's a socket problem, because I can initiate additional
> > updates while the server is hung... weird I know.
> >
> > Thanks for all your help, I'll send a post if/when I find a solution.
> >
> > -S
> >
> > On 7/26/06, Yonik Seeley <[hidden email]> wrote:
> > >
> > > Tomcat problem, or a Solr problem that is only manifesting on your
> > > platform, or a JVM or libc problem, or even a client update problem...
> > > (possibly you might be exhausting the number of sockets in the server
> > > by using persistent connections with a long timeout and never reusing
> > > them?)
> > >
> > > What is your OS/JVM?
> > >
> > > -Yonik
> > >
> > > On 7/26/06, sangraal aiken <[hidden email]> wrote:
> > > > Right now the heap is set to 512M but I've increased it up to 2GB
> and
> > > yet it
> > > > still hangs at the same number 6,144...
> > > >
> > > > Here's something interesting... I pushed this code over to a
> different
> > > > server and tried an update. On that server it's hanging on #5,267.
> Then
> > > > tomcat seems to try to reload the webapp... indefinitely.
> > > >
> > > > So I guess this is looking more like a tomcat problem more than a
> > > > lucene/solr problem huh?
> > > >
> > > > -Sangraal
> > > >
> > > > On 7/26/06, Yonik Seeley <[hidden email]> wrote:
> > > > >
> > > > > So it looks like your client is hanging trying to send somethig
> over
> > > > > the socket to the server and blocking... probably because Tomcat
> isn't
> > > > > reading anything from the socket because it's busy trying to
> restart
> > > > > the webapp.
> > > > >
> > > > > What is the heap size of the server? try increasing it... maybe
> tomcat
> > > > > could have detected low memory and tried to reload the webapp.
> > > > >
> > > > > -Yonik
> > > > >
> > > > > On 7/26/06, sangraal aiken <[hidden email]> wrote:
> > > > > > Thanks for you help Yonik, I've responded to your questions
> below:
> > > > > >
> > > > > > On 7/26/06, Yonik Seeley <[hidden email]> wrote:
> > > > > > >
> > > > > > > It's possible it's not hanging, but just takes a long time on
> a
> > > > > > > specific add.  This is because Lucene will occasionally merge
> > > > > > > segments.  When very large segments are merged, it can take a
> long
> > > > > > > time.
> > > > > >
> > > > > >
> > > > > > I've left it running (hung) for up to a half hour at a time and
> I've
> > > > > > verified that my cpu idles during the hang. I have witnessed
> much
> > > > > shorter
> > > > > > hangs on the ramp up to my 6,144 limit but they have been more
> like
> > > 2 -
> > > > > 10
> > > > > > seconds in length. Perhaps this is the Lucene merging you
> mentioned.
> > > > > >
> > > > > > In the log file, add commands are followed by the number of
> > > > > > > milliseconds the operation took.  Next time Solr hangs, wait
> for a
> > > > > > > number of minutes until you see the operation logged and note
> how
> > > long
> > > > > > > it took.
> > > > > >
> > > > > >
> > > > > > Here are the last 5 log entries before the hang the last one is
> doc
> > > > > #6,144.
> > > > > > Also it looks like Tomcat is trying to redeploy the webapp those
> > > last
> > > > > tomcat
> > > > > > entries repeat indefinitely every 10 seconds or so. Perhaps this
> is
> > > a
> > > > > Tomcat
> > > > > > problem?
> > > > > >
> > > > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > > > INFO: add (id=110705) 0 36596
> > > > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > > > INFO: add (id=110700) 0 36600
> > > > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > > > INFO: add (id=110688) 0 36603
> > > > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > > > INFO: add (id=110690) 0 36608
> > > > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > > > INFO: add (id=110686) 0 36611
> > > > > > Jul 26, 2006 1:25:36 PM
> > > > > org.apache.catalina.startup.HostConfigcheckResources
> > > > > > FINE: Checking context[] redeploy resource /source/solr/apache-
> > > > > tomcat-5.5.17
> > > > > > /webapps/ROOT
> > > > > > Jul 26, 2006 1:25:36 PM
> > > > > org.apache.catalina.startup.HostConfigcheckResources
> > > > > > FINE: Checking context[] redeploy resource /source/solr/apache-
> > > > > tomcat-5.5.17
> > > > > > /webapps/ROOT/META-INF/context.xml
> > > > > > Jul 26, 2006 1:25:36 PM
> > > > > org.apache.catalina.startup.HostConfigcheckResources
> > > > > > FINE: Checking context[] reload resource /source/solr/apache-
> > > > > tomcat-5.5.17
> > > > > > /webapps/ROOT/WEB-INF/web.xml
> > > > > > Jul 26, 2006 1:25:36 PM
> > > > > org.apache.catalina.startup.HostConfigcheckResources
> > > > > > FINE: Checking context[] reload resource /source/solr/apache-
> > > > > tomcat-5.5.17
> > > > > > /webapps/ROOT/META-INF/context.xml
> > > > > > Jul 26, 2006 1:25:36 PM
> > > > > org.apache.catalina.startup.HostConfigcheckResources
> > > > > > FINE: Checking context[] reload resource /source/solr/apache-
> > > > > tomcat-5.5.17
> > > > > > /conf/context.xml
> > > > > >
> > > > > > How many documents are in the index before you do a batch that
> > > causes
> > > > > > > a hang?  Does it happen on the first batch?  If so, you might
> be
> > > > > > > seeing some other bug.  What appserver are you using?  Do the
> > > admin
> > > > > > > pages respond when you see this hang?  If so, what does a
> stack
> > > trace
> > > > > > > look like?
> > > > > >
> > > > > >
> > > > > > I actually don't think I had the problem on the first batch, in
> fact
> > > my
> > > > > > first batch contained very close to 6,144 documents so perhaps
> there
> > > is
> > > > > a
> > > > > > relation there. Right now, I'm adding to an index with close to
> > > 90,000
> > > > > > documents in it.
> > > > > > I'm running Tomcat 5.5.17 and the admin pages respond just fine
> when
> > > > > it's
> > > > > > hung... I did a thread dump and this is the trace of my update:
> > > > > >
> > > > > > "http-8080-Processor25" Id=33 in RUNNABLE (running in native) total cpu time=6330.7360ms user time=5769.5920ms
> > > > > >      at java.net.SocketOutputStream.socketWrite0(Native Method)
> > > > > >      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> > > > > >      at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> > > > > >      at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> > > > > >      at java.io.PrintStream.write(PrintStream.java:412)
> > > > > >      at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:112)
> > > > > >      at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:533)
> > > > > >      at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:410)
> > > > > >      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:934)
> > > > > >      at com.gawker.solr.update.GanjaUpdate.doUpdate(GanjaUpdate.java:169)
> > > > > >      at com.gawker.solr.update.GanjaUpdate.update(GanjaUpdate.java:62)
> > > > > >      at org.apache.jsp.update_jsp._jspService(update_jsp.java:57)
> > > > > >      at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
> > > > > >      at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> > > > > >      at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
> > > > > >      at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
> > > > > >      at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
> > > > > >      at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> > > > > >      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
> > > > > >      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
> > > > > >      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
> > > > > >      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
> > > > > >      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
> > > > > >      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
> > > > > >      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
> > > > > >      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
> > > > > >      at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
> > > > > >      at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
> > > > > >      at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
> > > > > >      at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
> > > > > >      at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
> > > > > >      at java.lang.Thread.run(Thread.java:613)
> > > > > >
> > > > > >
> > > > > > -Yonik

Re: Doc add limit

Yonik Seeley-2
On 7/26/06, sangraal aiken <[hidden email]> wrote:

> I removed everything from the Add xml so the docs looked like this:
>
> <doc>
> <field name="id">187880</field>
> </doc>
> <doc>
> <field name="id">187852</field>
> </doc>
>
> and it still hung at 6,144...

Maybe you can try the following simple Python client to try and rule
out some kind of different client interactions... the attached script
adds 10,000 documents and works fine for me in WinXP w/ Tomcat 5.5.17
and Jetty

-Yonik


------------------------------------ solr.py ----------------------
import httplib
import socket

class SolrConnection:
  def __init__(self, host='localhost:8983', solrBase='/solr'):
    self.host = host
    self.solrBase = solrBase
    #a connection to the server is not opened at this point.
    self.conn = httplib.HTTPConnection(self.host)
    #self.conn.set_debuglevel(1000000)
    self.postheaders = {"Connection":"close"}

  def doUpdateXML(self, request):
    try:
      self.conn.request('POST', self.solrBase+'/update', request,
                        self.postheaders)
    except (socket.error, httplib.CannotSendRequest):
      #reconnect in case the connection was broken from the server going down,
      #the server timing out our persistent connection, or another
      #network failure.
      #Also catch httplib.CannotSendRequest because the HTTPConnection object
      #can get in a bad state.
      self.conn.close()
      self.conn.connect()
      self.conn.request('POST', self.solrBase+'/update', request,
                        self.postheaders)

    rsp = self.conn.getresponse()
    #print rsp.status, rsp.reason
    data = rsp.read()
    #print "data=",data
    self.conn.close()

  def delete(self, id):
    xstr = '<delete><id>'+id+'</id></delete>'
    self.doUpdateXML(xstr)

  def add(self, **fields):
    #todo: XML escaping
    flist=['<field name="%s">%s</field>' % f for f in fields.items() ]
    flist.insert(0,'<add><doc>')
    flist.append('</doc></add>')
    xstr = ''.join(flist)
    self.doUpdateXML(xstr)

c = SolrConnection()
#for i in range(10000):
#  c.delete(str(i))
for i in range(10000):
  c.add(id=i)
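The script above leaves its `#todo: XML escaping` open. A minimal way to fill that in, sketched here as a standalone helper (`add_xml` is a made-up name, not part of the original script), is to run every field value through `xml.sax.saxutils.escape` before building the `<add>` payload:

```python
from xml.sax.saxutils import escape

def add_xml(**fields):
    # Build the same <add><doc>...</doc></add> payload as the script's
    # add(), but escape &, < and > in field values so arbitrary text
    # cannot break the XML. (add_xml is a hypothetical helper name.)
    parts = ['<add><doc>']
    for name, value in fields.items():
        parts.append('<field name="%s">%s</field>' % (name, escape(str(value))))
    parts.append('</doc></add>')
    return ''.join(parts)

print(add_xml(id='AT&T'))  # <add><doc><field name="id">AT&amp;T</field></doc></add>
```

Field *names* would still need `quoteattr`-style handling if they could contain quotes; for schema-controlled names like `id` that is usually not a concern.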

Re: Doc add limit

sangraal
Yonik,
It looks like the problem is with the way I'm posting to the SolrUpdate
servlet. I am able to use curl to post the data to my tomcat instance
without a problem. It only fails when I try to handle the http post from
java... my code is below:

      URL url = new URL("http://localhost:8983/solr/update");
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      conn.setRequestMethod("POST");
      conn.setRequestProperty("Content-Type", "application/octet-stream");
      conn.setDoOutput(true);
      conn.setDoInput(true);
      conn.setUseCaches(false);

      // Write to server
      log.info("About to post to SolrUpdate servlet.");
      DataOutputStream output = new DataOutputStream(conn.getOutputStream());
      output.writeBytes(sw);
      output.flush();
      log.info("Finished posting to SolrUpdate servlet.");

-Sangraal


Re: Doc add limit

Mike Klaas
In reply to this post by Yonik Seeley-2
On 7/27/06, Yonik Seeley <[hidden email]> wrote:

> class SolrConnection:
>   def __init__(self, host='localhost:8983', solrBase='/solr'):
>     self.host = host
>     self.solrBase = solrBase
>     #a connection to the server is not opened at this point.
>     self.conn = httplib.HTTPConnection(self.host)
>     #self.conn.set_debuglevel(1000000)
>     self.postheaders = {"Connection":"close"}
>
>   def doUpdateXML(self, request):
>     try:
>       self.conn.request('POST', self.solrBase+'/update', request,
> self.postheaders)

Digressive note: I'm not sure if it is necessary with tomcat, but in
my experience driving solr with python using Jetty, it was necessary
to specify the content-type when posting utf-8 data:

self.postheaders.update({'Content-Type': 'text/xml; charset=utf-8'})

-Mike

Re: Doc add limit

sangraal
Mike,
 I've been posting with the content type set like this:
      conn.setRequestProperty("Content-Type", "application/octet-stream");

I tried your suggestion though, and unfortunately there was no change.
      conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");

-Sangraal


On 7/27/06, Mike Klaas <[hidden email]> wrote:

>
> Digressive note: I'm not sure if it is necessary with tomcat, but in
> my experience driving solr with python using Jetty, it was necessary
> to specify the content-type when posting utf-8 data:
>
> self.postheaders.update({'Content-Type': 'text/xml; charset=utf-8'})
>
> -Mike
>

Re: Doc add limit

Mike Klaas
Hi Sangraal:

Sorry--I tried not to imply that this might affect your issue.  You
may have to crank up the solr logging to determine where it is
freezing (and what might be happening).

It is certainly worth investigating why this occurs, but I wonder
about the advantages of using such huge batches.  Assuming a few
hundred bytes per document, 6100 docs produces a POST over 1MB in
size.

-Mike
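Mike's back-of-the-envelope figure is easy to check; the 200-bytes-per-document value below is an assumption standing in for his "few hundred bytes":

```python
# Rough size of one batched <add> POST at the observed stall point,
# assuming ~200 bytes of XML per document (an assumed figure).
docs_per_batch = 6144          # where the adds were observed to stall
bytes_per_doc = 200
total_bytes = docs_per_batch * bytes_per_doc
print(total_bytes / (1024 * 1024))  # -> 1.171875, i.e. "over 1MB"
```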

On 7/27/06, sangraal aiken <[hidden email]> wrote:

> Mike,
>  I've been posting with the content type set like this:
>       conn.setRequestProperty("Content-Type", "application/octet-stream");
>
> I tried your suggestion though, and unfortunately there was no change.
>       conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
>
> -Sangraal

Re: Doc add limit

sangraal
I think you're right... I will probably work on splitting the batches up
into smaller pieces at some point in the future. I think I will need the
capability to do large batches at some point though, so I want to make sure
the system can handle it. I also want to make sure this problem doesn't pop
up and bite me later.

-Sangraal
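The smaller-batches workaround described above amounts to slicing the document list before posting. A minimal sketch (the `batches` helper name is made up for illustration, not code from the thread):

```python
def batches(docs, size):
    # Yield successive slices of at most `size` docs, so each POST
    # stays well below the point where the large adds stalled.
    for start in range(0, len(docs), size):
        yield docs[start:start + size]

ids = list(range(20000))
counts = [len(b) for b in batches(ids, 5000)]
print(counts)  # -> [5000, 5000, 5000, 5000]
```

Each yielded slice would then be turned into one `<add>` command and posted on its own.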

On 7/27/06, Mike Klaas <[hidden email]> wrote:

>
> Hi Sangraal:
>
> Sorry--I tried not to imply that this might affect your issue.  You
> may have to crank up the solr logging to determine where it is
> freezing (and what might be happening).
>
> It is certainly worth investigating why this occurs, but I wonder
> about the advantages of using such huge batches.  Assuming a few
> hundred bytes per document, 6100 docs produces a POST over 1MB in
> size.
>
> -Mike

Re: Doc add limit

Yonik Seeley-2
In reply to this post by sangraal
Are you reading the response and closing the connection?  If not, you
are probably running out of socket connections.

-Yonik
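The read-the-response-and-close pattern Yonik is asking about can be sketched in a few lines of modern Python (a hypothetical helper, not the thread's client; URL and headers are illustrative):

```python
import http.client

def post_update(host, path, body):
    # POST the update, then fully read the response BEFORE closing.
    # If the client never drains the response, the server can block
    # writing it once the socket buffer fills, which looks like a hang.
    conn = http.client.HTTPConnection(host)
    try:
        conn.request('POST', path, body,
                     {'Content-Type': 'text/xml; charset=utf-8'})
        rsp = conn.getresponse()
        data = rsp.read()        # drain the body -- the important part
        return rsp.status, data
    finally:
        conn.close()
```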

On 7/27/06, sangraal aiken <[hidden email]> wrote:

> Yonik,
> It looks like the problem is with the way I'm posting to the SolrUpdate
> servlet. I am able to use curl to post the data to my tomcat instance
> without a problem. It only fails when I try to handle the http post from
> java... my code is below:
>
>       URL url = new URL("http://localhost:8983/solr/update");
>       HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>       conn.setRequestMethod("POST");
>       conn.setRequestProperty("Content-Type", "application/octet-stream");
>       conn.setDoOutput(true);
>       conn.setDoInput(true);
>       conn.setUseCaches(false);
>
>       // Write to server
>       log.info("About to post to SolrUpdate servlet.");
>       DataOutputStream output = new DataOutputStream(conn.getOutputStream());
>       output.writeBytes(sw);
>       output.flush();
>       log.info("Finished posting to SolrUpdate servlet.");
>
> -Sangraal

Re: Doc add limit

sangraal
Yeah, I'm closing them.  Here's the method:

---------
  private String doUpdate(String sw) {
    StringBuffer updateResult = new StringBuffer();
    try {
      // open connection
      log.info("Connecting to and preparing to post to SolrUpdate servlet.");
      URL url = new URL("http://localhost:8080/update");
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      conn.setRequestMethod("POST");
      conn.setRequestProperty("Content-Type", "application/octet-stream");
      conn.setDoOutput(true);
      conn.setDoInput(true);
      conn.setUseCaches(false);

      // Write to server
      log.info("About to post to SolrUpdate servlet.");
      DataOutputStream output = new DataOutputStream(conn.getOutputStream());
      output.writeBytes(sw);
      output.flush();
      output.close();
      log.info("Finished posting to SolrUpdate servlet.");

      // Read response
      log.info("Ready to read response.");
      BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
      log.info("Got reader....");
      String line;
      while ((line = rd.readLine()) != null) {
        log.info("Writing to result...");
        updateResult.append(line);
      }
      rd.close();

      // close connections
      conn.disconnect();

      log.info("Done updating Solr for site" + updateSite);
    } catch (Exception e) {
      e.printStackTrace();
    }

    return updateResult.toString();
  }

-Sangraal

On 7/27/06, Yonik Seeley <[hidden email]> wrote:

>
> Are you reading the response and closing the connection?  If not, you
> are probably running out of socket connections.
>
> -Yonik

Re: Doc add limit

Otis Gospodnetic-2
In reply to this post by Mike Klaas
I haven't been following the thread, but....
Not sure if you are using Tomcat or Jetty, but Jetty has a POST size limit (set somewhere in its configs) that may be the source of the problem.

Otis
P.S.
Just occurred to me.
Tomcat.  Jetty.  Tom & Jerry.  Jetty guys should have called their thing Jerry or Jerrymouse.


Re: Doc add limit

sangraal
I'm running on Tomcat... and I've verified that the complete post is making
it through the SolrUpdate servlet and into the SolrCore object... thanks for
the info though.
--
So the code is hanging on this call in SolrCore.java

            writer.write("<result status=\"" + status + "\"></result>");

The thread dump:

"http-8080-Processor24" Id=32 in RUNNABLE (running in native) total cpu
time=40698.0440ms user time=38646.1680ms
     at java.net.SocketOutputStream.socketWrite0(Native Method)
     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
     at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(
InternalOutputBuffer.java:746)
     at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:433)
     at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:348)
     at
org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite
(InternalOutputBuffer.java:769)
     at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(
ChunkedOutputFilter.java:125)
     at org.apache.coyote.http11.InternalOutputBuffer.doWrite(
InternalOutputBuffer.java:579)
     at org.apache.coyote.Response.doWrite(Response.java:559)
     at org.apache.catalina.connector.OutputBuffer.realWriteBytes(
OutputBuffer.java:361)
     at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:324)
     at org.apache.tomcat.util.buf.IntermediateOutputStream.write(
C2BConverter.java:235)
     at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java
:336)
     at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(
StreamEncoder.java:404)
     at sun.nio.cs.StreamEncoder$CharsetSE.implFlush(StreamEncoder.java:408)
     at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:152)
     at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:213)
     at org.apache.tomcat.util.buf.WriteConvertor.flush(C2BConverter.java
:184)
     at org.apache.tomcat.util.buf.C2BConverter.flushBuffer(
C2BConverter.java:127)
     at org.apache.catalina.connector.OutputBuffer.realWriteChars(OutputBuffer.java:536)
     at org.apache.tomcat.util.buf.CharChunk.flushBuffer(CharChunk.java:439)
     at org.apache.tomcat.util.buf.CharChunk.append(CharChunk.java:370)
     at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:491)
     at org.apache.catalina.connector.CoyoteWriter.write(CoyoteWriter.java:161)
     at org.apache.catalina.connector.CoyoteWriter.write(CoyoteWriter.java:170)
     at org.apache.solr.core.SolrCore.update(SolrCore.java:695)
     at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
     at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
     at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
     at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
     at java.lang.Thread.run(Thread.java:613)
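
[Editor's note: the trace above shows the update thread blocked inside Tomcat's OutputBuffer/CoyoteWriter while SolrCore.update is writing the HTTP response. One thing that can produce exactly this picture is a client that posts the update but never reads the response: once the server's output buffer fills, the write blocks and the whole request appears to hang. A minimal sketch of a client that always drains the response — written with Python's modern `http.client` rather than the period `httplib`, and with an illustrative endpoint path:]

```python
import http.client

def post_update(host, xml):
    """POST an update to Solr and always drain the response.

    The '/solr/update' path and header values are illustrative.  The
    key point is the getresponse()/read() pair: if the client never
    reads the response body, the container's output buffer can fill
    and the server-side update thread blocks mid-write, which looks
    exactly like a hang.
    """
    conn = http.client.HTTPConnection(host)
    conn.request('POST', '/solr/update', xml.encode('utf-8'),
                 {'Content-Type': 'text/xml; charset=utf-8'})
    resp = conn.getresponse()
    body = resp.read()   # drain the response; skipping this can deadlock
    conn.close()
    return resp.status, body
```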

On 7/27/06, Otis Gospodnetic <[hidden email]> wrote:

>
> I haven't been following the thread, but....
> Not sure if you are using Tomcat or Jetty, but Jetty has a POST size limit
> (set somewhere in its configs) that may be the source of the problem.
>
> Otis
> P.S.
> Just occurred to me.
> Tomcat.  Jetty.  Tom & Jerry.  Jetty guys should have called their thing
> Jerry or Jerrymouse.
>
> ----- Original Message ----
> From: Mike Klaas <[hidden email]>
> To: [hidden email]
> Sent: Thursday, July 27, 2006 6:33:16 PM
> Subject: Re: Doc add limit
>
> Hi Sangraal:
>
> Sorry--I tried not to imply that this might affect your issue.  You
> may have to crank up the solr logging to determine where it is
> freezing (and what might be happening).
>
> It is certainly worth investigating why this occurs, but I wonder
> about the advantages of using such huge batches.  Assuming a few
> hundred bytes per document, 6100 docs produces a POST over 1MB in
> size.
>
> -Mike
>
> On 7/27/06, sangraal aiken <[hidden email]> wrote:
> > Mike,
> >  I've been posting with the content type set like this:
> >       conn.setRequestProperty("Content-Type",
> "application/octet-stream");
> >
> > I tried your suggestion though, and unfortunately there was no change.
> >       conn.setRequestProperty("Content-Type", "text/xml;
> charset=utf-8");
> >
> > -Sangraal
> >
> >
> > On 7/27/06, Mike Klaas <[hidden email]> wrote:
> > >
> > > On 7/27/06, Yonik Seeley <[hidden email]> wrote:
> > >
> > > > class SolrConnection:
> > > >   def __init__(self, host='localhost:8983', solrBase='/solr'):
> > > >     self.host = host
> > > >     self.solrBase = solrBase
> > > >     #a connection to the server is not opened at this point.
> > > >     self.conn = httplib.HTTPConnection(self.host)
> > > >     #self.conn.set_debuglevel(1000000)
> > > >     self.postheaders = {"Connection":"close"}
> > > >
> > > >   def doUpdateXML(self, request):
> > > >     try:
> > > >       self.conn.request('POST', self.solrBase+'/update', request,
> > > > self.postheaders)
> > >
> > > Digressive note: I'm not sure if it is necessary with Tomcat, but in
> > > my experience driving solr with python using Jetty, it was necessary
> > > to specify the content-type when posting utf-8 data:
> > >
> > > self.postheaders.update({'Content-Type': 'text/xml; charset=utf-8'})
> > >
> > > -Mike
> > >
> >
> >
>
>
>
>
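
[Editor's note: on Mike's point about huge batches — a simple way to keep each POST small is to split the adds into fixed-size batches on the client. A minimal sketch; the batch size and document rendering are illustrative only, and real field values would need XML-escaping:]

```python
def batch(docs, size=500):
    """Yield successive slices of at most `size` docs."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

def docs_to_add_xml(docs):
    """Render a list of {field: value} dicts as a Solr <add> command.

    Sketch only: assumes plain-text field values that need no
    XML-escaping.
    """
    parts = ['<add>']
    for doc in docs:
        parts.append('<doc>')
        for field, value in doc.items():
            parts.append('<field name="%s">%s</field>' % (field, value))
        parts.append('</doc>')
    parts.append('</add>')
    return ''.join(parts)
```

Each batch can then be posted separately, so no single request approaches the container's POST size limit.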

Re: Doc add limit

Yonik Seeley-2
In reply to this post by sangraal
You might also try the Java update client here:
http://issues.apache.org/jira/browse/SOLR-20

-Yonik
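
[Editor's note: for reference on Otis's earlier point about Jetty's POST size limit — in Jetty this cap is controlled by a `maxFormContentSize` setting. The property's package prefix has moved between Jetty versions, so treat the name below as illustrative; also note it governs form-encoded bodies, so it may not apply to raw XML posts at all:]

```shell
# Jetty 6.x-era system property; newer Jetty uses an org.eclipse.jetty.* prefix
java -Dorg.mortbay.jetty.Request.maxFormContentSize=10000000 -jar start.jar
```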