DIH fails after processing roughly 10 million records

vijeshnair
Solr version : 4.0 (running with 9GB of RAM)
MySQL : 5.5
JDBC : mysql-connector-java-5.1.22-bin.jar

I am trying to run a full import of my catalog data, which is roughly 13 million products. The DIH ran smoothly for 18 hours and processed roughly 10 million records, then suddenly failed with a JDBC exception: a communications link failure with the MySQL server. I googled this extensively and found multiple recommendations to set "readonly=true", "autocommit=true", etc.

If I understand it correctly, the likely cause is segment merging. When the index is fairly large and several merges happen at the same time, DIH stops indexing for a while, and by the time it resumes and tries to use the connection again, MySQL has already dropped it. So I am going to increase wait_timeout on the MySQL side from the current 120 seconds to something larger and see whether that solves the issue. I will only know the result after one complete run, so I will post an update tomorrow. In the meantime, I wanted to validate my approach and ask whether there is any other known fix.
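
To confirm which timeouts the server is actually applying before changing them, a check along these lines may help. This is a minimal sketch, assuming a MySQL 5.5 client session connected to the same server that DIH reads from; it only reads server state and changes nothing.

```sql
-- List the timeout-related server variables currently in effect (values are in seconds).
SHOW GLOBAL VARIABLES LIKE '%timeout%';

-- Number of connections the server aborted because the client did not close them properly;
-- this includes clients that sat idle for longer than wait_timeout.
SHOW GLOBAL STATUS LIKE 'Aborted_clients';
```

If Aborted_clients goes up around the time the import fails, that would be consistent with the idle-timeout explanation, although it does not prove it on its own.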

Here is the error stack trace:

Jan 8, 2013 12:44:00 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@32d051c1 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:923)
        at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:3234)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2399)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2728)
        at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:4908)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4794)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4403)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1594)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:400)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:391)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:291)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:280)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Jan 8, 2013 12:44:00 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1014)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:988)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:974)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:919)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4808)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4403)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1594)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:400)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:391)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:291)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:293)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:280)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Jan 8, 2013 12:44:00 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1014)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:988)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:974)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:919)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4808)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4403)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1594)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:400)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:391)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:291)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:293)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:280)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Jan 8, 2013 12:44:00 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1014)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:988)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:974)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:919)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4808)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4403)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1594)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:400)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:391)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:291)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:293)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:280)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Jan 8, 2013 12:44:00 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1014)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:988)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:974)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:919)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4808)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4403)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1594)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:400)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:391)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:291)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:293)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:280)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Jan 8, 2013 12:44:00 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1014)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:988)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:974)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:919)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4808)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4403)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1594)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:400)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:391)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:291)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:293)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:280)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Jan 8, 2013 12:44:00 PM org.apache.solr.common.SolrException log
SEVERE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT DISTINCT mpn FROM product_retailers WHERE product_id = '9393557198' Processing Document # 10194031
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT DISTINCT mpn FROM product_retailers WHERE product_id = '9393557198' Processing Document # 10194031
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:413)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
        ... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT DISTINCT mpn FROM product_retailers WHERE product_id = '9393557198' Processing Document # 10194031
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:71)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:252)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:209)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:38)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:472)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:498)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
        ... 5 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 0 milliseconds ago.  The last packet sent successfully to the server was 0 milliseconds ago.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3589)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3478)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4019)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2728)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2678)
        at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:894)
        at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:732)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:245)
        ... 13 more
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
        at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3039)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3489)
        ... 22 more
Re: DIH fails after processing roughly 10 million records

Travis Low
What you describe sounds right to me and seems consistent with the error
stack trace. I would increase the MySQL wait_timeout to 3600 and,
depending on your server, you might also want to increase max_connections.

cheers,

Travis
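
The settings suggested above could be applied at runtime roughly as follows. This is a sketch under assumptions: it needs an account with the SUPER privilege, and the max_connections value of 500 is a hypothetical placeholder, not a figure from this thread.

```sql
-- Raise the idle timeout for non-interactive connections to one hour (value is in seconds).
SET GLOBAL wait_timeout = 3600;

-- Hypothetical value; size this to the server's memory and expected number of clients.
SET GLOBAL max_connections = 500;

-- Verify the new global values.
SHOW GLOBAL VARIABLES WHERE Variable_name IN ('wait_timeout', 'max_connections');
```

SET GLOBAL only affects connections opened after the change and is lost when the server restarts, so the import would have to be started again afterwards, and the same values would also need to go under the [mysqld] section of the server's option file to persist.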

>         at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:252)
>         at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:209)
>         at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:38)
>         at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>         at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>         at
>
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
>         at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:472)
>         at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:498)
>         at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
>         ... 5 more
> Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
> Communications link failure
>
> The last packet successfully received from the server was 0 milliseconds
> ago.  The last packet sent successfully to the server was 0 milliseconds
> ago.
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
>
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
>         at
> com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
>         at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3589)
>         at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3478)
>         at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4019)
>         at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
>         at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
>         at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2728)
>         at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2678)
>         at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:894)
>         at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:732)
>         at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:245)
>         ... 13 more
> Caused by: java.io.EOFException: Can not read response from server.
> Expected
> to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
>         at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3039)
>         at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3489)
>         ... 22 more
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/DIH-fails-after-processing-roughly-10million-records-tp4031508.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



--

Travis Low, Director of Development
Centurion Research Solutions, LLC
14048 ParkEast Circle • Suite 100 • Chantilly, VA 20151
703-956-6276 • 703-378-4474 (fax)
http://www.centurionresearch.com


Re: DIH fails after processing roughly 10million records

Shawn Heisey-4
In reply to this post by vijeshnair
On 1/8/2013 2:10 AM, vijeshnair wrote:

> Solr version : 4.0 (running with 9GB of RAM)
> MySQL : 5.5
> JDBC : mysql-connector-java-5.1.22-bin.jar
>
> I am trying to run the full import for my catalog data which is roughly
> 13million of products. The DIH ran smoothly for 18 hours, and processed
> roughly 10million of records. But all of a sudden it broke due to the jdbc
> exception i.e. Communication failure with the server. I did an extensive
> googling on this topic, and there are multiple recommendation to use
> "readonly=true", "autocommit=true" etc. If I understand it correctly, the
> possible reason is when DIH stops indexing due to the segment merging, and
> when it tries to reconnect with the server. When index is slightly large and
> multiple merging happening at the same time, DIH stops indexing for some
> time, and by the time it re-starts MySQL would have already discontinued the
> connection. So I am going to increase the wait time out at MySQL side from
> the default 120 to some thing slightly large, to see if that solve the issue
> or not. I would know the result of that approach only after completing one
> full run, which I will update you tomorrow. Mean time I thought of
> validating my approach, and checking with you for any other fix which exist.


This is how I fixed it.  On version 4, this goes in the indexConfig
section.  On 3.x it goes into indexDefaults:

   <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
     <int name="maxThreadCount">4</int>
     <int name="maxMergeCount">4</int>
   </mergeScheduler>
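
For context, here is a minimal sketch of where that block would sit in a 4.x solrconfig.xml. Only the mergeScheduler element comes from the post; the surrounding skeleton is an assumption about a typical file, and every other section is omitted. On 3.x the wrapper element would be indexDefaults instead of indexConfig.

```xml
<config>
  <!-- All other solrconfig.xml sections omitted from this sketch. -->
  <indexConfig>
    <!-- Values as posted above: up to 4 merge threads, up to 4 pending merges. -->
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
      <int name="maxThreadCount">4</int>
      <int name="maxMergeCount">4</int>
    </mergeScheduler>
  </indexConfig>
</config>
```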

A recent jira issue (LUCENE-4661) changed the maxThreadCount to 1 for
better performance, so I'm not sure if both of my changes above are
actually required or if just maxMergeCount will fix it.  I commented on
the issue to find out.

https://issues.apache.org/jira/browse/LUCENE-4661

If I don't get a definitive answer soon, I'll go ahead and test for myself.

Side question: you're already setting batchSize to a negative number, right?
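
For anyone unfamiliar with the setting: batchSize is an attribute on the DIH dataSource element in the data-config file. With the MySQL driver, a value of -1 makes DIH request a fetch size of Integer.MIN_VALUE, which tells Connector/J to stream rows instead of buffering the whole result set in memory. A minimal sketch follows; the URL, database name, user and password are placeholders, not values from this thread.

```xml
<dataConfig>
  <!-- Placeholder connection details; only batchSize="-1" is the point here. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/catalog"
              user="solr"
              password="secret"
              batchSize="-1"/>
</dataConfig>
```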

Thanks,
Shawn


Re: DIH fails after processing roughly 10million records

Shawn Heisey-4
> A recent jira issue (LUCENE-4661) changed the maxThreadCount to 1 for
> better performance, so I'm not sure if both of my changes above are
> actually required or if just maxMergeCount will fix it.  I commented on
> the issue to find out.

Discussion on the issue has suggested that a maxThreadCount of 1 and a
maxMergeCount of 6 will probably make sure this issue never happens and
that I get the best possible performance for spinning-magnet disks.

I will be testing this theory when I make it into work today.

Thanks,
Shawn




Re: DIH fails after processing roughly 10million records

vijeshnair
In reply to this post by Shawn Heisey-4
Yes Shawn, batchSize is already -1, and my mergeScheduler settings are exactly the same as the ones you posted. When I hit this problem on Solr 3.4, I googled extensively, gathered tweaks and tuning advice from various blogs and forums, and applied them to the 4.0 instance. My next full run is scheduled for this weekend; I will try a higher MySQL wait_timeout value and report the outcome.

Re: DIH fails after processing roughly 10million records

Shawn Heisey-4
On 1/8/2013 11:19 PM, vijeshnair wrote:
> Yes Shawn, the batchSize is -1 only and I also have the mergeScheduler
> exactly same as you mentioned.  When I had this problem in SOLR 3.4, I did
> an extensive googling and gathered much of the tweaks and tuning from
> different blogs and forums and configured the 4.0 instance. My next full run
> is scheduled for this weekend, I will try with a higher mysql wait_timeout
> value and update you the outcome.

With maxThreadCount at 1 and maxMergeCount at 6, I was able to complete
full-import with no problems.  All mysql (5.1.61) server-side timeouts
are at their defaults - they don't show up in my.cnf and I haven't
tweaked them anywhere else either.

A full import for me consists of six simultaneous imports into six Solr
cores, each of which is over 12 million rows.  It takes three hours, and
each of those six imports creates a 16GB index on Solr 4.1-SNAPSHOT,
22GB on Solr 3.5.0.  There is a seventh import as well, but it only does
a few hundred thousand rows.  That one finishes before any major merging
takes place.

Thanks,
Shawn


Re: DIH fails after processing roughly 10million records

Shawn Heisey-4
On 1/9/2013 9:41 AM, Shawn Heisey wrote:

> With maxThreadCount at 1 and maxMergeCount at 6, I was able to complete
> full-import with no problems.  All mysql (5.1.61) server-side timeouts
> are at their defaults - they don't show up in my.cnf and I haven't
> tweaked them anywhere else either.
>
> A full import for me consists of six simultaneous imports into six Solr
> cores, each of which is over 12 million rows.  It takes three hours, and
> each of those six imports creates a 16GB index on Solr 4.1-SNAPSHOT,
> 22GB on Solr 3.5.0.  There is a seventh import as well, but it only does
> a few hundred thousand rows.  That one finishes before any major merging
> takes place.

Full timeout info:

mysql> SHOW SESSION VARIABLES LIKE '%timeout%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| connect_timeout            | 10    |
| delayed_insert_timeout     | 300   |
| innodb_lock_wait_timeout   | 50    |
| innodb_rollback_on_timeout | OFF   |
| interactive_timeout        | 28800 |
| net_read_timeout           | 30    |
| net_write_timeout          | 60    |
| slave_net_timeout          | 3600  |
| table_lock_wait_timeout    | 50    |
| wait_timeout               | 28800 |
+----------------------------+-------+
10 rows in set (0.00 sec)


Re: DIH fails after processing roughly 10million records

Lance Norskog-2
In reply to this post by vijeshnair
At this scale, your indexing job is prone to break in various ways.
If you want this to be reliable, it should be able to restart in the
middle of an upload, rather than starting over.
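
One possible way to get a restartable import with DIH, sketched under assumptions: order the root query by primary key and take the starting key from a request parameter, which DIH exposes as ${dataimporter.request.<name>}. The table name "products", its columns, the parameter name "startId" and the connection details are all hypothetical, and the sketch assumes a numeric id column.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/catalog"
              user="solr" password="secret" batchSize="-1"/>
  <document>
    <!-- Hypothetical table and columns. ORDER BY id makes the last
         indexed id a usable resume point for the next run. -->
    <entity name="product"
            query="SELECT id, name FROM products
                   WHERE id &gt; '${dataimporter.request.startId}'
                   ORDER BY id"/>
  </document>
</dataConfig>
```

A first run would pass startId=0. After a failure, the import could be restarted with something like command=full-import&clean=false&startId=<last id known to be committed>, where clean=false keeps the documents that are already in the index.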

On 01/08/2013 10:19 PM, vijeshnair wrote:

> Yes Shawn, the batchSize is -1 only and I also have the mergeScheduler
> exactly same as you mentioned.  When I had this problem in SOLR 3.4, I did
> an extensive googling and gathered much of the tweaks and tuning from
> different blogs and forums and configured the 4.0 instance. My next full run
> is scheduled for this weekend, I will try with a higher mysql wait_timeout
> value and update you the outcome.


Re: DIH fails after processing roughly 10million records

vijeshnair
First of all, thanks to Shawn and all of you folks. The good news is that my full indexing now works fine, and it's pretty fast too. The entire catalog of 12.5 million records, roughly 11GB, was indexed in less than 6 hours on my Windows development machine, a quad-core Windows 7 PC with 4GB of RAM. In fact I wanted to share the memory snapshot with you; it's pretty awesome. The heap I allocated to the Solr Tomcat was only 1GB, and at no point did usage cross the halfway mark.

The major changes I made for this run are the following:

- Previously my DIH config file had roughly 6 sub-entities configured under the root entity, so it was creating many additional connections and DB look-ups. I have removed the sub-entities and rewritten them as one big query, configured on the root entity only.

- No changes were made on the MySQL side, such as increasing wait_timeout.

- In addition, I am using the following values for indexConfig in solrconfig.xml. I am listing only the modified properties; assume the Solr defaults for the rest.

<ramBufferSizeMB>128</ramBufferSizeMB>
<commitLockTimeout>10000</commitLockTimeout>
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">35</int>
  <int name="segmentsPerTier">35</int>
</mergePolicy>

I have yet to evaluate the indexing performance with the Solr default values for the above fields.

As Shawn suggested, I have used the following values for the ConcurrentMergeScheduler:

<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <int name="maxMergeCount">6</int>
  <int name="maxThreadCount">1</int>
</mergeScheduler>
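
The post does not show the rewritten query, so the following is only a sketch of one way the first change above (folding sub-entities into the root query) might look. The product_retailers table and its mpn and product_id columns appear in the stack trace earlier in the thread; the "products" table, its columns and the connection details are hypothetical. It uses MySQL's GROUP_CONCAT to collapse the child rows and DIH's RegexTransformer splitBy to turn them back into a multi-valued field.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/catalog"
              user="solr" password="secret" batchSize="-1"/>
  <document>
    <!-- Before (one extra query per product for each sub-entity):
         <entity name="product" query="SELECT id, name FROM products">
           <entity name="retailer"
                   query="SELECT DISTINCT mpn FROM product_retailers
                          WHERE product_id = '${product.id}'"/>
         </entity>
    -->
    <!-- After: a single joined query on the root entity. -->
    <entity name="product" transformer="RegexTransformer"
            query="SELECT p.id, p.name,
                          GROUP_CONCAT(DISTINCT r.mpn SEPARATOR '|') AS mpn
                   FROM products p
                   LEFT JOIN product_retailers r ON r.product_id = p.id
                   GROUP BY p.id, p.name">
      <!-- Split the concatenated values back into a multi-valued field. -->
      <field column="mpn" splitBy="\|"/>
    </entity>
  </document>
</dataConfig>
```

One caveat with this approach: MySQL truncates GROUP_CONCAT output at group_concat_max_len (1024 bytes by default), so products with many child rows may need that limit raised.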

Re: DIH fails after processing roughly 10million records

Erick Erickson
Thanks for closing this off and showing what your changes were that made
the difference. I'm sure others will find this very useful!

Erick


On Fri, Jan 11, 2013 at 12:56 AM, vijeshnair <[hidden email]> wrote:

> First of all thanks to Shawn and all of you folks. The good news is my full
> indexing started working fine, and it's pretty fast too. The entire 12.5
> million of catalog data i.e. roughly 11GB got indexed in less than 6 hrs on
> my windows development machine, which is a quad core, 4gb windows 7 PC.
> Infact I wanted to share the memory snapshot with you guyz, it's pretty
> awesome. The heap which I have allocated for the SOLR tomcat was only 1GB,
> and at any point of time it didn't cross half mark of that.
>
> Now the major changes which I have made during this run are the following
>
> - Previously my DIH config file had roughly 6 sub entities configured under
> the root entity, so obviously it was creating so many additional
> connections
> and db look-ups. I have deleted the sub entities and wrote it as one big
> query, which is configured in the root entity only.
>
> - No changes were made in the MySQL side, like increasing wait_timeout etc.
>
> - In addition to this I am using the following values for indexConfig in
> the
> solrconfig, I am mentioning only the modified properties, rest of them you
> can assume as the SOLR default.
>
> <ramBufferSizeMB>128</ramBufferSizeMB>
> <commitLockTimeout>10000</commitLockTimeout>
> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>           <int name="maxMergeAtOnce">35</int>
>           <int name="segmentsPerTier">35</int>
>  </mergePolicy>
>
> I am yet to evaluate the indexing performance with the SOLR default values
> for the above fields.
>
> As Shawn I suggested, I have used the following values for the
> ConcurrentMergeScheduler
> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
>           <int name="maxMergeCount">6</int>
>           <int name="maxThreadCount">1</int>
> </mergeScheduler>