8.3.0: Invalid UUID String while indexing document with a UUID field


Boris Chazalet

Hi,

I'm running into an issue with Solr 8.3.0: it fails to index documents that have a UUID field.

I'm using a SolrCloud setup with 3 instances, and I'm using the DIH to fetch and index the data from a PostgreSQL database.

In schema.xml I have:

    <field name="myuuidfield" type="uuid" uninvertible="false" indexed="true" stored="true" multiValued="false" required="false"/>
    <fieldType name="uuid" class="solr.UUIDField"/>

The data-config is a simple SELECT; the uuid field is of type UUID in PostgreSQL.

I was running 7.7.2 until now, and first noticed the problem there. But given the number of things around UUIDs fixed in the latest version, I thought I'd try 8.3.0 first. The same problem arises while running the DIH.

Note that I have another core with a uuid field, which I am indexing externally (i.e. not from the DIH) and I haven't had a problem there, so I'm suspecting the problem might be in the DIH logic, but with no certainty.

Below is a truncated version of the exception's stack trace that I see in the logs. I can provide the full one if necessary. Is this a legitimate bug? What can I do to help track down the problem?

2019-11-13 17:29:55.430 ERROR (qtp1990098664-15) [c:db_c s:shard1 r:core_node5 x:db_c_shard1_replica_n2] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: ERROR: [doc=1] Error adding field 'myuuidfield'='java.util.UUID:afa9cf35-0b2d-e811-89a7-0025900429ba' msg=Error while creating field 'myuuidfield{type=uuid,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,useDocValuesAsStored}' from value 'java.util.UUID:afa9cf35-0b2d-e811-89a7-0025900429ba'
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:215)
        at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:109)
        at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:969)
        ...
Caused by: org.apache.solr.common.SolrException: Error while creating field 'myuuidfield{type=uuid,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,useDocValuesAsStored}' from value 'java.util.UUID:4ee3992e-0b2d-e811-89a7-0025900429ba'
        at org.apache.solr.schema.FieldType.createField(FieldType.java:291)
        at org.apache.solr.schema.StrField.createFields(StrField.java:48)
        at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:65)
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:171)
        ... 69 more
Caused by: org.apache.solr.common.SolrException: Invalid UUID String: 'java.util.UUID:4ee3992e-0b2d-e811-89a7-0025900429ba'
        at org.apache.solr.schema.UUIDField.toInternal(UUIDField.java:88)
        at org.apache.solr.schema.FieldType.createField(FieldType.java:289)
        ... 72 more

Kind Regards,
Boris

This message is intended only for the addressee and unless otherwise stated is commercial in confidence and may contain information that is privileged.  Where all recipients are in the companywatch.net domain, this communication is classified as Confidential.  Unauthorised use is strictly prohibited and may be unlawful. If you are not the addressee, you should not read, copy, disclose or otherwise use this message, except for the purpose of delivery to the addressee. If you have received this in error, please delete and advise us immediately. Although Company Watch makes every reasonable effort to keep its network and systems free from viruses, the company accepts no responsibility for computer viruses transmitted through this mail or in any attachments. It is your responsibility to virus scan any attachments we send to you. 

Company Watch Limited is a company registered in England & Wales with company number 3597613
Centurion House, 37 Jewry Street, London, EC3N 2ER

Please consider the environment before printing this email
  


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
A few things I forgot to mention:
- I'm running on Java 8.
- The collection where the problem happens is made of 3 shards with 3 NRT replicas each.


--

Boris Chazalet
Senior developer and problem solver

T: +44 (0)20 3740 9402
E: [hidden email]






Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
Also, if I manually add the document from the collection's Documents tab in the UI (using the /update handler with the CSV document type), it just works.



Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Jörn Franke
In reply to this post by Boris Chazalet
It seems there is a prefix java.util.UUID: in front of your UUID. Any idea where it comes from? Is it also like this in the database? Maybe your import handler is receiving a Java object (java.util.UUID) that is not being converted to a string correctly?
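
To make the diagnosis above concrete, here is a minimal sketch in plain Java (no Solr involved) showing that a canonical UUID string parses, while the same value with a class-name prefix does not. The class name UuidPrefixDemo and the helper parses are made up for illustration, and the prefixed string is only built to mimic the value seen in the log; this is not taken from Solr's code.

```java
import java.util.UUID;

public class UuidPrefixDemo {
    // Hypothetical helper: true if the text is accepted by UUID.fromString.
    public static boolean parses(String text) {
        try {
            UUID.fromString(text);
            return true;
        } catch (IllegalArgumentException e) {
            // fromString rejects anything that is not a canonical UUID string.
            return false;
        }
    }

    public static void main(String[] args) {
        UUID value = UUID.fromString("afa9cf35-0b2d-e811-89a7-0025900429ba");

        // What a UUID field can accept: the plain string form.
        String plain = value.toString();

        // Mimics the value from the log: class name, a colon, then the UUID.
        String prefixed = value.getClass().getName() + ":" + value;

        System.out.println(plain + " parses: " + parses(plain));
        System.out.println(prefixed + " parses: " + parses(prefixed));
    }
}
```

If the value reaches the field type as an object rather than as its plain string form, something along the way has to call toString() on it before it is parsed.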


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
Thanks for your response, Jörn. Yes, I saw the prefix and I suspect it is the problem. But I'm not doing anything special in the DIH config; here is a minimal version of it:

<dataConfig>
    <dataSource type="JdbcDataSource"
                driver="org.postgresql.Driver"
                url="jdbc:postgresql://redacted"
                readOnly="false" autoCommit="false" transactionIsolation="TRANSACTION_READ_COMMITTED" holdability="CLOSE_CURSORS_AT_COMMIT" application_name="solr" prepareThreshold="0" />
    <document>
        <entity name="index" pk="cinumber" transformer="RegexTransformer"
                query="
                SELECT myuuidfield, mypk
                FROM {{solr__datasource}}
                WHERE
                    '${dataimporter.request.clean}' != 'false' OR lastmodified  > '${dataimporter.last_index_time}'::timestamp - '3 hours'::interval
                ">
            <field column="myuuidfield" name="myuuidfield" />
            <field column="mypk" name="mypk" />
        </entity>
    </document>
</dataConfig>

The field is a UUID in the database, so it's definitely valid and has no prefix. Where in the code can I check for myself how the DataImportHandler serialises a UUID?




Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
I dug a little into the dataimport code, and there's a special case for BigDecimal in the JdbcDataSource class.

I believe we might need the same kind of logic for a UUID object coming directly from the JDBC driver.
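
As a rough sketch of the kind of logic meant here, assuming a row is a map of column names to the objects returned by the driver: any java.util.UUID value is turned into its plain string form before the row is handed on. RowValueNormalizer, normalize and normalizeRow are hypothetical names for illustration only; this is not the actual JdbcDataSource code, and it does not reproduce the existing BigDecimal handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public class RowValueNormalizer {
    // Hypothetical helper (not part of Solr): convert a UUID object to its
    // canonical string form and leave every other value untouched.
    public static Object normalize(Object value) {
        if (value instanceof UUID) {
            return value.toString();
        }
        return value;
    }

    // Applies normalize() to every column of a row, keeping column order.
    public static Map<String, Object> normalizeRow(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            out.put(entry.getKey(), normalize(entry.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a row read from the database; values are illustrative.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("myuuidfield", UUID.fromString("4ee3992e-0b2d-e811-89a7-0025900429ba"));
        row.put("mypk", 1);

        System.out.println(normalizeRow(row));
    }
}
```

Where exactly such a conversion would belong in the DIH code is an open question; the sketch only shows the conversion itself.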


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
If that's right, I'd be happy to try to provide a bug fix, although I do not know the first thing about contributing to the Solr project.

On Thu, 14 Nov 2019 at 14:09, Boris Chazalet <[hidden email]> wrote:
I dug a little in the dataimport code, and there's a special case for BigDecimal in the JdbcDataSource class, here exactly: 

I believe we might need the same kind of logic for a UUID object coming directly from the jdbc driver.

On Thu, 14 Nov 2019 at 13:46, Boris Chazalet <[hidden email]> wrote:
Thanks for your response, Jörn. Yes, I saw the prefix and I suspect this is the problem. But I'm not doing anything special in the DIH config; this is a minimal version of it:

<dataConfig>
    <dataSource type="JdbcDataSource"
                driver="org.postgresql.Driver"
                url="jdbc:postgresql://redacted"
                readOnly="false" autoCommit="false" transactionIsolation="TRANSACTION_READ_COMMITTED" holdability="CLOSE_CURSORS_AT_COMMIT" application_name="solr" prepareThreshold="0" />
    <document>
        <entity name="index" pk="cinumber" transformer="RegexTransformer"
                query="
                SELECT myuuidfield, mypk
                FROM {{solr__datasource}}
                WHERE
                    '${dataimporter.request.clean}' != 'false' OR lastmodified  > '${dataimporter.last_index_time}'::timestamp - '3 hours'::interval
                ">
            <field column="myuuidfield" name="myuuidfield" />
            <field column="mypk" name="mypk" />
        </entity>
    </document>
</dataConfig>

The field is a UUID in the database, so it's definitely valid and without a prefix. Where in the code can I check for myself how the DataImportHandler serialises a UUID?



On Thu, 14 Nov 2019 at 13:38, Jörn Franke <[hidden email]> wrote:
It seems there is a prefix java.util.UUID: in front of your UUID. Any idea where it comes from? Is it also like this in the database? Is your import handler perhaps receiving a java.util.UUID Java object that is not converted correctly to a string?


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Jörn Franke
In reply to this post by Boris Chazalet
You can use a script update processor to do this kind of post-processing, or you can cast the field to a string in your SQL statement.


> On 14.11.2019 at 14:09, Boris Chazalet <[hidden email]> wrote:
>
> I dug a little in the dataimport code, and there's a special case for BigDecimal in the JdbcDataSource class, here exactly:
> https://github.com/apache/lucene-solr/blob/faaee86efb01fa6e431fcb129cfb956c7d62d514/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/JdbcDataSource.java#L403
>
> I believe we might need the same kind of logic for a UUID object coming directly from the jdbc driver.
>

Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
I'm currently re-indexing with the cast to string in the SQL statement. It looks good so far.


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Erick Erickson
But now your uuid fields will look like this, right?

java.util.UUID:4ee3992e-0b2d-e811-89a7-0025900429ba

This looks like somewhere in DIH it’s doing a cast from an object….

Which will be a real head-scratcher for anyone looking at these. There are three other alternatives I can think of:

1> make your SQL statement output this as some kind of string.
2> The aforementioned ScriptUpdateProcessor can transform this into "4ee3992e-0b2d-e811-89a7-0025900429ba" with your favorite scripting language
3> use a PatternReplaceCharFilter to transform this before it gets to the indexing process. I’m not totally sure this’ll work, I’m not sure where this check is done, but it’d be the easiest if it does.
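
As an illustration of the transformation described in option 2, here is a minimal sketch of the string manipulation only. It does not use any Solr API: the class `UuidPrefixStripper` and its `strip` method are hypothetical, and how such logic would be wired into an update processor or script is left open.

```java
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical helper, NOT a Solr class: strips a leading
// "java.util.UUID:" class-name prefix from a field value.
final class UuidPrefixStripper {

    // Anchored at the start so only a leading prefix is removed.
    private static final Pattern CLASS_PREFIX = Pattern.compile("^java\\.util\\.UUID:");

    static String strip(String raw) {
        return CLASS_PREFIX.matcher(raw).replaceFirst("");
    }

    public static void main(String[] args) {
        String cleaned = strip("java.util.UUID:4ee3992e-0b2d-e811-89a7-0025900429ba");
        // UUID.fromString throws IllegalArgumentException if the value is still malformed.
        UUID parsed = UUID.fromString(cleaned);
        System.out.println(parsed);
    }
}
```

Values that carry no prefix pass through unchanged, so the same step could in principle be applied to every document regardless of how it was produced.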

Best,
Erick



> On Nov 14, 2019, at 8:17 AM, Boris Chazalet <[hidden email]> wrote:
>
> java.util.UUID:4ee3992e-0b2d-e811-89a7-0025900429ba


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
Thanks both for the advice.

Erick, which message were you referring to when you said "But now your uuid fields will look like this, right?"?

I finished indexing my 45 million documents successfully by casting the UUID in the SQL itself like this (that's for a postgres db):
SELECT myuuidfield::text, mypk FROM {{solr__datasource}}

I'm happy with the workaround, as I can keep my UUID type in the Solr schema and in my database, but it still feels like something in the data import logic isn't handling the UUID type coming from the JDBC driver correctly.


Re: 8.3.0: Invalid UUID String while indexing document with a UUID field

Boris Chazalet
One last thing to note: this doesn't happen on the standalone 8.3.0 version of Solr, which is where I did my preliminary testing without any problems.
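
One possible explanation for the standalone/SolrCloud difference, offered as an assumption rather than a confirmed diagnosis: in SolrCloud a document may be forwarded from one node to another in serialized form, and a serializer that has no dedicated handling for `java.util.UUID` could fall back to writing the class name, a colon, and the object's `toString()`. The sketch below only reproduces that assumed fallback and shows that the resulting string matches the one in the stack trace and no longer parses as a UUID. `UuidFallbackDemo` and its methods are hypothetical names, and `UUID.fromString` is used as a stand-in for Solr's own validation.

```java
import java.util.UUID;

// Hypothetical demo, NOT Solr code: reproduces the assumed
// "class name + ':' + toString()" fallback for an unknown object type.
final class UuidFallbackDemo {

    // Assumed fallback representation for types a serializer does not know.
    static String classNamePlusToString(Object value) {
        return value.getClass().getName() + ":" + value;
    }

    // Stand-in validity check based on java.util.UUID parsing.
    static boolean looksLikeUuid(String s) {
        try {
            UUID.fromString(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        UUID original = UUID.fromString("4ee3992e-0b2d-e811-89a7-0025900429ba");
        String serialized = classNamePlusToString(original);
        System.out.println(serialized);                           // java.util.UUID:4ee3992e-0b2d-e811-89a7-0025900429ba
        System.out.println(looksLikeUuid(serialized));            // false
        System.out.println(looksLikeUuid(original.toString()));   // true
    }
}
```

If this assumption holds, it would also be consistent with the `::text` cast working: a plain string needs no fallback and arrives unchanged.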
