NullPointerException in DataImportHandler

NullPointerException in DataImportHandler

Andrew Clegg
First of all, apologies if you get this twice. I posted it by email an hour ago but it hasn't appeared in any of the archives, so I'm worried it's got junked somewhere.

I'm trying to use a DataImportHandler to merge some data from a database with some other fields from a collection of XML files, rather like the example in the Architecture section here:

http://wiki.apache.org/solr/DataImportHandler

... so a given document is built from some fields from the database and some from the XML.

My dataconfig.xml looks like this:


<dataConfig>
   <dataSource name="database" driver="org.postgresql.Driver" url="jdbc:postgresql://cathdb.info/cathdb_v3_3_0" user="cathreader" password="cathreader" />

   <dataSource name="filesystem" type="FileDataSource" basePath="/cath/people/cathdata/v3_3_0/pdb-XML-noatom/" encoding="UTF-8" connectionTimeout="5000" readTimeout="10000"/>

   <document name="domain">

       <entity name="domain" dataSource="database" query="select domain_id as id, 'PDB code ' || pdb_code || ', chain ' || chain_code || ', domain ' || domain_code as title, 'some keywords go here' as keywords, pdb_code || ' ' || chain_id as related_ids, 'domain' as doc_type, pdb_code from domain">

           <entity dataSource="filesystem" name="domain_pdb" url="${domain.pdb_code}-noatom.xml">
               <field column="content" xpath="//*[local-name()='structCategory']/*[local-name()='struct']/*[local-name()='title']" />
           </entity>

       </entity>

   </document>
</dataConfig>


This works if I comment out the inner entity, but when I uncomment it, I get this error:


30-Jul-2009 14:32:50 org.apache.solr.handler.dataimport.DocBuilder buildDocument
SEVERE: Exception while processing: domain document : SolrInputDocument[{id=id(1.0)={1s32D00}, title=title(1.0)={PDB code 1s32, chain D, domain 00}, keywords=keywords(1.0)={some keywords go here}, pdb_code=pdb_code(1.0)={1s32}, doc_type=doc_type(1.0)={domain}, related_ids=related_ids(1.0)={1s32 1s32D}}]
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
       at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:64)
       at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71)
       at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:344)
       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:372)
       at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225)
       at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
       at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
       at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393)
       at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
Caused by: java.lang.NullPointerException
       at java.io.File.<init>(File.java:222)
       at org.apache.solr.handler.dataimport.FileDataSource.getData(FileDataSource.java:75)
       at org.apache.solr.handler.dataimport.FileDataSource.getData(FileDataSource.java:44)
       at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58)
       ... 9 more


I have checked that the file /cath/people/cathdata/v3_3_0/pdb-XML-noatom/1s32-noatom.xml is readable, so maybe the full path to the file isn't being constructed properly or something?

I also tried with the full path template for the file in the entity url attribute, instead of using a basePath in the dataSource, but I get exactly the same exception.

This is with the 2009-07-30 nightly build. The schema is attached: http://www.nabble.com/file/p24739580/schema.xml

Any ideas? Thanks in advance!

Andrew.


--
:: http://biotext.org.uk/ ::

Re: NullPointerException in DataImportHandler

Chantal Ackermann
Hi Andrew,

Your inner entity uses a file-based XML data source, but the default
entity processor is the SQL one (SqlEntityProcessor, as your stack trace shows).

For your inner entity, you have to specify the correct entity processor
explicitly. You do that by adding the attribute "processor", whose
value is the class name of the processor you want to use,

e.g. <entity dataSource="filesystem" name="domain_pdb"
processor="XPathEntityProcessor" ....

(See the Wikipedia example on the DataImportHandler wiki page.)

Cheers,
Chantal


Re: NullPointerException in DataImportHandler

Andrew Clegg
<quote author="Chantal Ackermann">
Hi Andrew,

your inner entity uses an XML type datasource. The default entity
processor is the SQL one, however.

For your inner entity, you have to specify the correct entity processor
explicitly. You do that by adding the attribute "processor", and the
value is the classname of the processor you want to use.

e.g. <entity dataSource="filesystem" name="domain_pdb"
processor="XPathEntityProcessor" ....
</quote>

Thanks -- I was also missing a forEach expression -- in my case, just "/" since each XML file contains the information for no more than one document.

However, I'm now getting a different exception:


30-Jul-2009 16:48:52 org.apache.solr.handler.dataimport.DocBuilder buildDocument
SEVERE: Exception while processing: domain document : SolrInputDocument[{id=id(1.0)={1udaA02}, title=title(1.0)={PDB code 1uda, chain A, domain 02}, pdb_code=pdb_code(1.0)={1uda}, doc_type=doc_type(1.0)={domain}, related_ids=related_ids(1.0)={1uda,1udaA}}]
org.apache.solr.handler.dataimport.DataImportHandlerException: Exception while reading xpaths for fields Processing Document # 1
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:135)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.init(XPathEntityProcessor.java:76)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:71)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:307)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:372)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.LinkedList.entry(LinkedList.java:365)
        at java.util.LinkedList.get(LinkedList.java:315)
        at org.apache.solr.handler.dataimport.XPathRecordReader.addField0(XPathRecordReader.java:71)
        at org.apache.solr.handler.dataimport.XPathRecordReader.<init>(XPathRecordReader.java:50)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initXpathReader(XPathEntityProcessor.java:121)
        ... 9 more


My data config now looks like this:


<dataConfig>

    <!-- TODO  change this back to v3.3.0 when the appropriate mapping tables are available there -->

    <dataSource name="database" driver="org.postgresql.Driver" url="jdbc:postgresql://cathdb.info/cathdb_v3_2_0" user="***" password="***" />

    <dataSource name="filesystem" type="FileDataSource" basePath="/cath/people/cathdata/v3_3_0/pdb-XML-noatom/" encoding="UTF-8" connectionTimeout="5000" readTimeout="10000"/>

    <document name="domain">

        <entity name="domain" dataSource="database" query="select domain_id as id, 'PDB code ' || pdb_code || ', chain ' || chain_code || ', domain ' || domain_code as title, pdb_code || ',' || chain_id as related_ids, 'domain' as doc_type, pdb_code from domain">

            <entity dataSource="filesystem" name="domain_pdb" url="${domain.pdb_code}-noatom.xml" processor="XPathEntityProcessor" forEach="/">
                <field column="content" xpath="//*[local-name()='structCategory']/*[local-name()='struct']/*[local-name()='title']" />
            </entity>


        </entity>

    </document>

</dataConfig>


Thanks in advance, again :-)

Andrew.

Re: NullPointerException in DataImportHandler

Erik Hatcher

On Jul 30, 2009, at 11:54 AM, Andrew Clegg wrote:
>            <entity dataSource="filesystem" name="domain_pdb"
> url="${domain.pdb_code}-noatom.xml" processor="XPathEntityProcessor"
> forEach="/">
>                <field column="content"
> xpath="//*[local-name()='structCategory']/*[local-name()='struct']/
> *[local-name()='title']"
> />

The XPathEntityProcessor doesn't support an XPath that fancy; it
supports only a limited subset. Perhaps try /structCategory/struct/title?

        Erik


Re: NullPointerException in DataImportHandler

Chantal Ackermann
In reply to this post by Andrew Clegg
Hi Andrew,

my experience with XPathEntityProcessor is non-existent. ;-)

Just after a quick look at the method that throws the exception:

   private void addField0(String xpath, String name, boolean multiValued,
                          boolean isRecord) {
     List<String> paths = new LinkedList<String>(Arrays.asList(xpath.split("/")));
     if ("".equals(paths.get(0).trim()))
       paths.remove(0);
     rootNode.build(paths, name, multiValued, isRecord);
   }

and your foreach attribute value in combination with the xpath:
 > forEach="/">
 >                 <field column="content"
 >
xpath="//*[local-name()='structCategory']/*[local-name()='struct']/*[local-name()='title']"
 > />

I would guess that the double slash at the beginning of the xpath does
not work together with your forEach expression. I don't know whether this
is something the processor should expect and handle correctly, or whether
you have to take care of it in your configuration.
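
To see what the quoted lines do with different inputs, here is a minimal standalone sketch. It copies only the split-and-trim step from addField0 above; the class name XPathSplitDemo and the helper splitPath are made up for illustration, and it assumes (the thread does not show the caller) that the forEach value goes through the same method as the field xpaths.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; only the three lines inside splitPath are taken
// from the addField0 method quoted above.
final class XPathSplitDemo {

    static List<String> splitPath(String xpath) {
        List<String> paths = new LinkedList<String>(Arrays.asList(xpath.split("/")));
        // Throws IndexOutOfBoundsException when the list is empty.
        if ("".equals(paths.get(0).trim()))
            paths.remove(0);
        return paths;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "/datablock",
            "/datablock/structCategory/struct/title",
            "//structCategory/struct/title",
            "/"
        };
        for (String xpath : inputs) {
            try {
                System.out.println(xpath + " -> " + splitPath(xpath));
            } catch (IndexOutOfBoundsException e) {
                // String.split drops trailing empty strings, so "/" yields
                // an empty array and paths.get(0) has nothing to return.
                System.out.println(xpath + " -> " + e);
            }
        }
    }
}
```

In this sketch the input "/" is the only one that throws, because "/".split("/") returns an empty array. The leading double slash does not throw here; it only leaves an empty first path segment behind.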

Cheers,
Chantal


--
Chantal Ackermann

Re: NullPointerException in DataImportHandler

Andrew Clegg
In reply to this post by Erik Hatcher
Erik Hatcher wrote
On Jul 30, 2009, at 11:54 AM, Andrew Clegg wrote:
>            <entity dataSource="filesystem" name="domain_pdb"
> url="${domain.pdb_code}-noatom.xml" processor="XPathEntityProcessor"
> forEach="/">
>                <field column="content"
> xpath="//*[local-name()='structCategory']/*[local-name()='struct']/
> *[local-name()='title']"
> />

The XPathEntityProcessor doesn't support that fancy of an xpath - it  
supports only a limited subset.  Try /structCategory/struct/title  
perhaps?

Sadly not...

I tried with:

                <field column="content" xpath="/datablock/structCategory/struct/title" />

(full path from root)

and

                <field column="content" xpath="//structCategory/struct/title" />

Same IndexOutOfBoundsException each time.

Doesn't it use javax.xml then? I was using the complex local-name expressions to make it namespace-agnostic -- is it agnostic anyway?
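
For comparison, this is roughly how the local-name() expression behaves under the JDK's own javax.xml.xpath engine on a DOM tree. It is only a sketch of the namespace-agnostic idea, not of anything DataImportHandler does; the class name LocalNameXPathDemo, the sample document and the namespace urn:example are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class using the standard JDK DOM and XPath APIs.
final class LocalNameXPathDemo {

    static final String TITLE_XPATH =
        "//*[local-name()='structCategory']/*[local-name()='struct']/*[local-name()='title']";

    // Parses the XML into a namespace-aware DOM and evaluates the expression
    // as a string; an expression that matches nothing evaluates to "".
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample; the namespace URI is an assumption for illustration.
        String xml = "<p:datablock xmlns:p=\"urn:example\"><p:structCategory><p:struct>"
            + "<p:title>Example title</p:title>"
            + "</p:struct></p:structCategory></p:datablock>";
        // Matches regardless of the namespace.
        System.out.println(evaluate(xml, TITLE_XPATH));
        // Unprefixed names only match elements in no namespace, so this is empty.
        System.out.println(evaluate(xml, "/datablock/structCategory/struct/title"));
    }
}
```

With a full XPath engine the plain path misses namespaced elements, which is why I reached for local-name() in the first place.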

Thanks,

Andrew.

Re: NullPointerException in DataImportHandler

Andrew Clegg
In reply to this post by Chantal Ackermann

Chantal Ackermann wrote
my experience with XPathEntityProcessor is non-existent. ;-)

Don't worry -- your hints put me on the right track :-)

I got it working with:

            <entity dataSource="filesystem" name="domain_pdb" url="${domain.pdb_code}-noatom.xml" processor="XPathEntityProcessor" forEach="/datablock">
                <field column="content" xpath="/datablock/structCategory/struct/title" />
            </entity>

Now, to get it to ignore missing files without an error... Hmm...

Cheers,

Andrew.

Re: NullPointerException in DataImportHandler

Erik Hatcher

On Jul 30, 2009, at 12:19 PM, Andrew Clegg wrote:

> Don't worry -- your hints put me on the right track :-)
>
> I got it working with:
>
>            <entity dataSource="filesystem" name="domain_pdb"
> url="${domain.pdb_code}-noatom.xml" processor="XPathEntityProcessor"
> forEach="/datablock">
>                <field column="content"
> xpath="/datablock/structCategory/struct/title" />
>            </entity>
>
> Now, to get it to ignore missing files without an error... Hmm...

     Set onError on the entity: onError="skip", or "abort", or "continue".

        Erik



Re: NullPointerException in DataImportHandler

Chantal Ackermann
In reply to this post by Andrew Clegg
It's very easy to write your own entity processor. At least, that is my
experience with extending the SqlEntityProcessor to my needs. So maybe
you'd be better off subclassing the XPath processor and handling the
xpath in a way that lets you keep your configuration straightforward.



--
Chantal Ackermann
Consultant

mobil    +49 (176) 10 00 09 45
email    [hidden email]

--------------------------------------------------------------------------------------------------------

b.telligent GmbH & Co. KG
Lichtenbergstraße 8
D-85748 Garching / München

fon       +49 (89) 54 84 25 60
fax        +49 (89) 54 84 25 69
web      www.btelligent.de

Registered in Munich: HRA 84393
Managing Director: b.telligent Verwaltungs GmbH, HRB 153164 represented
by Sebastian Amtage and Klaus Blaschek
USt.Id.-Nr. DE814054803




Re: NullPointerException in DataImportHandler

Noble Paul നോബിള്‍  नोब्ळ्-2
In reply to this post by Andrew Clegg
On Thu, Jul 30, 2009 at 9:45 PM, Andrew Clegg<[hidden email]> wrote:

>
>
> Erik Hatcher wrote:
>>
>>
>> On Jul 30, 2009, at 11:54 AM, Andrew Clegg wrote:
>>>            <entity dataSource="filesystem" name="domain_pdb"
>>> url="${domain.pdb_code}-noatom.xml" processor="XPathEntityProcessor"
>>> forEach="/">
>>>                <field column="content"
>>> xpath="//*[local-name()='structCategory']/*[local-name()='struct']/
>>> *[local-name()='title']"
>>> />
>>
>> The XPathEntityProcessor doesn't support that fancy of an xpath - it
>> supports only a limited subset.  Try /structCategory/struct/title
>> perhaps?
>>
>>
>
> Sadly not...
>
> I tried with:
>
>                <field column="content"
> xpath="/datablock/structCategory/struct/title" />
>
> (full path from root)
>
> and
>
>                <field column="content"
> xpath="//structCategory/struct/title" />
>
> Same ArrayIndex error each time.
>
> Doesn't it use javax.xml then? I was using the complex local-name
> expressions to make it namespace-agnostic -- is it agnostic anyway?

It does not use the javax.xml XPath classes, because those work on a DOM
tree, which is not usable for large XML files.

XPathEntityProcessor supports only a subset of XPath. The supported syntax is given here:

http://wiki.apache.org/solr/DataImportHandler#head-5ced7c797f1014ef6e8326a34c23f541ebbaadf1-2
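
To illustrate why a streaming reader naturally ends up with a simple path subset, here is a small sketch that matches one absolute path against a stream of parse events using the JDK's StAX reader, without ever building a tree. It is not the DataImportHandler code; the class name StreamingPathDemo and the helper textAt are hypothetical, and comparing local names only is a choice made for this sketch.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; a sketch of streaming path matching only.
final class StreamingPathDemo {

    // Returns the text of every element whose path from the root equals
    // the given absolute path, e.g. "/datablock/structCategory/struct/title".
    static List<String> textAt(String xml, String path) {
        List<String> hits = new ArrayList<String>();
        Deque<String> stack = new ArrayDeque<String>(); // open elements, root first
        StringBuilder text = null;                      // non-null while inside a match
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    stack.addLast(reader.getLocalName()); // prefix is ignored here
                    if (path.equals(currentPath(stack))) text = new StringBuilder();
                } else if (event == XMLStreamConstants.CHARACTERS && text != null) {
                    text.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (text != null && path.equals(currentPath(stack))) {
                        hits.add(text.toString().trim());
                        text = null;
                    }
                    stack.removeLast();
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return hits;
    }

    private static String currentPath(Deque<String> stack) {
        return "/" + String.join("/", stack);
    }

    public static void main(String[] args) {
        String xml = "<datablock><structCategory><struct>"
            + "<title>Example title</title>"
            + "</struct></structCategory></datablock>";
        System.out.println(textAt(xml, "/datablock/structCategory/struct/title"));
    }
}
```

Only the stack of currently open elements is kept in memory, so the file size does not matter. Predicates and functions such as local-name() would need more machinery than a plain stack comparison like this one.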





--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com