Solr UIMA integration

Tommaso Teofili
Hi all,
I am working on integrating Apache UIMA as an UpdateRequestProcessor for
Apache Solr, and I now have a first working snapshot.
I put the code on Google Code [1], and you can take a look at the tutorial
[2].

I would be glad to donate it to the Apache Solr project, as I think it could
be a useful module to trigger automatic content extraction while indexing
documents.

At the moment the base UIMAUpdateRequestProcessor implementation can
automatically extract a document's sentences, language, keywords, concepts and
named entities using Apache UIMA's HMMTagger, OpenCalaisAnnotator and
AlchemyAPIAnnotator components (and it can easily be extended).

Any feedback is welcome.
Have a nice day.
Tommaso

[1] : http://code.google.com/p/solr-uima/
[2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial

Re: Solr UIMA integration

Jan Høydahl / Cominvent
Hi Tommaso,

Really cool what you've done. Looking forward to testing it, and I'm sure it's a welcome contribution to Solr.
You can easily contribute your code by opening a JIRA issue and attaching a patch file.

BTW, have you considered making the output field names configurable on a per-instance basis? It could be done as follows:
<processor class="org.apache.solr.uima.processor.UIMAProcessorFactory">
  <str name="concept_field">concept</str>
  <str name="language_field">language</str>
  <str name="keyword_field">keyword</str>
  ...
</processor>
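
A minimal sketch of how such per-instance settings could be resolved against defaults. It is plain Java and uses no Solr classes; the class name `OutputFieldNames`, the `resolve` method and the default values are all hypothetical, chosen only to mirror the parameter names in the snippet above. It does not show how solr-uima actually reads its configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of solr-uima: merges configured output
// field names with a set of assumed defaults.
class OutputFieldNames {

    // Assumed defaults, for illustration only.
    private static final Map<String, String> DEFAULTS = new LinkedHashMap<>();
    static {
        DEFAULTS.put("concept_field", "concept");
        DEFAULTS.put("language_field", "language");
        DEFAULTS.put("keyword_field", "keyword");
    }

    // Returns the defaults, overridden by any known, non-blank configured value.
    // Unknown parameter names are ignored.
    static Map<String, String> resolve(Map<String, String> configured) {
        Map<String, String> resolved = new LinkedHashMap<>(DEFAULTS);
        for (Map.Entry<String, String> entry : configured.entrySet()) {
            String value = entry.getValue();
            boolean known = DEFAULTS.containsKey(entry.getKey());
            if (known && value != null && !value.trim().isEmpty()) {
                resolved.put(entry.getKey(), value.trim());
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> configured = new LinkedHashMap<>();
        configured.put("concept_field", "uima_concept");
        System.out.println(resolve(configured));
    }
}
```

In a real factory the configured map would come from the `<str>` elements of the processor definition; keeping the defaults in one place means an instance only needs to list the names it wants to change.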

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com


Re: Solr UIMA integration

gearond
In reply to this post by Tommaso Teofili
Looks like a great scraping engine technology :-)
Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php



Re: Solr UIMA integration

Tommaso Teofili
In reply to this post by Jan Høydahl / Cominvent
2010/9/20 Dennis Gearon <[hidden email]>

> Looks like a great scraping engine technology :-)
> Dennis Gearon

2010/9/20 Jan Høydahl / Cominvent <[hidden email]>

> Really cool what you've done. Looking forward to testing it, and I'm sure
> it's a welcome contribution to Solr.
> You can easily contribute your code by opening a JIRA issue and attaching a
> patch file.
>

Thanks Dennis and Jan, I am happy you appreciate it.
I will make the patch and open the related issue.


>
> BTW
> Have you considered making the output field names configurable on a per
> instance basis? It could be done as follows:
> <processor class="org.apache.solr.uima.processor.UIMAProcessorFactory">
>  <str name="concept_field">concept</str>
>  <str name="language_field">language</str>
>  <str name="keyword_field">keyword</str>
>  ...
> </processor>
>
>
Thanks for this nice suggestion, I will put it in the TODO list :-)
Regards,
Tommaso

Re: Solr UIMA integration

Grant Ingersoll-2
In reply to this post by Tommaso Teofili

On Sep 20, 2010, at 6:35 AM, Tommaso Teofili wrote:

> Hi all,
> I am working on integrating Apache UIMA as an UpdateRequestProcessor for
> Apache Solr and I am now at the first working snapshot.
> I put the code on GoogleCode [1] and you can take a look at the tutorial
> [2].
>
> I would be glad to donate it to the Apache Solr project,

I think this would be a great addition.


--------------------------
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8


Re: Solr UIMA integration

maheshkumar
In reply to this post by Tommaso Teofili
I have configured Solr and UIMA as described by you.
I also have the following dependency jars:
AlchemyAPIAnnotator.jar
commons-beanutils-1.7.0.jar
commons-digester-2.0.jar
commons-lang-2.4.jar
OpenCalaisAnnotator.jar
slf4j-api-1.5.5.jar
slf4j-jdk14-1.5.5.jar
solr-uima.jar
Tagger.jar
uima-core.jar
WhitespaceTokenizer.jar


But I am getting this error:
SEVERE: java.lang.RuntimeException: org.apache.uima.resource.ResourceInitializationException
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:67)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.uima.resource.ResourceInitializationException
        at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:68)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processFieldValue(UIMAUpdateRequestProcessor.java:88)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:59)
        ... 19 more
Caused by: java.lang.IllegalStateException: unread block data
        at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2375)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1361)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1945)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1869)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
        at org.apache.uima.examples.tagger.ModelResource.load(ModelResource.java:59)
        at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:584)
        at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:423)
        at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:146)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:125)
        at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
        at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
        at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:267)
        at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:361)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
        at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
        at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
        at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:267)
        at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:361)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
        at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
        at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
        at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:267)
        at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:335)
        at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:60)
        ... 21 more

Request your inputs on this.
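
When reading a trace like the one above, the innermost "Caused by" entry (here `java.lang.IllegalStateException: unread block data`, raised under `ModelResource.load`) is usually the most informative one. A small sketch of extracting that innermost cause programmatically follows; the class name `RootCause` is hypothetical and the exceptions in `main` are stand-ins built only to imitate the shape of the chain.

```java
// Hypothetical utility: walks an exception's cause chain to its innermost cause.
class RootCause {

    static Throwable of(Throwable throwable) {
        Throwable current = throwable;
        // Stop at the last cause; the second check guards against a
        // throwable that reports itself as its own cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in chain with the same nesting depth as the trace above.
        Throwable wrapped = new RuntimeException("outer",
                new IllegalArgumentException("middle",
                        new IllegalStateException("unread block data")));
        System.out.println(of(wrapped));
    }
}
```

Logging the root cause next to the wrapping exception makes it easier to see at a glance which component failed.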

Re: Solr UIMA integration

Tommaso Teofili
Hi Maheshkumar,
I have never seen this one before. Which version of the UIMA dependencies
(uima-core, AlchemyAPIAnnotator, OpenCalaisAnnotator, Tagger,
WhitespaceTokenizer) are you using? It should be 2.3.1-SNAPSHOT.
Which version of Solr?
It seems the Tagger fails while reading its model (which it uses to generate
POS tags and sentences), so it may be a resource loading issue. Which
container or application server are you running Solr on?
Regards,
Tommaso
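
One way to check the resource loading hypothesis is to ask the class loader whether it can see a given resource at all. The sketch below is only an assumption about how one might test that: the class name `ResourceCheck` is hypothetical, and the resource path must be supplied by whoever runs it, since the Tagger's model path is not given in this thread.

```java
import java.net.URL;

// Hypothetical diagnostic: reports whether a resource path is visible
// to the current thread's context class loader.
class ResourceCheck {

    // Returns the URL the class loader resolves for the path, or null if not found.
    static URL locate(String resourcePath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that loaded this class.
            loader = ResourceCheck.class.getClassLoader();
        }
        return loader.getResource(resourcePath);
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: ResourceCheck <resource/path/on/classpath>");
            return;
        }
        URL url = locate(args[0]);
        System.out.println(args[0] + " -> " + (url == null ? "NOT FOUND" : url.toString()));
    }
}
```

Inside a servlet container the result depends on which class loader the code runs under, so a check like this is only meaningful when executed within the same web application as Solr.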



Re: Solr UIMA integration

maheshkumar
Hi Tommaso,

All UIMA dependencies (uima-core, AlchemyAPIAnnotator, OpenCalaisAnnotator, Tagger, WhitespaceTokenizer) are 2.3.1-SNAPSHOT. All were checked out from svn:

AlchemyAPIAnnotator: http://svn.apache.org/repos/asf/uima/sandbox/trunk/AlchemyAPIAnnotator
OpenCalaisAnnotator: http://svn.apache.org/repos/asf/uima/sandbox/trunk/OpenCalaisAnnotator
Tagger: http://svn.apache.org/repos/asf/uima/sandbox/trunk/Tagger
WhitespaceTokenizer: http://svn.apache.org/repos/asf/uima/sandbox/trunk/WhitespaceTokenizer

solr-uima: http://solr-uima.googlecode.com/svn/trunk/solr-uima

I am using the latest Solr version checked out from svn; I guess it is later than 1.4.1.

Tommaso, could you upload all the dependency jars to http://code.google.com/p/solr-uima/downloads/list?

Thanks
Mahesh




Re: Solr UIMA integration

Tommaso Teofili
Hi Maheshkumar,
I attached a patch for including this project as a Solr contrib module
[1]. There you can find the patch to apply to the Solr trunk, along with
the needed jars (attached as a zip archive).
I think your issue could be related to the fact that the Google Code project
depends on Solr 1.4.1, not trunk, so the patch should fix it.
Hope this helps,
Tommaso

[1] : https://issues.apache.org/jira/browse/SOLR-2129


Re: Solr UIMA integration

maheshkumar
Hi Tommaso,

Thanks a lot for uploading the relevant dependency jars. The issue was caused by the Java heap size; I increased the heap and the issue was resolved.
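
For anyone hitting the same problem, a quick way to see how much heap the JVM is actually allowed to use is to query `Runtime`. This is only a sketch: the class name `HeapReport` is hypothetical, and the thread does not say how much heap turned out to be enough. The maximum is normally raised with the standard `-Xmx` JVM option in the container's startup settings.

```java
// Hypothetical diagnostic: prints the heap limits of the running JVM.
class HeapReport {

    // Converts a byte count to whole mebibytes (rounded down).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        // maxMemory() is the upper bound the JVM will try to use for the heap.
        System.out.println("max heap (MB):   " + toMegabytes(runtime.maxMemory()));
        // totalMemory() is the heap currently reserved; freeMemory() is the unused part of it.
        System.out.println("total heap (MB): " + toMegabytes(runtime.totalMemory()));
        System.out.println("free heap (MB):  " + toMegabytes(runtime.freeMemory()));
    }
}
```

Running it with the same JVM options as the container shows whether a changed heap setting has actually taken effect.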

Now I am getting a 403 error while connecting to the http://api.opencalais.com/enlighten/calais.asmx/Enlighten web service. Do I need to register at opencalais.com and get API keys, or how should I go about it?

Thanks
Mahesh

Re: Solr UIMA integration

Tommaso Teofili
Hi Mahesh,

2010/10/1 maheshkumar <[hidden email]>

>
> Thanks a lot for uploading the relevant dependency jars. The issue was
> caused by the Java heap size; I increased the heap and the issue was resolved.
>
>
I am happy it solved your issue.


> Now I am getting a 403 error while connecting to the
> http://api.opencalais.com/enlighten/calais.asmx/Enlighten web service. Do I
> need to register at opencalais.com and get API keys, or how should I go
> about it?
>

Yes, you need to register for both an OpenCalais [1] and an AlchemyAPI [2] key
to use those services, and configure the keys as explained in [3] at point 6.
Hope this helps,
Tommaso

[1] : http://www.opencalais.com/apikey
[2] : http://www.alchemyapi.com/api/register.html
[3] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial

Re: Solr UIMA integration

maheshkumar
Hi Tommaso,

I have registered on both sites and got the API keys.
But I am getting a new error.

Oct 4, 2010 6:15:04 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(405)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:138)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.executeAE(UIMAUpdateRequestProcessor.java:102)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processFieldValue(UIMAUpdateRequestProcessor.java:95)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:59)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.uima.alchemy.digester.exception.ResultDigestingException: org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException: ERROR
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:133)
        ... 31 more
Caused by: org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException: ERROR
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:129)
        ... 31 more
Oct 4, 2010 6:15:04 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(275)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:138)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.executeAE(UIMAUpdateRequestProcessor.java:102)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processFieldValue(UIMAUpdateRequestProcessor.java:95)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:59)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.uima.alchemy.digester.exception.ResultDigestingException: org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException: ERROR
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:133)
        ... 31 more
Caused by: org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException: ERROR
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:129)
        ... 31 more
org.apache.uima.analysis_engine.AnalysisEngineProcessException
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:138)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.executeAE(UIMAUpdateRequestProcessor.java:102)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processFieldValue(UIMAUpdateRequestProcessor.java:95)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:59)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.uima.alchemy.digester.exception.ResultDigestingException: org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException: ERROR
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:133)
        ... 31 more
Caused by: org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException: ERROR
        at org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:129)
        ... 31 more
Oct 4, 2010 6:15:04 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=1
        commit{dir=C:\WorkSpace\Solr\data\Education\index,segFN=segments_1,version=1286196269852,generation=1,filenames=[segments_1]
Oct 4, 2010 6:15:04 PM org.apache.solr.core.SolrDeletionPolicy updateCommits

Thanks
Mahesh

Re: Solr UIMA integration

Tommaso Teofili
Hi Mahesh,
your AlchemyAPI calls are failing: the ERROR status in the trace is returned
by the AlchemyAPI web service itself. Try the same service call outside
Solr/UIMA, for example from their website, to see whether and why it fails
with the text you are trying to enrich.
You can also post here the text field value(s) causing the error and I will
look into it myself; some more information about your Solr environment would
be useful too.
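
When reading a trace like the one above, it can help to jump straight to the innermost "Caused by" entry (here AlchemyCallFailedException: ERROR). A minimal Java sketch of walking the cause chain follows; the class name RootCause and the stand-in exceptions in main are hypothetical and not part of UIMA or solr-uima:

```java
// Sketch: walk a Throwable's cause chain to reach the innermost exception.
final class RootCause {

    /** Returns the deepest cause, or the throwable itself if it has none. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop on a null cause, or on a self-referencing cause, to avoid looping forever.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in exceptions; the real ones would come from the AlchemyAPI annotator.
        Exception inner = new IllegalStateException("ERROR");
        Exception middle = new RuntimeException(inner);
        Exception outer = new RuntimeException("analysis failed", middle);
        System.out.println(RootCause.of(outer));
    }
}
```

Logging only the root cause this way would keep the relevant line (the web service status) from being buried under the servlet container frames.
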
Regards,
Tommaso

2010/10/4 maheshkumar <[hidden email]>

>
> Hi Tommaso,
>
> I have register in the both sites and got the api keys.
> But i am getting a new error.
>

Re: Solr UIMA integration

maheshkumar
Hi Tommaso,

I will try the service call outside Solr/UIMA.

The text I am using is:

FileName: Entity.xml
<add>
<doc>
  <field name="reference">Entity.xml</field>
  <field name="content">Senator Dick Durbin (D-IL)  Chicago , March 3, 2007.</field>
  <field name="title">Entity Extraction</field> 
</doc>
</add>

I am indexing it with curl:
curl http://localhost:8080/solr/update -F solr.body=@Entity.xml

Thanks
Mahesh

Re: Solr UIMA integration

Tommaso Teofili
Hi Mahesh,
the issue here is that you're not sending a <field name="text">...</field>
to Solr, which is where UIMAUpdateRequestProcessor looks for the text to
analyze :)
In fact, by default UIMAUpdateRequestProcessor extracts the text to analyze
from that field and sends its value to a UIMA pipeline.
Of course you could customize this behavior, making
UIMAUpdateRequestProcessor read values from every field being indexed
in the document, or from a different field.

However, this made me realize that in such situations the field value is the
String "null" and not a null object, as I expected; so line 57 in
UIMAUpdateRequestProcessor should be changed as follows to prevent such
errors:
...
if (textFieldValue != null && !"".equals(textFieldValue) &&
!"null".equals(textFieldValue)) {
...
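
As a rough illustration of that check, the condition could be pulled into a small helper. This is only a sketch: TextGuard and hasAnalyzableText are hypothetical names, not part of solr-uima, and the trim() call is an extra assumption beyond the condition shown above.

```java
// Sketch of the guard: skip analysis when the field is missing, empty,
// or carries the literal string "null" instead of a real value.
final class TextGuard {

    /** Returns true only when the field value holds text worth sending to the UIMA pipeline. */
    static boolean hasAnalyzableText(Object fieldValue) {
        if (fieldValue == null) {
            return false; // field not present in the document
        }
        // Assumption: surrounding whitespace is not meaningful, so it is trimmed first.
        String text = String.valueOf(fieldValue).trim();
        return !text.isEmpty() && !"null".equals(text);
    }

    public static void main(String[] args) {
        System.out.println(hasAnalyzableText("null"));
        System.out.println(hasAnalyzableText("Senator Dick Durbin (D-IL)"));
    }
}
```

Keeping the test in one place would make it easier to reuse if the processor is later customized to read from several fields.
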
Hope this helps,
Tommaso

2010/10/6 maheshkumar <[hidden email]>

>
> Hi Tommaso,
>
> I will try the service call outside Solr/UIMA.
>
> And the text i am using is
>
> FileName: Entity.xml
> <add>
> <doc>
>  <field name="reference">Entity.xml</field>
>  <field name="content">Senator Dick Durbin (D-IL)  Chicago , March 3,
> 2007.</field>
>  <field name="title">Entity Extraction</field>
> </doc>
> </add>
>
> and using curl to index it curl http://localhost:8080/solr/update -F
> solr.body=@Entity.xml
>
> Thanks
> Mahesh

Re: Solr UIMA integration

maheshkumar
Hi Tommaso,

Thanks a lot, I am now able to index the content and extract the entities as you described.
I changed the XML content to this:

<add>
<doc>
 <field name="reference">Entity.xml</field>
 <field name="text">Senator Dick Durbin (D-IL)  Chicago , March 3,2007.</field>
 <field name="title">Entity Extraction</field>
</doc>
</add> 

and it worked.

For the benefit of others, the procedure I followed is:
Step 1:

Get these dependency jars:
AlchemyAPIAnnotator.jar
commons-beanutils-1.7.0.jar
commons-digester-2.0.jar
commons-lang-2.4.jar
OpenCalaisAnnotator.jar
slf4j-api-1.5.5.jar
slf4j-jdk14-1.5.5.jar
solr-uima.jar
Tagger.jar
uima-core.jar
WhitespaceTokenizer.jar


Their sources are:
AlchemyAPIAnnotator: http://svn.apache.org/repos/asf/uima/sandbox/trunk/AlchemyAPIAnnotator
OpenCalaisAnnotator: http://svn.apache.org/repos/asf/uima/sandbox/trunk/OpenCalaisAnnotator
Tagger: http://svn.apache.org/repos/asf/uima/sandbox/trunk/Tagger
WhitespaceTokenizer: http://svn.apache.org/repos/asf/uima/sandbox/trunk/WhitespaceTokenizer
solr-uima: http://solr-uima.googlecode.com/svn/trunk/solr-uima


Step 2:
Register at http://www.opencalais.com/apikey and http://www.alchemyapi.com/api/register.html to get the API keys.

Step 3: as described by Tommaso in http://code.google.com/p/solr-uima/wiki/5MinutesTutorial
Modify your schema.xml, adding the following fields:
<field name="language" type="string" indexed="true" stored="true" required="false"/>
<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="keyword" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="suggested_category" type="string" indexed="true" stored="true" multiValued="false" required="false"/>
<field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />
<dynamicField name="entity*" type="text" indexed="true" stored="true" />

<field name="text" type="text" indexed="true" stored="true"/>
<field name="reference" type="string" indexed="true" stored="true" required="true" />
<field name="title" type="text" indexed="true" stored="true" multiValued="false"/>


Modify your solrconfig.xml, adding the UIMA configuration as follows:
<uimaConfig>
  <runtimeParameters>
    <keyword_apikey>VALID_ALCHEMYAPI_KEY</keyword_apikey>
    <concept_apikey>VALID_ALCHEMYAPI_KEY</concept_apikey>
    <lang_apikey>VALID_ALCHEMYAPI_KEY</lang_apikey>
    <cat_apikey>VALID_ALCHEMYAPI_KEY</cat_apikey>
    <entities_apikey>VALID_ALCHEMYAPI_KEY</entities_apikey>
    <oc_licenseID>VALID_OPENCALAIS_KEY</oc_licenseID>
  </runtimeParameters>
</uimaConfig>

<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Replace your existing default update request handler (<requestHandler name="/update"...) with the following:
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">uima</str>
  </lst>
</requestHandler>


Step 4:
Increase the Tomcat heap size: set JAVA_OPTS=%JAVA_OPTS% -Xmx256m on Windows, or export JAVA_OPTS="$JAVA_OPTS -Xmx256m" on Linux.

Step 5:
Index some sample data.

File name: Entity.xml
<add>
<doc>
 <field name="reference">Entity.xml</field>
 <field name="text">Senator Dick Durbin (D-IL)  Chicago , March 3,2007.</field>
 <field name="title">Entity Extraction</field>
</doc>
</add> 

Use curl to index it:
curl http://127.0.0.1:8080/solr/update -F solr.body=@Entity.xml
followed by a commit:
http://127.0.0.1:8080/solr/update?stream.body=<commit/>

and you are done.
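
For anyone who prefers to generate the Entity.xml payload from code rather than write it by hand, here is a minimal Java sketch. The class SolrAddXml and its methods are hypothetical helpers, not part of Solr or solr-uima; it only builds the <add><doc> string, with the text to analyze in the "text" field, and leaves posting it to /update to curl as above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: build a Solr <add><doc> payload from field name/value pairs.
final class SolrAddXml {

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")   // must come first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds one <add><doc>...</doc></add> message, keeping the map's iteration order. */
    static String build(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(field.getKey())).append("\">")
               .append(escape(field.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("reference", "Entity.xml");
        // The UIMA processor reads from "text" by default, so the content goes there.
        fields.put("text", "Senator Dick Durbin (D-IL)  Chicago , March 3,2007.");
        fields.put("title", "Entity Extraction");
        System.out.println(build(fields));
    }
}
```

The output can be saved as Entity.xml and indexed with the same curl command.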

Tommaso, thanks a lot once again for all your support. Please add any steps I may have missed.

Thanks
Mahesh