Exception during integration of Solr with UIMA

aruninfo100
Hi All,

I am trying to integrate UIMA with Solr, following the steps described at https://cwiki.apache.org/confluence/display/solr/UIMA+Integration. However, when I try to index documents, exceptions are thrown in the terminal and error traces are written to the Solr log. I have been trying to work around this for some time, but I have not been able to find a proper solution.

I am using Solr 6.1.0.

I have included all the jars mentioned in the document.

Field to analyze:

         <arr name="fields">
          <str>content</str>
         </arr>
This field holds the text content extracted from each indexed document (some of the documents are large).
The content field is of field type text_general. It is not a copy field.
          <field name="content" type="text_general" indexed="true" termOffsets="true" stored="true"
          termPositions="true" termVectors="true" multiValued="true" required="true"/>

I have also created the three fields in the config file, as described in the document.

I have generated valid keys for the APIs, and an Internet connection is available.

solrconfig.xml:

<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
        <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
      </lst>
      <str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>

      <bool name="ignoreErrors">true</bool>
      <str name="logField">fileName</str>

      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>content</str>
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
          <lst name="mapping">
            <str name="feature">text</str>
            <str name="field">concept</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
          <lst name="mapping">
            <str name="feature">language</str>
            <str name="field">language</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">org.apache.uima.SentenceAnnotation</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">sentence</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>


<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>



terminal error trace:

Mar 19, 2017 10:46:16 AM WhitespaceTokenizer typeSystemInit
INFO: "Whitespace tokenizer typesystem initialized"
Mar 19, 2017 10:46:16 AM WhitespaceTokenizer process
INFO: "Whitespace tokenizer starts processing"
Mar 19, 2017 10:46:16 AM WhitespaceTokenizer process
INFO: "Whitespace tokenizer finished processing"
Mar 19, 2017 10:46:16 AM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(405)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
        at org.apache.uima.annotator.calais.OpenCalaisAnnotator.process(OpenCalaisAnnotator.java:206)
        at org.apache.uima.analysis_component.CasAnnotator_ImplBase.process(CasAnnotator_ImplBase.java:56)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:176)
        at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:78)
        at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:274)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:239)
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:157)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
        at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
        at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:518)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: api.opencalais.com
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
        at sun.net.www.http.HttpClient.New(HttpClient.java:308)
        at sun.net.www.http.HttpClient.New(HttpClient.java:326)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLC

.....
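
The root cause in the trace above is `java.net.UnknownHostException: api.opencalais.com`, which means the JVM could not resolve that host name. As a rough way to check this outside Solr, here is a minimal sketch that tries to resolve the host from a plain Java program and prints the standard JVM proxy properties. The class name `HostCheck` and the method `canResolve` are hypothetical names chosen for illustration and are not part of Solr or UIMA. The sketch only tests DNS resolution from the machine it runs on; it does not confirm whether the service is still available or whether the API key is valid.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not part of Solr or UIMA.
public class HostCheck {

    /** Returns true if this JVM can resolve the given host name to an IP address. */
    public static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            // Same exception type as the "Caused by" line in the trace.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the host named in the stack trace; another host can be passed as an argument.
        String host = args.length > 0 ? args[0] : "api.opencalais.com";
        System.out.println(host + " resolvable: " + canResolve(host));

        // Standard JVM proxy system properties; "null" means the property is not set.
        String[] keys = {"http.proxyHost", "http.proxyPort", "https.proxyHost", "https.proxyPort"};
        for (String key : keys) {
            System.out.println(key + " = " + System.getProperty(key));
        }
    }
}
```

If this prints `resolvable: false` on the machine that runs Solr, the problem is probably in name resolution or network configuration (DNS, proxy, firewall), or the host name no longer exists, rather than in the solrconfig.xml shown above. This is an assumption based only on the exception type.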

solr.log:

2017-03-19 05:41:24.466 WARN  (qtp1389647288-13) [   x:star] o.a.s.u.p.UIMAUpdateRequestProcessor skip the text processing due to null. id=3aedc166-c9ad-4b30-8bcb-d27177d2ae16,  text="nullget acquainted with ams application release readiness  confidential – not for distribution    1 ..."
2017-03-19 05:41:24.492 INFO  (qtp1389647288-13) [   x:star] o.a.s.u.p.LogUpdateProcessorFactory [star]  webapp=/solr path=/update params={wt=javabin&version=2}{add=[3aedc166-c9ad-4b30-8bcb-d27177d2ae16 (1562275568121020416)]} 0 12088
2017-03-19 05:41:39.493 INFO  (commitScheduler-10-thread-1) [   x:star] o.a.s.u.DirectUpdateHandler2 start
...

Kindly let me know what I am doing wrong here and what a possible solution might be. I have already spent a lot of time trying to fix this issue.

Thanks