Solr on HDInsight to write to Azure Data Lake

Abhi Basu
MS Azure does not support Solr 4.9 on HDI, so I am posting here. I would
like to write index collection data to HDFS (hosted on ADL).

Note: I am able to reach ADL from the hadoop fs command line, so Hadoop is
configured correctly to reach ADL:
hadoop fs -ls adl://
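
For example, the same check works against the full store URI that appears in
the config below:

hadoop fs -ls adl://esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/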

This is what I have done so far:
1. Copied all required jars to the Solr ext lib folder:
sudo cp -f /usr/hdp/current/hadoop-client/*.jar
/usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar
/usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar
/usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar
/usr/hdp/current/solr/example/lib/ext
sudo cp -f
/usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar
/usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar
/usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar
/usr/hdp/current/solr/example/lib/ext

This includes the Azure Data Lake jars as well.
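
As a sanity check that the copied jars really contain the ADL filesystem
classes, a hedged sketch (assumes unzip is installed; the path is the ext
dir from step 1):

for j in /usr/hdp/current/solr/example/lib/ext/*.jar; do
  # list each jar's contents and flag any that package the ADL classes
  unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/fs/adl' && echo "ADL classes in: $j"
done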

2. Edited the solrconfig.xml file for my collection:

<dataDir>${solr.core.name}/data/</dataDir>

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">adl://esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/solr/</str>
  <str name="solr.hdfs.confdir">/usr/hdp/2.6.2.25-1/hadoop/conf</str>
  <str name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true}</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
</directoryFactory>
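
The collection was then created against this config; a hedged sketch of the
Collections API call (host, port, and config name are assumptions here; the
shard and replica counts match the errors below):

curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=ems-collection&numShards=2&replicationFactor=2&collection.configName=ems-collection'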


When this collection is deployed to Solr, I see this error message:

<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">2189</int>
</lst>
<lst name="failure">
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard2_replica2': Unable to create core: ems-collection_shard2_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard2_replica1': Unable to create core: ems-collection_shard2_replica1 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard1_replica1': Unable to create core: ems-collection_shard1_replica1 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard1_replica2': Unable to create core: ems-collection_shard1_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
</lst>
</response>


Has anyone done this and can help me out?

Thanks,

Abhi


--
Abhi Basu

Re: Solr on HDInsight to write to Azure Data Lake

Rick Leir-2
Abhi
Check your lib directives.
https://lucene.apache.org/solr/guide/6_6/lib-directives-in-solrconfig.html#lib-directives-in-solrconfig

I suspect your jars are not in a lib dir mentioned in solrconfig.xml.
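
One hedged way to confirm: when a <lib> directive resolves, Solr's resource
loader logs each jar it adds at startup, so grepping the Solr log for those
lines shows what was actually picked up (the log path below is a guess;
adjust for your install):

grep -i 'to classloader' /usr/hdp/current/solr/example/logs/solr.log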
Cheers -- Rick

--
Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: Solr on HDInsight to write to Azure Data Lake

Abhi Basu
I'll try it out.

Thanks

Abhi


Re: Solr on HDInsight to write to Azure Data Lake

Abhi Basu
Adding this to solrconfig.xml did not work. I put all the Azure and Hadoop
jars in the ext folder.

<lib dir="../../../example/lib/ext" regex=".*\.jar" />

Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found
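
For reference, a hedged way to hunt for any jar on the node that actually
packages the missing class (assumes unzip is installed; HdiAdlFileSystem
looks like an HDInsight-specific wrapper, so it may not ship in the stock
hadoop-client jars at all):

find /usr/hdp /usr/lib -name '*.jar' 2>/dev/null | while read -r j; do
  # flag any jar whose listing mentions the missing class
  unzip -l "$j" 2>/dev/null | grep -q 'HdiAdlFileSystem' && echo "$j"
done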

Thanks,

Abhi


--
Abhi Basu

Re: Solr on HDInsight to write to Azure Data Lake

Erick Erickson
Several things:

1> I often start with an absolute path; knowing the exact relative
path from where Solr starts can be confusing. If you've pathed
properly and the jar file is in the path, it'll be found. (See the
sketch after this list.)

2> Are you sure HdiAdlFileSystem is in one of the jars?

3> Did you restart the JVM?
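
For point 1>, a hedged way to see what the relative <lib> dir resolves to
from a core's instance dir (the instance dir below is a placeholder;
substitute a real one):

readlink -f /usr/hdp/current/solr/example/solr/ems-collection_shard1_replica1/../../../example/lib/ext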

Best,
Erick


Re: Solr on HDInsight to write to Azure Data Lake

Abhi Basu
Yes, I copied the jars to all nodes and restarted the Solr service.



<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">4212</int>
</lst>
<lst name="failure">
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection-700_shard1_replica2': Unable to create core: ems-collection-700_shard1_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection-700_shard2_replica1': Unable to create core: ems-collection-700_shard2_replica1 Caused by: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.azure.NativeAzureFileSystem not a subtype</str>
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection-700_shard2_replica2': Unable to create core: ems-collection-700_shard2_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
  <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection-700_shard1_replica1': Unable to create core: ems-collection-700_shard1_replica1 Caused by: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.azure.NativeAzureFileSystem not a subtype</str>
</lst>
</response>

Here is an excerpt from the logs:

ERROR - 2018-03-26 18:09:45.033; org.apache.solr.core.CoreContainer; Unable to create core: ems-collection-700_shard2_replica1
org.apache.solr.common.SolrException: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.azure.NativeAzureFileSystem not a subtype
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:868)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:643)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:556)
  at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:569)
  at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:198)
  at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:187)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
  at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
  at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
  at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
  at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
  at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
  at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
  at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
  at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
  at org.eclipse.jetty.server.Server.handle(Server.java:368)
  at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
  at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
  at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
  at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
  at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
  at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
  at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
  at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
  at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.azure.NativeAzureFileSystem not a subtype
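
A "Provider ... not a subtype" ServiceConfigurationError usually means two
different versions of hadoop-common ended up on the classpath: the
ServiceLoader finds NativeAzureFileSystem through one jar's
META-INF/services entry, but the FileSystem base class it extends was
loaded from a different jar. A hedged check for competing copies (Solr 4.x
bundles its own Hadoop jars inside its webapp, so mixing in the HDP 2.6
copies can trigger exactly this; paths below are guesses from the post):

# hadoop-common copies dropped into the ext dir
ls /usr/hdp/current/solr/example/lib/ext/ | grep -i hadoop-common
# hadoop-common copies Solr itself ships inside its webapp
find /usr/hdp/current/solr -path '*WEB-INF/lib*' -name 'hadoop-common*.jar' 2>/dev/null

If those turn up different versions, they need to be reconciled.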





--
Abhi Basu

Re: Solr on HDInsight to write to Azure Data Lake

Rick Leir-2
Hi,
The class that is not found is likely in the Azure-related libraries. As Erick said, are you sure you have a library containing it?
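
Since hadoop fs -ls adl:// works, the class must already be somewhere on
Hadoop's own classpath, so one hedged way to find the jars Hadoop pulls the
ADL/Azure support from (--glob expands the wildcard entries; available in
Hadoop 2.6+):

hadoop classpath --glob | tr ':' '\n' | grep -i -e adl -e azure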
Cheers
Rick
--
Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: Solr on HDInsight to write to Azure Data Lake

Abhi Basu
Yes, for the life of me, I cannot find info on the Azure Data Lake jars, and
MS has not been much help either.

Maybe they don't want us to use Solr on ADLS.

Thanks,

Abhi

--
Abhi Basu