Connecting Solr to Nutch

Timeka Cobb
Hello out there! I'm trying to create a small search engine and have
installed Nutch 1.15 and Solr 7.5.0. The issue now is connecting the two,
primarily because the files required to create the Nutch core in Solr
don't exist, i.e. basicconfig. How do I go about connecting the two so I can
begin crawling websites for the engine? Please help 😊

💗💗,
Timeka Cobb

Re: Connecting Solr to Nutch

Jan Høydahl / Cominvent
This is more a question for the Nutch community to answer.
Googling, I found a tutorial that seems fairly up to date (2018-09-10); perhaps try following that one?
https://wiki.apache.org/nutch/NutchTutorial#Setup_Solr_for_search

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 5 Oct 2018 at 03:53, Timeka Cobb <[hidden email]> wrote:
>
> Hello out there! I'm trying to create a small search engine and have
> installed Nutch 1.15 and Solr 7.5.0. The issue now is connecting the two,
> primarily because the files required to create the Nutch core in Solr
> don't exist, i.e. basicconfig. How do I go about connecting the two so I can
> begin crawling websites for the engine? Please help 😊
>
> 💗💗,
> Timeka Cobb

Re: Connecting Solr to Nutch

Timeka Cobb
Good morning! The Nutch community doesn't help much. The problem I notice
is in the part where they say to set up Solr: the first step is to create
resources, but the basicconfig file does not exist at all in the Solr
package. I can't connect because Solr is missing files that are required in
the setup process. Maybe I should try to install this in the Nutch
directory. I don't know, but I'm going to figure it out. Thank you for your
help 😊

On Fri, Oct 5, 2018, 3:36 AM Jan Høydahl <[hidden email]> wrote:

> This is more a question for the Nutch community to answer.
> Googling, I found a tutorial that seems fairly up to date (2018-09-10);
> perhaps try following that one?
> https://wiki.apache.org/nutch/NutchTutorial#Setup_Solr_for_search
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com

Re: Connecting Solr to Nutch

Shawn Heisey-2
On 10/5/2018 7:24 AM, Timeka Cobb wrote:
> Good morning! The Nutch community doesn't help much. The problem I notice
> is in the part where they say to set up Solr: the first step is to create
> resources, but the basicconfig file does not exist at all in the Solr
> package. I can't connect because Solr is missing files that are required in
> the setup process. Maybe I should try to install this in the Nutch
> directory. I don't know, but I'm going to figure it out. Thank you for your
> help 😊

The config example included with Solr that is considered "default" used
to be called basic_configs.  In more recent versions, it has been
renamed to _default. The name does include the leading underscore as I
have written it.
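
To check which of the two names a given install uses, a minimal Python
sketch follows. It assumes the stock layout, where configsets live under
server/solr/configsets inside the Solr directory; the function name is
hypothetical and only for illustration.

```python
from pathlib import Path
from typing import Optional

# Names the default configset has shipped under, newest first.
DEFAULT_CONFIGSET_NAMES = ("_default", "basic_configs")


def find_default_configset(solr_home: str) -> Optional[Path]:
    """Return the default configset directory, or None if neither name exists.

    Assumes configsets live under <solr_home>/server/solr/configsets.
    """
    configsets = Path(solr_home) / "server" / "solr" / "configsets"
    for name in DEFAULT_CONFIGSET_NAMES:
        candidate = configsets / name
        if candidate.is_dir():
            return candidate
    return None
```

Called with the directory Solr 7.5.0 was unpacked into, this should return
the path ending in _default; on an older install it would be expected to
return the one ending in basic_configs.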

Nutch should not be relying on example configs included with Solr. 
Those can easily change in new versions to be something that's not
compatible with their software.  They should be completely supplying the
entire configuration (the "nutch" configset).  This includes the schema
and solrconfig.xml, as well as any other config files referenced by
those two.  Different configs for different Solr versions might become
necessary ... they will need to be prepared for that.
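
Until such a configset is supplied, one way to put a "nutch" configset
together by hand is sketched below in Python. It is a sketch under stated
assumptions, not an official procedure: it copies Solr's _default configset
as a starting point (which is exactly the dependency described above, so
treat it as a stopgap), it assumes Nutch 1.15 keeps its Solr schema at
conf/schema.xml, and it assumes managed-schema should be removed from the
copy so that schema.xml is the only schema file present. The function name
and both directory arguments are hypothetical.

```python
import shutil
from pathlib import Path


def build_nutch_configset(solr_home: str, nutch_home: str) -> Path:
    """Create <solr_home>/server/solr/configsets/nutch and return its path."""
    configsets = Path(solr_home) / "server" / "solr" / "configsets"
    source = configsets / "_default"   # default configset in recent Solr
    target = configsets / "nutch"      # name chosen for the new configset
    schema = Path(nutch_home) / "conf" / "schema.xml"  # assumed location

    if not source.is_dir():
        raise FileNotFoundError(f"no _default configset at {source}")
    if not schema.is_file():
        raise FileNotFoundError(f"no Nutch schema at {schema}")

    # Start from a full copy so solrconfig.xml and the files it references
    # exist. copytree raises FileExistsError if the target is already there.
    shutil.copytree(source, target)

    # Put the Nutch schema in place and drop the managed schema from the copy.
    shutil.copy2(schema, target / "conf" / "schema.xml")
    managed = target / "conf" / "managed-schema"
    if managed.exists():
        managed.unlink()
    return target
```

With Solr running, a core could then be created from the result with
something like "bin/solr create -c nutch -d <returned path>/conf". The
original _default directory is left untouched.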

Thanks,
Shawn

Re: Connecting Solr to Nutch

Timeka Cobb
Thank you so very much for the help!!

On Fri, Oct 5, 2018 at 10:53 AM Shawn Heisey <[hidden email]> wrote:

> The config example included with Solr that is considered "default" used
> to be called basic_configs.  In more recent versions, it has been
> renamed to _default. The name does include the leading underscore as I
> have written it.
>
> Nutch should not be relying on example configs included with Solr.
> Those can easily change in new versions to be something that's not
> compatible with their software.  They should be completely supplying the
> entire configuration (the "nutch" configset).  This includes the schema
> and solrconfig.xml, as well as any other config files referenced by
> those two.  Different configs for different Solr versions might become
> necessary ... they will need to be prepared for that.
>
> Thanks,
> Shawn