Basic conceptual questions about solr

Basic conceptual questions about solr

Shaun McArthur
I'm looking for a Google Search Appliance lookalike. We have a file share with thousands of documents in a hierarchy that makes it ridiculously difficult to locate anything.

Here are some basic questions:

Is the idea to install Solr on separate hardware and have it crawl the file system?
Can crawls be scheduled?
If installed on a remote server, can it be configured to insert users' local content in search results?
I assume that once it's functioning, users go to a web page for results?

Appreciate any input, and I have started to RTFJavadocs :)
Shaun


Shaun McArthur
Dir. Technical Operations
Autodata Solutions
Mobile : (226) 268-6458
Skype :shaun-mcarthur

Re: Basic conceptual questions about solr

Jan Høydahl / Cominvent
Hi,

You can place Solr wherever you want, but if your data is very large, you'd want a dedicated box.

Have a look at DIH (http://wiki.apache.org/solr/DataImportHandler). It can crawl a file share periodically, indexing only files changed since a timestamp (e.g. NOW-1HOUR), and extract the text using Tika.
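
To make the "only files changed since a timestamp" idea concrete, here is a minimal Python sketch of the selection step. It is not DIH itself, just an illustration of the same filter done by hand. The function name `files_changed_since` and the mount point `/mnt/fileshare` are hypothetical, and the one-hour window only mirrors the NOW-1HOUR example above.

```python
import os
import time


def files_changed_since(root, cutoff_epoch):
    """Yield paths under `root` whose modification time is newer than `cutoff_epoch`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) > cutoff_epoch:
                    yield path
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue


if __name__ == "__main__":
    one_hour_ago = time.time() - 3600  # rough equivalent of NOW-1HOUR
    # "/mnt/fileshare" is a hypothetical mount point for the file share.
    for changed in files_changed_since("/mnt/fileshare", one_hour_ago):
        print(changed)
```

Run from a scheduler such as cron, a script like this would pick up only recently modified files on each pass. Text extraction and indexing would still have to follow as separate steps.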

However, if you require security, have a look at LCF (http://incubator.apache.org/connectors/), which adds security but may lack a powerful file crawler.

You choose how the results are presented back to the user, but normally it's a traditional web page with links that, when clicked, point to the resource in some way.
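
As a rough sketch of such a results page, the Python below queries Solr's select handler for JSON and turns the hits into an HTML list of links. The host, port and path in `SOLR_SELECT_URL` are assumptions, as are the field names: `id` is assumed to hold the document's path or URL, and `title` is a hypothetical field that may not exist in your schema.

```python
import html
import json
import urllib.parse
import urllib.request

SOLR_SELECT_URL = "http://localhost:8983/solr/select"  # assumed host, port and path


def build_query_url(user_query, rows=10):
    """Build a select URL asking Solr for a JSON response."""
    params = urllib.parse.urlencode({"q": user_query, "wt": "json", "rows": rows})
    return f"{SOLR_SELECT_URL}?{params}"


def render_results(docs):
    """Turn a list of result documents into an HTML list of links."""
    items = []
    for doc in docs:
        # "id" is assumed to be the file path or URL; "title" is a hypothetical field.
        href = html.escape(str(doc.get("id", "")), quote=True)
        label = html.escape(str(doc.get("title", doc.get("id", ""))))
        items.append(f'<li><a href="{href}">{label}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def search(user_query):
    """Query Solr and return the hits as an HTML fragment."""
    with urllib.request.urlopen(build_query_url(user_query)) as response:
        payload = json.load(response)
    return render_results(payload["response"]["docs"])
```

Escaping both the link target and the label matters here, because file names on a share can contain characters such as `&` or `<`.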

As for users' local content: what is that? It sounds like you want to hook into a local search on the laptop, like Google does. To do that you'd have to develop a local service sitting in the system tray on each computer, exposing some API on some port. Then, when a user searches your search portal, e.g. search.mycompany.com/?q=foo, the GUI uses some AJAX to reach out to the local search service and merge its hits into the results...
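
A minimal sketch of what such a local service could look like, in Python. Everything here is illustrative: the port 8765, the `LOCAL_DOCS` dictionary standing in for a real local index, and the allowed origin are all assumptions, and a real system-tray service would need a proper index and some thought about security.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-in for a real local index: file path -> extracted text.
LOCAL_DOCS = {
    "C:/Users/me/notes/foo-plan.txt": "planning notes about foo",
    "C:/Users/me/notes/budget.txt": "quarterly budget",
}


def search_local(query):
    """Return paths of local documents whose text contains the query (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return []
    return [path for path, text in LOCAL_DOCS.items() if needle in text.lower()]


class LocalSearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        body = json.dumps({"query": query, "results": search_local(query)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # Lets the portal page read the response in the browser; the origin is an assumption.
        self.send_header("Access-Control-Allow-Origin", "http://search.mycompany.com")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8765):
    """Serve on loopback only, so other machines cannot reach the service."""
    HTTPServer(("127.0.0.1", port), LocalSearchHandler).serve_forever()
```

With `serve()` running, the portal page could request `http://127.0.0.1:8765/?q=foo` from the browser and merge the returned paths into the Solr hits on the client side.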

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 19 Aug 2010, at 21:31, Shaun McArthur wrote:

> I'm looking for a Google Search Appliance lookalike. We have a file share with thousands of documents in a hierarchy that makes it ridiculously difficult to locate anything.
>
> Here are some basic questions:
>
> Is the idea to install Solr on separate hardware and have it crawl the file system?
> Can crawls be scheduled?
> If installed on a remote server, can it be configured to insert users' local content in search results?
> I assume that once it's functioning, users go to a web page for results?
>
> Appreciate any input, and I have started to RTFJavadocs :)
> Shaun
>
>
> Shaun McArthur
> Dir. Technical Operations
> Autodata Solutions
> Mobile : (226) 268-6458
> Skype :shaun-mcarthur
>

RE: Basic conceptual questions about solr

Shaun McArthur
Very useful - thanks very much. I'll have a look at DIH too.


Best,
Shaun



-----Original Message-----
From: Jan Høydahl / Cominvent [mailto:[hidden email]]
Sent: Thursday, August 19, 2010 8:02 PM
To: [hidden email]
Subject: Re: Basic conceptual questions about solr