Questions regarding IT search solution

Questions regarding IT search solution

Silent Surfer
Hi,
I am new to the Lucene forum and this is my first question. I need a clarification from you.
Requirement:
------------------
1. Build an IT search tool for logs, similar to Splunk (only with respect to searching logs, not reporting, graphs, etc.), using Solr/Lucene. The log files are mainly server logs such as JBoss and custom application server logs (which may or may not be log4j logs), and file sizes can potentially go up to 100 MB.
2. The logs are spread across multiple servers (25 to 30 servers).
3. Capability to search in almost real time.
4. Support distributed search.

Our search criteria can be based on a keyword, a timestamp, an IP address, etc.
Can anyone shed some light on whether Solr/Lucene is the right solution for this?
I would appreciate any quick help in this regard.
Thanks, Surfer
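
For illustration, the search criteria above suggest extracting at least a timestamp, an IP address, and free text from each log line before it is indexed. Below is a minimal Python sketch, assuming a log4j-style layout with timestamps in UTC. The pattern, the sample layout, and the field names (`host`, `timestamp`, `level`, `ip`, `message`) are hypothetical and not taken from any real JBoss configuration.

```python
import re
from datetime import datetime

# Hypothetical log4j-style layout (an assumption, adjust to the real format):
#   2009-06-02 17:45:01,123 ERROR [10.0.0.7] Connection refused
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]\s+"
    r"(?P<message>.*)$"
)


def parse_line(line, host):
    """Turn one log line into a dict of searchable fields, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Assumes the log timestamps are already in UTC; milliseconds are dropped.
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    return {
        "host": host,
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": m.group("level"),
        "ip": m.group("ip"),
        "message": m.group("message"),
    }
```

Each resulting dict could then be sent to the index as one document, so that keyword, timestamp, and IP searches each hit a dedicated field. Lines that do not match (stack traces, for example) return None and would need separate handling.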



Thanks,Tiru



Re: Questions regarding IT search solution

Silent Surfer
Hi,
Any help or pointers on the following message would really help me.
Thanks, Surfer

--- On Tue, 6/2/09, Silent Surfer <[hidden email]> wrote:

From: Silent Surfer <[hidden email]>
Subject: Questions regarding IT search solution
To: [hidden email]
Date: Tuesday, June 2, 2009, 5:45 PM


Re: Questions regarding IT search solution

Walter Underwood, Netflix
In reply to this post by Silent Surfer
Why build one? Don't those already exist?

Personally, I'd start with Hadoop instead of Solr. Putting logs in a
search index is guaranteed to not scale. People were already trying
different approaches ten years ago.

wunder

On 6/4/09 8:41 AM, "Silent Surfer" <[hidden email]> wrote:


Re: Questions regarding IT search solution

Alexandre Rafalovitch
I would also be interested to know what other solutions exist.

Splunk's advantage is that it does extraction of the fields with
advanced searching functionality (it has lexers/parsers for multiple
content types). I believe that is the Solr function desired in the
original posting. At the time they came out (2004), I was not aware of
any good open source solutions to do what they did. And I would have
loved one, as I was analyzing multi-gigabyte logs.

Hadoop might be a way to process the files, but what would do the
indexing and searching?

Regards,
    Alex.

On Thu, Jun 4, 2009 at 11:56 AM, Walter Underwood<[hidden email]> wrote:


Re: Questions regarding IT search solution

Silent Surfer
In reply to this post by Silent Surfer
Hi,
As Alex correctly pointed out, my main intention is to figure out whether Solr/Lucene offers the functionality to replicate what Splunk does in terms of building indexes etc. to enable search.
We evaluated Splunk, but it is not a very cost-effective solution for us: we may have logs running into a few GB per day, since there can be around 25-30 servers running, and Splunk's licensing model is based on the size of logs per day, with the license valid for only 1 year.
With this background, any further input on this is greatly appreciated.
Thanks, Surfer

--- On Thu, 6/4/09, Alexandre Rafalovitch <[hidden email]> wrote:

From: Alexandre Rafalovitch <[hidden email]>
Subject: Re: Questions regarding IT search solution
To: [hidden email]
Date: Thursday, June 4, 2009, 9:27 PM


Re: Questions regarding IT search solution

Otis Gospodnetic-2

My guess is Solr/Lucene would work.  Not sure how well/fast, but it would, especially if you avoid range queries (or use tdate), and especially if you shard/segment indices smartly, so that at query time you send the query (or distribute it, if you have to) only to those shards that have the data (if your query is for a limited time period).
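
To make the sharding idea concrete, here is a minimal Python sketch, assuming one index shard per day. The shard URL pattern, the host names, and the `timestamp` and `message` field names are hypothetical. The `shards` request parameter and the `[start TO end]` range syntax are how Solr distributed search and range queries are normally expressed, but check both against the Solr version you deploy.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def shards_for_range(start, end, pattern="logs-{d}.example.com:8983/solr"):
    """List only the daily shards that can hold data between start and end (inclusive)."""
    days = (end - start).days
    return [pattern.format(d=(start + timedelta(n)).isoformat()) for n in range(days + 1)]


def build_query(keyword, start, end):
    """Build request parameters that restrict a query to the relevant shards and time span."""
    # Range filter on a date field; a trie-based date type (tdate) keeps this cheap.
    time_filter = "timestamp:[{}T00:00:00Z TO {}T23:59:59Z]".format(
        start.isoformat(), end.isoformat()
    )
    return urlencode({
        "q": "message:" + keyword,
        "fq": time_filter,
        # Only the shards covering the requested days are queried.
        "shards": ",".join(shards_for_range(start, end)),
    })
```

A query for three days then touches three shards instead of the whole archive, which is the point of partitioning the index by time.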

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----

> From: Silent Surfer <[hidden email]>
> To: [hidden email]
> Sent: Thursday, June 4, 2009 5:52:21 PM
> Subject: Re: Questions regarding IT search solution

Re: Questions regarding IT search solution

Silent Surfer
In reply to this post by Silent Surfer
Hi,
It is encouraging to know that a Solr/Lucene solution may work.
Can anyone who is using Solr/Lucene for such a scenario confirm that the solution is in use and working fine? That would be really helpful, as I started looking into Solr/Lucene only a couple of days ago, and it may be difficult to be 100% confident before proposing the solution approach in the next couple of days.
Thanks, Surfer

--- On Thu, 6/4/09, Otis Gospodnetic <[hidden email]> wrote:

From: Otis Gospodnetic <[hidden email]>
Subject: Re: Questions regarding IT search solution
To: [hidden email]
Date: Thursday, June 4, 2009, 10:26 PM



Re: Questions regarding IT search solution

Jeff Hammerbacher
Hey,

Your system sounds similar to the work done by Stu Hood at Rackspace in their
Mailtrust unit. See
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
for more details and inspiration.

Regards,
Jeff

On Thu, Jun 4, 2009 at 4:58 PM, <[hidden email]> wrote:


Re: Questions regarding IT search solution

Silent Surfer
In reply to this post by Silent Surfer
Hi Jeff,
Thanks for the link. You are my lifesaver :) This is very similar to what I am looking for.
Thanks, Surfer

--- On Fri, 6/5/09, Jeff Hammerbacher <[hidden email]> wrote:

From: Jeff Hammerbacher <[hidden email]>
Subject: Re: Questions regarding IT search solution
To: [hidden email], [hidden email]
Date: Friday, June 5, 2009, 12:15 AM
