Splunk + Hadoop

Splunk + Hadoop

Shreya.Pal
Hi,

Has anyone used Hadoop and Splunk together, or any other real-time processing tool on top of Hadoop?

Regards,
Shreya



This e-mail and any files transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information. If you are not the intended recipient(s), please reply to the sender and destroy all copies of the original message. Any unauthorized review, use, disclosure, dissemination, forwarding, printing or copying of this email, and/or any action taken in reliance on the contents of this e-mail is strictly prohibited and may be unlawful.

Re: Splunk + Hadoop

Russell Jurney
I'm playing with using Hadoop and Pig to load MongoDB with data for Cube to
consume. Cube <https://github.com/square/cube/wiki> is a realtime tool...
but we'll be replaying events from the past.  Does that count?  It is nice
to batch backfill metrics into 'real-time' systems in bulk.
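
A minimal sketch of the backfill idea described above, assuming each historical record becomes an event with a type, its original timestamp, and a data payload. That type/time/data shape is an assumption about Cube's event format; check the Cube wiki before relying on it. The sample records and helper names are hypothetical, and nothing is sent over the network:

```python
import json
from datetime import datetime, timezone
from itertools import islice

def to_cube_event(event_type, epoch_seconds, data):
    """Build one event dict carrying its ORIGINAL timestamp, not the replay time."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {"type": event_type, "time": ts.isoformat(), "data": data}

def batches(iterable, size):
    """Yield lists of at most `size` items so events can be loaded in bulk."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Hypothetical historical records, e.g. rows produced by a Pig job.
history = [
    ("request", 1337300000, {"path": "/search", "duration_ms": 120}),
    ("request", 1337300060, {"path": "/home", "duration_ms": 35}),
    ("request", 1337300120, {"path": "/search", "duration_ms": 98}),
]

events = (to_cube_event(t, ts, d) for t, ts, d in history)
# Each payload is a JSON array holding one batch of replayed events.
payloads = [json.dumps(chunk) for chunk in batches(events, 2)]
```

Keeping the original event time in each record is what lets a 'real-time' store treat replayed history the same way as live traffic.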

On Fri, May 18, 2012 at 12:11 PM, <[hidden email]> wrote:

> Hi ,
>
> Has anyone used Hadoop and splunk, or any other real-time processing tool
> over Hadoop?
>
> Regards,
> Shreya

Russell Jurney twitter.com/rjurney [hidden email] datasyndrome.com

Re: Splunk + Hadoop

Ravishankar Nair
Why not HBase with Hadoop?
It's the best bet.
Rgds, Ravi

Sent from my Beethoven


On May 18, 2012, at 3:29 PM, Russell Jurney <[hidden email]> wrote:

> I'm playing with using Hadoop and Pig to load MongoDB with data for Cube to
> consume. Cube <https://github.com/square/cube/wiki> is a realtime tool...
> but we'll be replaying events from the past.  Does that count?  It is nice
> to batch backfill metrics into 'real-time' systems in bulk.

Re: Splunk + Hadoop

Russell Jurney
Because that isn't Cube.

Russell Jurney
twitter.com/rjurney
[hidden email]
datasyndrome.com

On May 18, 2012, at 2:01 PM, Ravi Shankar Nair
<[hidden email]> wrote:

> Why not Hbase with Hadoop?
> It's a best bet.
> Rgds, Ravi

Re: Splunk + Hadoop

Abhishek Pratap Singh
I have used both Hadoop and Splunk. Can you please tell me what your
requirement is?
Real-time processing with Hadoop depends on what "real time" means in the
particular scenario. Depending on the requirement, real-time (or near
real-time) processing can be achieved.

~Abhishek

On Fri, May 18, 2012 at 3:58 PM, Russell Jurney <[hidden email]> wrote:

> Because that isn't Cube.
>
> Russell Jurney
> twitter.com/rjurney
> [hidden email]
> datasyndrome.com

Re: Splunk + Hadoop

Edward Capriolo
So a while back there was an article:
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data

I recently did my own take on full-text search of logs with
Solandra, though I have also prototyped using Solr inside DataStax
Enterprise.

http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/more_taco_bell_programming_with
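
To make "full-text search of logs" concrete, here is a toy inverted index in plain Python. It is only an illustration of the general technique, not how Solr or Solandra are implemented, and the log lines are made up:

```python
import re
from collections import defaultdict

# Hypothetical log lines.
logs = [
    "2012-05-18 12:11:02 ERROR disk full on node7",
    "2012-05-18 12:11:05 INFO job 42 finished",
    "2012-05-18 12:12:40 ERROR job 43 failed on node7",
]

# Map each lowercase token to the set of line numbers that contain it.
index = defaultdict(set)
for line_no, line in enumerate(logs):
    for token in re.findall(r"[a-z0-9]+", line.lower()):
        index[token].add(line_no)

def search(*terms):
    """Return the log lines that contain ALL of the given terms."""
    hits = set(range(len(logs)))
    for term in terms:
        hits &= index.get(term.lower(), set())
    return [logs[i] for i in sorted(hits)]

print(search("error", "node7"))
```

A query only touches the posting sets for its terms, which is why this kind of lookup stays fast even when the underlying log volume is large.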

Splunk has a graphical front end with a good deal of sophistication,
but I am quite happy just being able to search everything with Solr
and providing my own front ends built on Solr.

On Mon, May 21, 2012 at 5:13 PM, Abhishek Pratap Singh
<[hidden email]> wrote:

> I have used Hadoop and Splunk both. Can you please let me know what is your
> requirement?
> Real time processing with hadoop depends upon What defines "Real time" in
> particular scenario. Based on requirement, Real time (near real time) can
> be achieved.
>
> ~Abhishek

RE: Splunk + Hadoop

Shreya.Pal
In reply to this post by Abhishek Pratap Singh
Hi Abhishek,

I am looking at a scenario where a customer representative needs to respond to customers while on a call.
They need to search a huge amount of data and respond within a few seconds.

Thanks and Regards,
Shreya Pal
Architect Technology
Cognizant Technology Pvt Ltd
Vnet - 205594
Mobile - +91-9766310680


-----Original Message-----
From: Abhishek Pratap Singh [mailto:[hidden email]]
Sent: Tuesday, May 22, 2012 2:44 AM
To: [hidden email]
Subject: Re: Splunk + Hadoop

I have used Hadoop and Splunk both. Can you please let me know what is your requirement?
Real time processing with hadoop depends upon What defines "Real time" in particular scenario. Based on requirement, Real time (near real time) can be achieved.

~Abhishek


Re: Splunk + Hadoop

Nitin Pawar
Hi Shreya,

If you are looking at data locality, then you may or may not be able to use
Hadoop out of the box.
It will all depend on how you design the data layout on top of HDFS and how
you implement search based on the customer queries.

A good idea might be to have an intermediate ("hop-in") queryable database
such as MySQL in between, where you can store the results of the data
processed on Hadoop, and then use Solr for fast access and search.
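
A rough sketch of that layout, with SQLite from the Python standard library standing in for MySQL so the example is self-contained. The table name, customer IDs, and summaries are hypothetical, and the Solr search layer is not shown:

```python
import sqlite3

# Hypothetical per-customer summaries, standing in for the output of a Hadoop job.
batch_results = [
    ("C001", "3 open tickets; last order shipped 2012-05-20"),
    ("C002", "no open tickets; refund issued 2012-05-11"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_summary (customer_id TEXT PRIMARY KEY, summary TEXT)"
)
# Bulk-load the precomputed results; the heavy processing already happened offline.
conn.executemany(
    "INSERT OR REPLACE INTO customer_summary VALUES (?, ?)", batch_results
)
conn.commit()

def lookup(customer_id):
    """Answer a representative's query with a single indexed read."""
    row = conn.execute(
        "SELECT summary FROM customer_summary WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None

print(lookup("C001"))
```

The point of the split is that the batch job can take minutes or hours, while the lookup during the call is a primary-key read against a small, indexed table.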

Thanks,
Nitin

On Mon, May 28, 2012 at 12:41 PM, <[hidden email]> wrote:

> Hi Abhishek,
>
> I am looking for a scenario where the customer representative needs to
> respond back to the customers on call.
> They need to search on huge data and then respond back in few seconds.
>
> Thanks and Regards,
> Shreya Pal
> Architect Technology
> Cognizant Technology Pvt Ltd
> Vnet - 205594
> Mobile - +91-9766310680



--
Nitin Pawar

RE: Splunk + Hadoop

Tom Deutsch
In reply to this post by Shreya.Pal
Shreya - there are two major considerations here. First, can the system
process the required information, make it easily accessible, and do that
with the accuracy required for a user-based search paradigm? Second, can
the system do that fast enough to meet the time window of the use case?

It is unclear what type/source of information needs to be processed and
then made available for retrieval, how long a search can take and still be
considered OK, or the total latency (not just retrieval during the search
phase) from information acquisition to being searchable. If you can share
those details the group can help provide more specific/better coaching.

------------------------------------------------
Tom Deutsch
Program Director
Information Management
Big Data Technologies
IBM
3565 Harbor Blvd
Costa Mesa, CA 92626-1420
[hidden email]

Twitter: @thomasdeutsch
Data Management Blog: ibmdatamag.com/author/tdeutsch/
LinkedIn: http://www.linkedin.com/profile/view?id=833160
Quora: http://www.quora.com/Tom-Deutsch
Smarter Computing Blog:
http://www.smartercomputingblog.com/contributorsprofile/?user_id=223
Big Data for Business Executives Group:
http://www.linkedin.com/groups?gid=4455695




From:   <[hidden email]>
To:     <[hidden email]>,
Date:   05/28/2012 12:12 AM
Subject:        RE: Splunk + Hadoop



Hi Abhishek,

I am looking for a scenario where the customer representative needs to
respond back to the customers on call.
They need to search on huge data and then respond back in few seconds.

Thanks and Regards,
Shreya Pal
Architect Technology
Cognizant Technology Pvt Ltd
Vnet - 205594
Mobile - +91-9766310680

