How to Index IP address

Nga564
Hi All,

I have a text file that captured all of my network traffic. How can I use
Solr to filter out a particular IP address?

Thank you,
Nga.

Re: How to Index IP address

Matthew Runo
I don't think that Solr is the best thing to use for searching a text  
file. I'd use grep myself, if you're on a unix-like system.
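
For anyone without grep at hand, the same idea can be sketched in a few lines of Python. This is only an illustration: the sample lines are made up, and the rule that an address must not be glued to other digits is an assumption about what "a particular IP address" should mean.

```python
import re

def lines_with_ip(lines, ip):
    """Yield the lines that contain `ip` as a whole address."""
    # The lookarounds keep 10.0.0.1 from matching inside 10.0.0.12 or 110.0.0.1.
    pattern = re.compile(r"(?<!\d)(?<!\d\.)" + re.escape(ip) + r"(?!\d|\.\d)")
    for line in lines:
        if pattern.search(line):
            yield line

# Made-up sample traffic; a real capture file will look different.
sample = [
    "10.206.158.154 GET /index.html",
    "10.206.158.15 GET /about.html",
    "110.206.158.154 POST /login",
    "src=10.206.158.154:443 dst=10.0.0.1",
]

matches = list(lines_with_ip(sample, "10.206.158.154"))
for line in matches:
    print(line)
```

To run it over a real capture, pass an open file object instead of the sample list, since a file iterates line by line.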

To use Solr, you'd need to turn each network 'event' (GET, POST, etc.)
into an XML document and post those documents to Solr so it can
build the index. You could then run queries like
ip:10.206.158.154 to find a specific IP address, or even
ip:10.206.158* to get a subnet.
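
A rough sketch of what "throwing events into an XML document" could look like, using Solr's <add><doc><field> update format. The field names (id, ip, method, path) are assumptions; they would have to be declared in your own schema, and the events here are invented.

```python
import xml.etree.ElementTree as ET

def events_to_solr_xml(events):
    """Build a Solr <add> update message with one <doc> per event."""
    add = ET.Element("add")
    for event in events:
        doc = ET.SubElement(add, "doc")
        for name, value in event.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

# Hypothetical events; the field names must match the fields in your schema.
events = [
    {"id": "1", "ip": "10.206.158.154", "method": "GET", "path": "/index.html"},
    {"id": "2", "ip": "10.206.158.20", "method": "POST", "path": "/login"},
]

xml_payload = events_to_solr_xml(events)
print(xml_payload)
```

The resulting string is what you would send to Solr's update handler, followed by a commit, before queries such as ip:10.206.158.154 can return anything.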

Perhaps the thing that's building your text file could post to Solr  
instead?

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[hidden email] - 702-943-7833

On Mar 24, 2009, at 9:32 AM, nga pham wrote:

> Hi All,
>
> I have a txt file, that captured all of my network traffic.  How can I use
> Solr to filter out a particular IP address?
>
> Thank you,
> Nga.


Re: How to Index IP address

Nga564
Do you think Lucene is better for filtering out a particular IP address from a
txt file?

Thank you Runo,
Nga

On Tue, Mar 24, 2009 at 10:21 AM, Matthew Runo <[hidden email]> wrote:

> I don't think that Solr is the best thing to use for searching a text file.
> I'd use grep myself, if you're on a unix-like system.
>
> To use solr, you'd need to throw each network 'event' (GET, POST, etc etc)
> into an XML document, and post those into Solr so it could generate the
> index. You could then do things like
> ip:10.206.158.154 to find a specific IP address, or even ip:10.206.158* to
> get a subnet.
>
> Perhaps the thing that's building your text file could post to Solr
> instead?
>
> Thanks for your time!
>
> Matthew Runo
> Software Engineer, Zappos.com
> [hidden email] - 702-943-7833

Re: How to Index IP address

Matthew Runo
Well, I think you'll have the same problem. Lucene and Solr (since
it's built on Lucene) are both going to expect a structured document
as input. Once you send in a bunch of documents, you can then query
them for whatever you want to find.

A quick search of the internets turned up an Apache Labs project
called Pinpoint. It's designed to take log data in and build an index
out of it. I'm not sure how developed it is, but it might be a good
starting point for you. There are probably other projects out there
along the same lines. Here's Pinpoint:
http://svn.apache.org/repos/asf/labs/pinpoint/trunk/

Why do you want to use Solr / Lucene to look through your files? If
you have a huge dataset, some people are using Hadoop (an open source
implementation of Google's MapReduce model) to look through very large
sets of log files:
http://www.lexemetech.com/2008/01/hadoop-and-log-file-analysis.html
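
To give a feel for the map/reduce style of log analysis, here is a plain-Python imitation of the two phases that counts hits per IP address. It is not Hadoop API code, and it assumes (purely for illustration) that the client IP is the first whitespace-separated field on each line.

```python
from collections import defaultdict

def map_line(line):
    """Map phase: emit an (ip, 1) pair for each non-empty log line."""
    fields = line.split()
    if fields:
        # Assumption: the client IP is the first field on the line.
        yield fields[0], 1

def reduce_counts(pairs):
    """Reduce phase: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Invented sample lines standing in for a much larger set of log files.
log_lines = [
    "10.206.158.154 GET /index.html",
    "10.206.158.20 POST /login",
    "",
    "10.206.158.154 GET /about.html",
]

pairs = [pair for line in log_lines for pair in map_line(line)]
hits_per_ip = reduce_counts(pairs)
print(hits_per_ip)
```

In a real cluster the map calls run in parallel across many files and the framework groups the pairs by key before the reduce step; the single-process version above only shows the shape of the computation.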

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[hidden email] - 702-943-7833

On Mar 24, 2009, at 10:28 AM, nga pham wrote:

> Do you think Lucene is better for filtering out a particular IP address from a
> txt file?
>
> Thank you Runo,
> Nga

Re: How to Index IP address

Alexandre Rafalovitch
Well,

A log file is theoretically structured. Every log record is a - very -
flat set of fields. So, every log file line would be a Lucene
document. Then, one could use Solr to search, filter and facet
records.

Of course, this requires parsing the log file back into record components.
Most log files were created for output, not for re-input. But if you
can parse it back, you might be able to do a custom data import. Or, if
you can intercept the log records before they are serialized, you might be
able to index the fields directly.
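
As a sketch of parsing a line back into record components, assuming a made-up format of "timestamp ip method path status" (your log format will almost certainly differ, so the regular expression below is only a placeholder):

```python
import re

# Hypothetical line layout: timestamp, IPv4 address, HTTP method, path, status.
LINE_RE = re.compile(
    r"(?P<timestamp>\S+)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<method>[A-Z]+)\s+"
    r"(?P<path>\S+)\s+"
    r"(?P<status>\d{3})$"
)

def parse_line(line):
    """Return the line's fields as a dict, or None if it does not fit the layout."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None

record = parse_line("2009-03-24T09:32:00 10.206.158.154 GET /index.html 200")
unparsed = parse_line("not a log record")
print(record)
```

Each dict returned this way is the flat set of fields that would become one document in the index; lines that do not parse come back as None so they can be logged or skipped.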

Or you could just buy Splunk ( http://www.splunk.com/ ) and be done
with it. Parsing and visualizing log files is exactly what they set
out to deal with. There is no (great) open source solution yet.

Regards,
    Alex.
Personal blog: http://blog.outerthoughts.com/
Research group: http://www.clt.mq.edu.au/Research/
- I think age is a very high price to pay for maturity (Tom Stoppard)


On Tue, Mar 24, 2009 at 2:40 PM, Matthew Runo <[hidden email]> wrote:
> Well, I think you'll have the same problem. Lucene, and Solr (since it's
> built on Lucene) are both going to expect a structured document as input.
> Once you send in a bunch of documents, you can then query them for whatever
> you want to find.