nutch slide in lucene presentation


Yonik Seeley-2
I'll be giving an intro Lucene presentation at ApacheCon EU, and I'm
planning on including a slide each about Nutch and Solr... a "don't
reinvent the wheel" type of thing.

I'd appreciate any suggestions on what to present for Nutch... here is
my current slide:

        Nutch
* Open source web search application
* Crawlers
* Link-graph database
* Document parsers (HTML, word, pdf, etc)
* Language + charset detection
* Utilizes Hadoop (DFS + MapReduce) for massive scalability

Is a single slide sufficient?  I could throw in a screen shot too if
someone provided something spiffy.

-Yonik

Re: nutch slide in lucene presentation

Tim Archambault
Yonik,

Any chance this'll be available on the web? No chance I'll get there.

Also, I'm at OSCMS at Yahoo! and I see so much upside to integrating SOLR
with Drupal.  Faceted search capability in Drupal would be absolutely
killer. Has this conversation ever come up in the SOLR community?

Thanks for your time and good luck!

Tim Archambault
Online Manager
Bangordailynews.com


Re: nutch slide in lucene presentation

Yonik Seeley-2
On 3/23/07, Tim Archambault <[hidden email]> wrote:
> Any chance this'll be available on the web? No chance I'll get there.

Yes, I'll make it available eventually (not before the conference...
that wouldn't be fair to the ASF or ApacheCon producers).  I'll try
and add some content in the notes sections so you're not left with a
bunch of bullets that don't make sense :-)

> Also, I'm at OSCMS at Yahoo! and I see so much upside to integrating SOLR
> with Drupal.  Faceted search capability in Drupal would be absolutely
> killer. Has this conversation ever come up in the SOLR community?

I don't know anything about Drupal, and I'm not sure if the current plugin
http://drupal.org/project/solr
supports faceting or not... but it might be a place to start.

-Yonik

Re: nutch slide in lucene presentation

Tim Archambault
Damn, I would really have liked to meet up. Are you around Saturday?



Re: nutch slide in lucene presentation

Sami Siren-2
In reply to this post by Yonik Seeley-2
Yonik Seeley wrote:
> I'd appreciate any suggestions on what to present for Nutch... here is
> my current slide:
>

Some minor suggestions (adding some jargon and buzzwords):

        Nutch
* Highly customizable Open source web search application
* Crawlers
* Link-graph database
* Document parsers (HTML, word, pdf, etc)
* Language + charset detection, MicroFormats Rel-Tag,
  OpenSearch 1.0, Creative Commons
* Utilizes Hadoop (DFS + MapReduce) for massive scalability while
  still maintaining good performance on a single server

Also, it would be nice to mention the availability of the 0.9.0 release
(if it is out by then ;)

--
 Sami Siren