Why is Nutch not involved in Google Summer of Code - 2008?

Why is Nutch not involved in Google Summer of Code - 2008?

Susam Pal
Hi,

I was wondering why the Nutch project is not involved in Google SoC:
http://code.google.com/soc/2008/ Many Apache projects, including
Commons, Hadoop and Mahout, have put up their ideas here:
http://wiki.apache.org/general/SummerOfCode2008

Wouldn't it be great to have students helping the project out with
some of the work which no one has found time for? For example, many
people have requested POST-based authentication support in Nutch. I
personally wanted to do it after adding HTTP Authentication Schemes,
but unfortunately I could never manage my time well enough to do it,
since it would require a good deal of effort. I am sure there are
many such ideas which have not been implemented because the
contributors did not have time. IMHO, it would be great if students
were given the opportunity to contribute through GSoC 2008. The
mentors can guide them through the work for a few hours every week and
some valuable work can be done. What do you say?

Regards,
Susam Pal

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Andrew York
Well, Susam, I agree with you. I can dedicate some time to the
POST-based authentication (something I've been working on).

Also, I've noticed there's no book about Nutch, which makes things
extremely hard if you want to dive in. I know it takes time to
do such a thing, but maybe we can put our efforts into creating
something close to it.

So, here are the things I miss the most:

- Supported Solr Integration
- POST-based authentication

Regards,
           Yoanis







On 3/22/08, Susam Pal <[hidden email]> wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Dingding Ye
I'm also looking forward to Solr integration in Nutch.

On Mon, Mar 24, 2008 at 2:39 AM, All day coders <[hidden email]>
wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Andrew York
Sishen:
I'm not very good at organizing things, but I'm looking forward to doing it.
Are you a student?

Susam, would I be asking too much if I asked you to share your experience
of how you came up with the HTTP Authentication for Nutch? I spent a
couple of days struggling with the code, but I didn't make much progress. I
guess I'm missing the big picture (something that happens quite often when
trying to extend Nutch, at least for me).



On Mon, Mar 24, 2008 at 4:04 AM, sishen <[hidden email]> wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Dingding Ye
Hi, rac.nosotros.

I'm not a student,
but I'm eager to do the work. Maybe I can work with others, if there are
any, to do that.

I think it would be very worthwhile to integrate Solr into Nutch.


On Tue, Mar 25, 2008 at 4:26 AM, All day coders <[hidden email]>
wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Otis Gospodnetic-2-2
In reply to this post by Susam Pal
Hi Susam,

Good question, and I'm afraid we may be a little late:
    http://wiki.apache.org/general/SummerOfCodeMentor

I think the main problem is that nobody has time to be the mentor.

As for ideas, I think Solr integration would be very nice to have.  Solr, with its recent support for distributed searching, could possibly even become the default searcher for Nutch, so we don't have duplicated functionality in Nutch.

If somebody volunteers to be a mentor, we can try to quickly add the project idea + mentor name to http://wiki.apache.org/general/SummerOfCode2008

Does any committer have time?
Andrzej Bialecki, Mike Cafarella, Jérôme Charron, Doug Cutting, Doğacan Güney, Piotr Kosiorowski, Dennis Kubes, Chris A. Mattmann, Sami Siren, John Xing

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Susam Pal <[hidden email]>
To: [hidden email]
Sent: Saturday, March 22, 2008 9:50:48 AM
Subject: Why is Nutch not involved in Google Summer of Code - 2008?

[...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Dennis Kubes-2
How much of a time commitment would we need to make?

Dennis

[hidden email] wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Susam Pal
I believe a couple of hours every week should be enough. Last year, I
signed up as a mentor for OSVDB and we managed to get some useful work
done. However, I hardly spent any time on the whole project. Though a
few people sign up as mentors, it is mostly a community effort.
Students interact with the community and their assigned mentor
through the mailing list, and since the whole community is there to
guide them, there is not much of a burden on the mentor.

Regards,
Susam Pal

On Sun, Mar 30, 2008 at 8:55 PM, Dennis Kubes <[hidden email]> wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Andrzej Białecki-2
In reply to this post by Otis Gospodnetic-2-2
[hidden email] wrote:

> [...]
>
> Does any committer have time?
> Andrzej Bialecki, Mike Cafarella, Jérôme Charron, Doug Cutting, Doğacan Güney, Piotr Kosiorowski, Dennis Kubes, Chris A. Mattmann, Sami Siren, John Xing

I agree it would be great, but unfortunately I'm much too busy to commit
myself to being a mentor.


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Why is Nutch not involved in Google Summer of Code - 2008?

Dennis Kubes-2
In reply to this post by Susam Pal
OK, I should be able to be a mentor.  Besides Solr integration, are there
other ideas for the project?  Also, is it too late?

Dennis

Susam Pal wrote:

> [...]

Re: Why is Nutch not involved in Google Summer of Code - 2008?

Otis Gospodnetic-2-2
In reply to this post by Susam Pal
Hi Dennis,

Not too late, I think. Just add the Nutch + Solr idea to http://wiki.apache.org/general/SummerOfCode2008 on Monday.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Dennis Kubes <[hidden email]>
To: [hidden email]
Sent: Sunday, March 30, 2008 8:04:39 PM
Subject: Re: Why is Nutch not involved in Google Summer of Code - 2008?

[...]