Re: Google Summer of Code

Isabel Drost-3
On Thursday 06 March 2008, Dawid Weiss wrote:
> > five hours per student per week in summer should be doable, at least for
> > me.
>
> I'm out, unfortunately -- my academic job already comes with a bunch of
> students to take care of. It's fun, but it's time-consuming.

What about encouraging your students to submit their work at Mahout? Just a
naive thought of mine.

Isabel

--
Immature poets imitate, mature poets steal. -- T. S. Eliot, "Philip
Massinger"
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>


Re: Google Summer of Code

Grant Ingersoll-2

On Mar 6, 2008, at 1:32 AM, Isabel Drost wrote:

> On Wednesday 05 March 2008, Simon Willnauer wrote:
>> You could have a look at the FAQ or the GSoC pages
>> http://code.google.com/soc/2008/ and
>> http://code.google.com/soc/2008/faqs.html respectively.
>
> Hmm, there is little about what mentors are expected to do apart from the
> following rather general question, is there?
>
> | 2. What is the role of a mentoring organization?
>
> If we want to take part in GSoC, from that question, I guess we need a
> little more than only mentors:
>
> | A pool of project ideas for students to choose from.
>
> Grant already asked for ideas.
>
> | An organization administrator to act as the project's main point of
> | contact for Google;
>
> Any volunteers?

I think this is covered by the ASF.

>
>
> | A person or group responsible for review and ranking of student
> | applications,
>
> I'd be happy to help out here. Anyone else?

Cool

Re: Google Summer of Code

Grant Ingersoll-2
In reply to this post by Isabel Drost-3
I think we can split the duties a bit, too.  Simon, can you share your  
experience since you did a GSOC for Lucene a few years back?  I seem  
to recall there being a couple of Lucene mentors.


On Mar 6, 2008, at 1:47 AM, Isabel Drost wrote:

> Finally found a general answer to my question:
>
> | While the answer to this question will vary widely depending on the number
> | of students a mentor works with, the difficulty of the proposals and the
> | skill level of the students, most mentors have let us know that they
> | underestimated the amount of time they would need to invest in GSoC. Five
> | hours per student per week is a reasonable estimate.
>
> Sounds like doing it part time after work is going to be a bit tough - but
> five hours per student per week in summer should be doable, at least for me.
>
> As these are summer projects and certainly some of us are going to be on
> vacation during that time we should plan for having one backup for each
> mentor that goes on vacation at a different time.
>
>
> Isabel
>
>
> --
> You were s'posed to laugh!
>  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
>  /,`.-'`'    -.  ;-;;,_
> |,4-  ) )-,_..;\ (  `'-'
> '---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>



Re: Google Summer of Code

Grant Ingersoll-2
In reply to this post by Isabel Drost-3
I think the Mentoring Org is already set up.  After March 3, mentors
can register.  See http://wiki.apache.org/general/SummerOfCode2008.  
I'm willing to mentor, but would like to share the load a bit too.

-Grant


On Mar 6, 2008, at 1:56 AM, Isabel Drost wrote:

> On Saturday 01 March 2008, Grant Ingersoll wrote:
>> Also, any thoughts on what we might want someone to do?  I think it
>> would be great to have someone implement one of the algorithms on our
>> wiki.
>
> Just as a general note, the deadline for applications:
>
> March 12: Mentoring organization application deadline (12 noon PDT/
> 19:00 UTC).
>
> I suppose we should identify interesting tasks by that deadline. As a
> general guideline for mentors and for project proposals:
>
> http://code.google.com/p/google-summer-of-code/wiki/AdviceforMentors
>
> Isabel
>
> --
> Better late than never. -- Titus Livius (Livy)
>  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
>  /,`.-'`'    -.  ;-;;,_
> |,4-  ) )-,_..;\ (  `'-'
> '---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>



Re: Google Summer of Code

Matthew Riley
In reply to this post by Isabel Drost-3
Hey Dawid,

> Is it information retrieval from visual data you're working on? We have
> recently had a presentation about a guy who implemented motion detection on
> GPUs with very impressive speedups (orders of magnitude compared to normal
> CPUs). I'm wondering if your expertise here could be used to implement
> map-reduce distributed jobs for running multiple GPUs in parallel. I know
> this sounds a bit crazy, but I've heard of bio-engineering companies doing
> just that -- running a cluster of GPUs to speed up their computations. Just
> a wild thought. Back to your proposal though.


Yes, it is basically information retrieval that I'm performing on sets of
images. In fact, a lot of the best algorithms employed today for object
detection, object retrieval, etc. are adaptations of basic text-retrieval
approaches (e.g. tf-idf-weighted vector space models). I've personally never
worked with GPUs for image processing, but I imagine the vector processing
abilities would be useful at almost every stage of the indexing and
retrieval processes. I would be interested in looking into those
possibilities in more detail.
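
[Editorial sketch, not part of the original message: a minimal illustration
of the tf-idf-weighted vector space model mentioned above, assuming each
document (or image) has already been reduced to a list of tokens such as
words or quantized "visual words". The function names and the sample tokens
are hypothetical, and the weighting shown (relative term frequency times
log inverse document frequency) is only one common variant.]

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical token lists standing in for three indexed images.
docs = [
    ["sky", "cloud", "sky"],
    ["sky", "grass"],
    ["wheel", "road"],
]
vecs = tfidf_vectors(docs)
```

[Retrieval then amounts to ranking the indexed vectors by cosine similarity
to the query vector; documents that share no terms with the query score
zero.]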


> > mostly focused around approximate k-means algorithms (since that's a
> > problem I've been working on lately). It sounds like you guys are already
> > implementing canopy clustering for k-means- Is there any interest in
> > developing another approximation algorithm based on randomized kd-trees
> > for high dimensional data? What about mean-shift clustering?
>
> From my experience the largest challenge in data clustering is not figuring
> out a new clustering methodology, but finding the right existing one to
> tackle a particular problem. Isabel mentioned web spam detection challenge --
> this is a good example of a multi-feature classification problem and I know
> people have tried clustering the host graph to come up with more
> coarse-grained features for hosts. From my own interest, a very interesting
> challenge is doing something like Google News does (event aggregation). This
> is less trivial than you might think at first -- most news are very similar
> to each other (copy/paste and editing changes), so it's trivial to find small
> clusters of near-clones. Then the problem becomes more difficult because all
> news speak about pretty much the same people/events (take presidential
> election in the U.S.). I think the problems you could state here are:
>
> 1) approximating optimal clustering granularity (call it the number of
> clusters if you wish, although I think clustering should be driven by other
> factors rather than just the number of clusters),
>
> 2) coming up with clusters of news items _other_ than keyword-based
> similarity. One example here is grouping news by region (geolocation),
> sentiment (positive/negative news), people-related news, etc.
>
> 3) multilingual news matching and clustering.
>
> All the above issues are on the border of different domains -- NLP,
> clustering, classification. The tricky part is being able to put them
> together. What would be of interest to you?
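
[Editorial sketch, not part of the original message: one simple way the
"small clusters of near-clones" step described above could be approached,
assuming word-level shingles and Jaccard similarity with a greedy
single-pass grouping. This is a stand-in technique chosen for illustration,
not something the thread or Mahout prescribes; the function names, the
shingle size k=3, the 0.5 threshold and the sample headlines are all
hypothetical.]

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (as tuples) of a lower-cased text."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts are represented by a single shingle, or none if empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_clone_groups(texts, threshold=0.5):
    """Greedily group texts whose shingle sets resemble a group's first member."""
    sets = [shingles(t) for t in texts]
    groups = []  # each group is a list of indices into texts
    for i, s in enumerate(sets):
        for group in groups:
            # Compare only against the group's representative (its first member).
            if jaccard(s, sets[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Hypothetical headlines: two near-identical items and one unrelated item.
texts = [
    "the senator won the primary election on tuesday night",
    "the senator won the primary election on tuesday evening",
    "new gpu cluster speeds up protein folding simulations",
]
groups = near_clone_groups(texts)
```

[This only handles the easy copy/paste-and-edit case; the harder problems
listed above (granularity, non-keyword groupings, multilingual matching) are
not addressed by shingle overlap alone.]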


These are all interesting problems, actually. I've done some research into
sentiment analysis, as you mentioned in (2), and I think it's still a wide
open problem. Oren Etzioni at UWash does some interesting related work:
www.cs.washington.edu/homes/etzioni/.

I would basically be interested in doing anything that fits in well with the
overall goals of the Mahout project. Whether that is implementing well known
algorithms within the Hadoop framework or working on some novel idea is up
to the mentors, I presume. Personally, if I'm going to be working on
something novel, I would like to relate it to my current research work...
and I'm happy to discuss that with anyone on the list who is interested.

Matt


>
>
> D.
>
> >
> > Again, I would be glad to help in any way I can.
> >
> > Matt
> >
> > On Thu, Mar 6, 2008 at 12:56 AM, Isabel Drost <[hidden email]>
> > wrote:
> >
> >> On Saturday 01 March 2008, Grant Ingersoll wrote:
> >>> Also, any thoughts on what we might want someone to do?  I think it
> >>> would be great to have someone implement one of the algorithms on our
> >>> wiki.
> >> Just as a general note, the deadline for applications:
> >>
> >> March 12: Mentoring organization application deadline (12 noon PDT/19:00
> >> UTC).
> >>
> >> I suppose we should identify interesting tasks by that deadline. As a
> >> general guideline for mentors and for project proposals:
> >>
> >> http://code.google.com/p/google-summer-of-code/wiki/AdviceforMentors
> >>
> >> Isabel
> >>
> >> --
> >> Better late than never.         -- Titus Livius (Livy)
> >>   |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
> >>  /,`.-'`'    -.  ;-;;,_
> >>  |,4-  ) )-,_..;\ (  `'-'
> >> '---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>
> >>
> >
>

Re: Google Summer of Code

Grant Ingersoll-2

On Mar 6, 2008, at 4:36 PM, Matthew Riley wrote:

> I would basically be interested in doing anything that fits in well with the
> overall goals of the Mahout project. Whether that is implementing well known
> algorithms within the Hadoop framework or working on some novel idea is up
> to the mentors, I presume. Personally, if I'm going to be working on
> something novel, I would like to relate it to my current research work...
> and I'm happy to discuss that with anyone on the list who is interested.
>

Please do share your research work.  As for novel, versus w/in the  
goals, we like both.  I think at this stage, however, we do want to  
focus on those approaches that have stood the test of time (as short  
as that is) to some extent.  Personally, I would love to see someone  
take on SVM on Hadoop, but I am open to pretty much anything, so...

-Grant

Re: Google Summer of Code

Isabel Drost-3
In reply to this post by Grant Ingersoll-2
On Thursday 06 March 2008, Grant Ingersoll wrote:
> I think we can split the duties a bit, too.

I think the Apache FAQ also said that - in keeping with the usual Apache way of
doing things - it would be ok for the GSoC students to receive help from
all community members. So the actual time spent by one mentor could very
well drop to about 3h per week.

Still I would not rely on that when accepting the duty to become a mentor -
after all, at least officially it is the mentor who is responsible for
encouraging the student.

Isabel



--
The bug stops here.
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>


Re: Google Summer of Code

Isabel Drost-3
In reply to this post by Matthew Riley
On Thursday 06 March 2008, Matthew Riley wrote:
> I would basically be interested in doing anything that fits in well with
> the overall goals of the Mahout project. Whether that is implementing well
> known algorithms within the Hadoop framework or working on some novel idea
> is up to the mentors, I presume.

I would be happy with both options: Working on well known algorithms within
the Hadoop framework certainly is one of our main goals. But I, at least,
am also interested in providing space for novel ideas. I consider
it really important for researchers to not only publish the data they
experimented on but also the implementation used. If working on the latter
within Mahout helps to maybe focus a little more than usual on scalability
and maintainability - great.

So if you have an idea that fits well with your day to day work as well as
with the overall goals of Mahout that would be fine. I would guess, this
makes it easier to find some spare time to work on the project ;)

Isabel

--
Each new user of a new system uncovers a new class of bugs. -- Kernighan
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>


Re: Google Summer of Code

Dawid Weiss
In reply to this post by Isabel Drost-3

> What about encouraging your students to submit their work at Mahout? Just a
> naive thought of mine.

Those students I'm in charge of have their area of interest defined already --
too late to change it. Good idea for the future, I have been thinking about it,
actually.

D.

Re: Google Summer of Code

Grant Ingersoll-2
In reply to this post by Isabel Drost-3

On Mar 7, 2008, at 3:08 AM, Isabel Drost wrote:

> On Thursday 06 March 2008, Grant Ingersoll wrote:
>> I think we can split the duties a bit, too.
>
> I think the Apache FAQ also said that - according with the usual Apache way of
> doing things - it would be ok if the GSoC students would receive help from
> all community members. So the actual time spent for one mentor could very
> well drop to about 3h per week.
>
> Still I would not rely on that when accepting the duty to become a mentor -
> after all, at least officially it is the mentor who is responsible for
> encouraging the student.

Sounds good.  I should also note that all mentoring (barring
personal conversation) should take place on the dev list.  That is,
decisions and discussions on what to do should happen on the list so
that we all benefit from the understanding.  Not that you were
suggesting otherwise!

-Grant


Re: Google Summer of Code

Isabel Drost-3
On Friday 07 March 2008, Grant Ingersoll wrote:
> Sounds good.  I should also note that all mentoring should (barring
> personal conversation) should take place on the dev list.  That is,
> decisions, discussions on what to do should be done on the list so
> that we all benefit from the understanding.  Not that you were
> suggesting otherwise!

Sure, after all, GSoC is about integrating students into free software
projects - and making decisions offline certainly is not the way Apache
projects work. Thanks for pointing that out.

Isabel


--
Never pay a compliment as if expecting a receipt.
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>


Re: Google Summer of Code

Grant Ingersoll-2
Note, the deadline for project proposals is March 12.

I put an item up for us at: http://wiki.apache.org/general/SummerOfCode2008
I think it is probably general enough to cover all of the bases
discussed here.  Please feel free to add your name to the list of
mentors if you can.  Perhaps we can share duties.

-Grant



On Mar 7, 2008, at 1:43 PM, Isabel Drost wrote:

> On Friday 07 March 2008, Grant Ingersoll wrote:
>> Sounds good.  I should also note that all mentoring should (barring
>> personal conversation) should take place on the dev list.  That is,
>> decisions, discussions on what to do should be done on the list so
>> that we all benefit from the understanding.  Not that you were
>> suggesting otherwise!
>
> Sure, after all, GSoC is about integrating students into free software
> projects - and making decisions offline certainly is not the way,  
> Apache
> projects work. Thanks for pointing that out.
>
> Isabel


RE: Google Summer of Code

Jeff Eastman-2
I'd be willing to contribute. The page shows as immutable to me, so
perhaps you could add my name next time you are there.

Jeff

-----Original Message-----
From: Grant Ingersoll [mailto:[hidden email]]
Sent: Saturday, March 08, 2008 1:42 PM
To: [hidden email]
Subject: Re: Google Summer of Code

Note, the deadline for project proposals is March 12.

I put an item up for us at:
http://wiki.apache.org/general/SummerOfCode2008 
    I think it is probably general enough to cover all of the bases  
discussed here.  Please feel free to add your name to the list of  
mentors if you can.  Perhaps we can share duties.

-Grant



On Mar 7, 2008, at 1:43 PM, Isabel Drost wrote:

> On Friday 07 March 2008, Grant Ingersoll wrote:
>> Sounds good.  I should also note that all mentoring should (barring
>> personal conversation) should take place on the dev list.  That is,
>> decisions, discussions on what to do should be done on the list so
>> that we all benefit from the understanding.  Not that you were
>> suggesting otherwise!
>
> Sure, after all, GSoC is about integrating students into free software
> projects - and making decisions offline certainly is not the way,  
> Apache
> projects work. Thanks for pointing that out.
>
> Isabel


Re: Google Summer of Code

Grant Ingersoll-2
You have to create an id and login...

On Mar 9, 2008, at 7:58 PM, Jeff Eastman wrote:

> I'd be willing to contribute. The page shows as immutable to me, so
> perhaps you could add my name next time you are there.
>
> Jeff
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:[hidden email]]
> Sent: Saturday, March 08, 2008 1:42 PM
> To: [hidden email]
> Subject: Re: Google Summer of Code
>
> Note, the deadline for project proposals is March 12.
>
> I put an item up for us at:
> http://wiki.apache.org/general/SummerOfCode2008
>    I think it is probably general enough to cover all of the bases
> discussed here.  Please feel free to add your name to the list of
> mentors if you can.  Perhaps we can share duties.
>
> -Grant
>
>
>
> On Mar 7, 2008, at 1:43 PM, Isabel Drost wrote:
>
>> On Friday 07 March 2008, Grant Ingersoll wrote:
>>> Sounds good.  I should also note that all mentoring should (barring
>>> personal conversation) should take place on the dev list.  That is,
>>> decisions, discussions on what to do should be done on the list so
>>> that we all benefit from the understanding.  Not that you were
>>> suggesting otherwise!
>>
>> Sure, after all, GSoC is about integrating students into free  
>> software
>> projects - and making decisions offline certainly is not the way,
>> Apache
>> projects work. Thanks for pointing that out.
>>
>> Isabel
>

--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ






Re: Google Summer of Code

Ian Holsman (Lists)
In reply to this post by Grant Ingersoll-2
Hi Grant.
I'll be happy to mentor someone for this project.

regards
Ian
>>
>>
>> | A person or group responsible for review and ranking of student
>> | applications,
>>
>> I'd be happy to help out here. Anyone else?
>
> Cool
>


Re: Google Summer of Code

Isabel Drost-3
In reply to this post by Grant Ingersoll-2
On Saturday 08 March 2008, Grant Ingersoll wrote:
> Please feel free to add your name to the list of
> mentors if you can.

I have added my name and created an account at the Google SoC web application.
Is there anything else, apart from reading the GSoC documentation, that we
should not forget?

Isabel

--
A woman forgives the audacity of which her beauty has prompted us to be
guilty. -- LeSage
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[hidden email]>


Re: Google Summer of Code

Grant Ingersoll-2
Wow, maybe w/ all of our mentors we could get 2 students...


On Mar 10, 2008, at 3:42 AM, Isabel Drost wrote:

> On Saturday 08 March 2008, Grant Ingersoll wrote:
>> Please feel free to add your name to the list of
>> mentors if you can.
>
> I have added my name and created an account at the Google SoC web
> application. Anything else - apart from reading the GSoC documentation we
> should not forget?
>
> Isabel
>

Re: Google Summer of Code

Anush Shetty-2
On Mon, Mar 10, 2008 at 4:50 PM, Grant Ingersoll <[hidden email]>
wrote:

> Wow, maybe w/ all of our mentors we could get 2 students...
>

neat ++ :)



--
((Anush Shetty)) ((mail AT anushshetty DOT com))