0.8 release schedule (was Re: latest build throws error - critical)

0.8 release schedule (was Re: latest build throws error - critical)

Doug Cutting
TDLN wrote:
> I mean, how do others keep up to date with the main codeline? Do you
> advise updating every day?

Should we make a 0.8.0 release soon?  What features are still missing
that we'd like to get into this release?

Doug

Re: 0.8 release schedule (was Re: latest build throws error - critical)

Andrzej Białecki-2
Doug Cutting wrote:
> TDLN wrote:
>> I mean, how do others keep up to date with the main codeline? Do you
>> advise updating every day?
>
> Should we make a 0.8.0 release soon?  What features are still missing
> that we'd like to get into this release?

I think we should make a release soon - instabilities related to the Hadoop
split are mostly gone now, and we need to endorse the new architecture
more officially...

The "adaptive fetch" and "scoring API" functionality are the top
priority for me. The scoring API change is pretty innocuous (we just
need to clean it up), but the adaptive fetch changes have a big
potential for wrecking the main re-fetch cycle ... ;)

We could do it in two ways: I could apply this patch and let people run
with it for a while, fixing bugs as they pop up - but then it will be
another 3-4 weeks I suppose. Or we could hold this back until after the
release.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: 0.8 release schedule (was Re: latest build throws error - critical)

chrismattmann
+1 for a release sooner rather than later. Several interesting features
contributed since the 0.7 branch I believe are now tested and
production-worthy, at least in my environment. Hats off to the folks who
were able to split the MapReduce and NDFS into Hadoop -- I'm going to be
experimenting with that portion of the code over the next few weeks on a 16
node, 32 processor Opteron cluster at JPL that will be used as the
development machine for a large scale earth science data processing mission.
Because the Hadoop code is in its own project now, I can leverage and test
the Hadoop processing and HDFS capability without having to include all the
search engine specific stuff. Yayyyy! :-)

Cheers,
  Chris



On 4/6/06 12:59 PM, "Andrzej Bialecki" <[hidden email]> wrote:

> Doug Cutting wrote:
>> TDLN wrote:
>>> I mean, how do others keep up to date with the main codeline? Do you
>>> advise updating every day?
>>
>> Should we make a 0.8.0 release soon?  What features are still missing
>> that we'd like to get into this release?
>
> I think we should make a release soon - instabilities related to the Hadoop
> split are mostly gone now, and we need to endorse the new architecture
> more officially...
>
> The "adaptive fetch" and "scoring API" functionality are the top
> priority for me. The scoring API change is pretty innocuous (we just
> need to clean it up), but the adaptive fetch changes have a big
> potential for wrecking the main re-fetch cycle ... ;)
>
> We could do it in two ways: I could apply this patch and let people run
> with it for a while, fixing bugs as they pop up - but then it will be
> another 3-4 weeks I suppose. Or we could hold this back until after the
> release.

______________________________________________
Chris A. Mattmann
[hidden email]
Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                        Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.



Re: 0.8 release schedule (was Re: latest build throws error - critical)

Dawid Weiss
In reply to this post by Doug Cutting

Could we have the clustering patch applied before the 0.8.0 release? I
know you're way busy with other things, Andrzej, maybe you'll forward it
to somebody else? It shouldn't be a difficult patch to review and apply.

D.

Doug Cutting wrote:
> TDLN wrote:
>> I mean, how do others keep up to date with the main codeline? Do you
>> advise updating every day?
>
> Should we make a 0.8.0 release soon?  What features are still missing
> that we'd like to get into this release?
>
> Doug

Re: 0.8 release schedule (was Re: latest build throws error - critical)

Andrzej Białecki-2
Dawid Weiss wrote:
>
> Could we have the clustering patch applied before the 0.8.0 release? I
> know you're way busy with other things, Andrzej, maybe you'll forward
> it to somebody else? It shouldn't be a difficult patch to review and
> apply.

No problem, I will take care of it before the release.

--
Best regards,
Andrzej Bialecki     <><



Re: 0.8 release schedule (was Re: latest build throws error - critical)

Doug Cutting
In reply to this post by chrismattmann
Chris Mattmann wrote:
> +1 for a release sooner rather than later.

I think this is a good plan.  There's no reason we can't do another
release in a month.  If it is back-compatible we can call it 0.8.x and
if it's incompatible we can call it 0.9.0.

I'm going to make a Hadoop 0.1.1 release today that can be included in
Nutch 0.8.0.  (With Hadoop we're going to aim for monthly releases, with
potential bugfix releases between when serious bugs are found.  The big
bug in Hadoop 0.1.0 is http://issues.apache.org/jira/browse/HADOOP-117.)

So we could aim for a Nutch 0.8.0 release sometime next week.  Does that
work for folks?

Piotr, would you like to make this release, or should I?

Doug

Re: 0.8 release schedule (was Re: latest build throws error - critical)

chrismattmann
+1


On 4/7/06 10:20 AM, "Doug Cutting" <[hidden email]> wrote:

> Chris Mattmann wrote:
>> +1 for a release sooner rather than later.
>
> I think this is a good plan.  There's no reason we can't do another
> release in a month.  If it is back-compatible we can call it 0.8.x and
> if it's incompatible we can call it 0.9.0.
>
> I'm going to make a Hadoop 0.1.1 release today that can be included in
> Nutch 0.8.0.  (With Hadoop we're going to aim for monthly releases, with
> potential bugfix releases between when serious bugs are found.  The big
> bug in Hadoop 0.1.0 is http://issues.apache.org/jira/browse/HADOOP-117.)
>
> So we could aim for a Nutch 0.8.0 release sometime next week.  Does that
> work for folks?
>
> Piotr, would you like to make this release, or should I?
>
> Doug




Re: 0.8 release schedule (was Re: latest build throws error - critical)

Piotr Kosiorowski
In reply to this post by Doug Cutting
Doug Cutting wrote:

> Piotr, would you like to make this release, or should I?
>
I would prefer that you do it this time - I am not sure I can find
any time next week. I would like to do some things before the release though:
1) Commit the clustering patch from Dawid (I took it over from Andrzej).
2) Commit the pmd stuff as optional for this release. We will make it
required later.
3) Review the tutorial - I saw some posts on the user list claiming it has
errors, so I would like to check it before the release.
4) It would be good to go through the JIRA issues first - but I am not sure
I will manage it.
Any comments?

Regards
Piotr

Re: 0.8 release schedule (was Re: latest build throws error - critical)

Andrzej Białecki-2
In reply to this post by Doug Cutting
Doug Cutting wrote:

> Chris Mattmann wrote:
>> +1 for a release sooner rather than later.
>
> I think this is a good plan.  There's no reason we can't do another
> release in a month.  If it is back-compatible we can call it 0.8.x
> and if it's incompatible we can call it 0.9.0.
>
> I'm going to make a Hadoop 0.1.1 release today that can be included in
> Nutch 0.8.0.  (With Hadoop we're going to aim for monthly releases,
> with potential bugfix releases between when serious bugs are found.  
> The big bug in Hadoop 0.1.0 is
> http://issues.apache.org/jira/browse/HADOOP-117.)
>
> So we could aim for a Nutch 0.8.0 release sometime next week.  Does
> that work for folks?

Do you guys have any additional insights / suggestions whether NUTCH-240
and/or NUTCH-61 should be included in this release?

--
Best regards,
Andrzej Bialecki     <><



Re: 0.8 release schedule (was Re: latest build throws error - critical)

chrismattmann
Hi Andrzej,


On 4/7/06 12:18 PM, "Andrzej Bialecki" <[hidden email]> wrote:

> Do you guys have any additional insights / suggestions whether NUTCH-240
> and/or NUTCH-61 should be included in this release?

Looking at the JIRA popular issues pane for Nutch (
http://issues.apache.org/jira/browse/NUTCH?report=com.atlassian.jira.plugin.system.project:popularissues-panel),
I note that NUTCH-61 is the most popular issue right now with 7 votes.
Additionally, NUTCH-240 shares the 3rd most votes (4) with NUTCH-134. So,
all in all, there are 4 issues with >= 4 votes in JIRA. Of those 4 issues,
3 have attached patches in JIRA. Would it be safe to say that the committers
should focus on committing NUTCH-61, NUTCH-240, and NUTCH-48, since these 3
issues all have attached patch files, and then freeze it for the 0.8.0
release? As for my own
opinion, I recently downloaded and reviewed NUTCH-61, and really like the
patch. +1 on my end. I haven't tried out NUTCH-240 yet, but it seems to be a
logical extension point for Nutch to be able to plug in different scoring
components. So, +1 from me.

Cheers,
  Chris





Re: 0.8 release schedule (was Re: latest build throws error - critical)

Jérôme Charron
In reply to this post by Andrzej Białecki-2
> Do you guys have any additional insights / suggestions whether NUTCH-240
> and/or NUTCH-61 should be included in this release?

NUTCH-240: I really like the idea, but for now I agree that its API is
still "ugly". I would like to help in the coming weeks...
So for me it should not be included in the 0.8 release...

Regards

Jérôme


--
http://motrech.free.fr/
http://www.frutch.org/

Re: 0.8 release schedule (was Re: latest build throws error - critical)

Andrzej Białecki-2
In reply to this post by chrismattmann
Chris Mattmann wrote:
> opinion, I recently downloaded and reviewed NUTCH-61, and really like the
> patch. +1 on my end. I haven't tried out NUTCH-240 yet, but it seems to be a
> logical extension point for Nutch to be able to plug in different scoring
> components. So, +1 from me.
>  

Thanks for looking at this.

NUTCH-240: the API has some warts, it would be nice to clean up the
passScore* methods before committing it - but this may involve changing
too much code that is not strictly related to this patch.

NUTCH-61: I can commit this, it's been lightly tested on a dozen or so
cycles of a small sample of urls. However, with some settings I've seen
cases where AdaptiveFetchPolicy would go haywire and drive
fetchInterval to infinity or to zero. So, this is really about whether
people want to be "blessed" with this patch whether they need it or not,
and weed out bugs as we go, or perhaps continue waiting for some
volunteers to test it on a larger scale / more cycles.
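
(Editorial sketch, not the NUTCH-61 code: the runaway behaviour described
above is the kind of thing a hard clamp after every adjustment would
guard against. The class name, the bounds, and the rates below are all
hypothetical, chosen only for illustration; the real AdaptiveFetchPolicy
is not shown in this thread and may work quite differently.)

```java
final class AdaptiveIntervalSketch {
    // Hypothetical bounds and rates, in days; not taken from the patch.
    static final float MIN_INTERVAL_DAYS = 1.0f;
    static final float MAX_INTERVAL_DAYS = 365.0f;
    static final float INC_RATE = 0.5f; // grow when the page is unchanged
    static final float DEC_RATE = 0.5f; // shrink when the page has changed

    /** Returns the next fetch interval, always within [MIN, MAX]. */
    static float adjust(float intervalDays, boolean pageChanged) {
        float next = pageChanged
                ? intervalDays * (1.0f - DEC_RATE)
                : intervalDays * (1.0f + INC_RATE);
        if (Float.isNaN(next)) {
            // A NaN would slip through min/max, so fall back to the cap.
            return MAX_INTERVAL_DAYS;
        }
        // Clamp so repeated adjustments cannot reach infinity or zero.
        return Math.max(MIN_INTERVAL_DAYS, Math.min(MAX_INTERVAL_DAYS, next));
    }

    public static void main(String[] args) {
        float interval = 30.0f;
        // Simulate many cycles in which the page never changes.
        for (int cycle = 0; cycle < 50; cycle++) {
            interval = adjust(interval, false);
        }
        System.out.println("after 50 unchanged cycles: " + interval);
    }
}
```

With the clamp in place the interval settles at one of the two bounds
instead of diverging; whether those bounds are sensible for a given crawl
is a separate question.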

--
Best regards,
Andrzej Bialecki     <><