Issues pending before 0.9 release

Andrzej Białecki-2
Hi all,

The following issues need to be discussed and appropriate action taken
before the 0.9 release:

Blocker
========
* NUTCH-400 (Update & add missing license headers) - I believe this is
fixed and should be closed

* NUTCH-353 (pages that serverside forwards will be refetched every
time) - this was partially fixed in NUTCH-273, but a more complete
solution would require significant changes to LinkDb. As there are no
patches implementing this, I left it open, but it's no longer as
critical as it was before. I propose to move it to "Major" and address
it in the next release.

* NUTCH-233 (wrong regular expression hang reduce process for ever) - I
propose to apply the fix provided by Sean Dean and close this issue for now.

Critical
========
* NUTCH-436 (Incorrect handling of relative paths when the embedded URL
path is empty). There is no patch available yet. If someone could
contribute a patch I'd like to see this fixed before the release.

* NUTCH-427 (protocol-smb). This relies on an LGPL library, and it's
certainly not critical (as this is an optional new feature). I propose
to change it to Major, and make a decision - do we want another plugin
like parse-mp3 or parse-rtf, or not.

* NUTCH-381 (Ignore external link not work as expected) - I'll try to
reproduce it, and if I find an easy fix I'd like to apply it before the
release.

* NUTCH-277 (Fetcher dies because of "max. redirects") - I wasn't able
to reproduce it. If there is no updated information on this I propose to
close it with "Can't reproduce".

* NUTCH-167 (Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE">) -
there's a patch which I tested in a limited production env. If there are
no objections I'd like to apply it before the release.

Major
=====
There are 84 major issues, but some of them are either invalid, or
should be "minor", or no longer apply and should be closed. Please
review them if you can and provide some comments or recommendations if
you think you have some new information.


One decision that we also need to make is which version of Hadoop should
be included in the release. Current trunk uses 0.10.1, I have a set of
production-tested patches that use 0.11.2, and today the Hadoop team
released 0.12.0 (to be followed shortly by 0.12.1, most likely in time
for our release). The most conservative option is to stay with
0.10.1, but by the time people start using Nutch this will already be a
fairly old version. I propose to upgrade to 0.11.2. We could use 0.12.1,
but in that case with the expectation that we release a less-than-stable
version of Nutch, soon to be followed by a minor stable release ...

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Issues pending before 0.9 release

Dennis Kubes
> Hi all,
>
> The following issues need to be discussed and appropriate action taken
> before the 0.9 release:
>
> Blocker
> ========
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed
>
> * NUTCH-353 (pages that serverside forwards will be refetched every
> time) - this was partially fixed in NUTCH-273, but a more complete
> solution would require significant changes to LinkDb. As there are no
> patches implementing this, I left it open, but it's no longer as
> critical as it was before. I propose to move it to "Major" and address
> it in the next release.
>
> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for
> now.
>
> Critical
> ========
> * NUTCH-436 (Incorrect handling of relative paths when the embedded URL
> path is empty). There is no patch available yet. If someone could
> contribute a patch I'd like to see this fixed before the release.

I am starting to take a look at this.  I will try to get it fixed before
we release.

>
> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
> certainly not critical (as this is an optional new feature). I propose
> to change it to Major, and make a decision - do we want another plugin
> like parse-mp3 or parse-rtf, or not.
>
> * NUTCH-381 (Ignore external link not work as expected) - I'll try to
> reproduce it, and if I find an easy fix I'd like to apply it before the
> release.
>
> * NUTCH-277 (Fetcher dies because of "max. redirects") - I wasn't able
> to reproduce it. If there is no updated information on this I propose to
> close it with "Can't reproduce".
>
> * NUTCH-167 (Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE">) -
> there's a patch which I tested in a limited production env. If there are
> no objections I'd like to apply it before the release.
>
> Major
> =====
> There are 84 major issues, but some of them are either invalid, or
> should be "minor", or no longer apply and should be closed. Please
> review them if you can and provide some comments or recommendations if
> you think you have some new information.
>
>
> One decision also that we need to make is which version of Hadoop should
> be included in the release. Current trunk uses 0.10.1, I have a set of
> production-tested patches that use 0.11.2, and today the Hadoop team
> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
> before our release). The most conservative option is to stay with
> 0.10.1, but by the time people start using Nutch this will be a fairly
> old version already. I propose to upgrade to 0.11.2. We could use 0.12.1
> - but in this case with the expectation that we release less than stable
> version of Nutch to be soon followed by a minor stable release ...

+1 for using 0.11.2.  I looked through the release notes for 0.12 and
there were some niceties such as HADOOP-432 for undeletes and a lot of bug
fixes, but it didn't look like there were any critical issues as far as
Nutch is concerned.

Dennis Kubes

>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>
>



Re: Issues pending before 0.9 release

Sami Siren-2
In reply to this post by Andrzej Białecki-2
Andrzej Bialecki wrote:
> Hi all,
>
> The following issues need to be discussed and appropriate action taken
> before the 0.9 release:
>
> Blocker
> ========
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed

I agree. I should close it.

> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for
> now.

yes that was the resolution also last time :)

> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
> certainly not critical (as this is an optional new feature). I propose
> to change it to Major, and make a decision - do we want another plugin
> like parse-mp3 or parse-rtf, or not.

One option would be setting up a separate project outside Apache to host
and maintain these and remove the remaining torsos from Nutch source base.

> One decision also that we need to make is which version of Hadoop should
> be included in the release. Current trunk uses 0.10.1, I have a set of
> production-tested patches that use 0.11.2, and today the Hadoop team
> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
> before our release). The most conservative option is to stay with
> 0.10.1, but by the time people start using Nutch this will be a fairly

0.10.1 is not an option: there is that NPE in sorting that does not
allow any crawling beyond modest sizes (HADOOP-917). We should upgrade
Hadoop to 0.11.2 or 0.12.0 and gather experience from running it on
reasonably sized crawls, so my suggestion is that we don't decide this on
paper.

--
 Sami Siren

Re: Issues pending before 0.9 release

Sean Dean-3
In reply to this post by Andrzej Białecki-2
As for which Hadoop version is included in the next Nutch release, I share Sami's concern about 0.10.1, as it NPEs on anything above 100-200k URLs. I can volunteer to test any other version we are interested in; my regular fetches are about 13 million URLs and take a couple of days to complete.

If anyone has a specific Hadoop jar they would like to share, I don't mind testing it; otherwise I can just build the "most popular" version from source and replace my current one with it. For the record, I've been using Hadoop 0.9.1 for the longest time without any problems on these somewhat large crawls.


----- Original Message ----
From: Sami Siren <[hidden email]>
To: [hidden email]
Sent: Sunday, March 4, 2007 1:50:23 AM
Subject: Re: Issues pending before 0.9 release


Andrzej Bialecki wrote:
> Hi all,
>
> The following issues need to be discussed and appropriate action taken
> before the 0.9 release:
>
> Blocker
> ========
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed

I agree. I should close it.

> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for
> now.

yes that was the resolution also last time :)

> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
> certainly not critical (as this is an optional new feature). I propose
> to change it to Major, and make a decision - do we want another plugin
> like parse-mp3 or parse-rtf, or not.

One option would be setting up a separate project outside Apache to host
and maintain these and remove the remaining torsos from Nutch source base.

> One decision also that we need to make is which version of Hadoop should
> be included in the release. Current trunk uses 0.10.1, I have a set of
> production-tested patches that use 0.11.2, and today the Hadoop team
> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
> before our release). The most conservative option is to stay with
> 0.10.1, but by the time people start using Nutch this will be a fairly

0.10.1 is not an option: there is that NPE in sorting that does not
allow any crawling beyond modest sizes (HADOOP-917). We should upgrade
Hadoop to 0.11.2 or 0.12.0 and gather experience from running it on
reasonably sized crawls, so my suggestion is that we don't decide this on
paper.

--
Sami Siren

Re: Issues pending before 0.9 release

Andrzej Białecki-2
Sean Dean wrote:
> As for which Hadoop version is included in the next Nutch release, I share the same concern as Sami with 0.10.1 as it NPE's on anything above 100-200k URLs. I can volunteer to test any other version we are interested in, my regular fetches are about 13 million URLs and take a couple days to complete.
>  
> If anyone has a specific Hadoop jar they would like to share I don't mind testing it, otherwise I can just build the "most popular" version from source and replace that with my current one. For the record, I've been using Hadoop 0.9.1 for the longest time without any problems on these somewhat large crawls.
>
>  

It's clear to me then that we should bring Nutch to 0.11.2 first anyway.
Then, if we have time and if you are willing, we could test the 0.12 and
if it's stable enough for your 13 mln crawl then it's likely it's good
enough for the rest of us.

If there are no dissenting votes, I'll apply the patch to bring in
0.11.2 some time tomorrow. I will also create a JIRA issue and attach
the patches from that revision to Hadoop 0.12 so that folks may test them.

Thanks for your comments!

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Issues pending before 0.9 release

Dennis Kubes
NUTCH-436 has a patch now if we want to add that to this release.

Dennis Kubes

Andrzej Bialecki wrote:

> Sean Dean wrote:
>> As for which Hadoop version is included in the next Nutch release, I
>> share the same concern as Sami with 0.10.1 as it NPE's on anything
>> above 100-200k URLs. I can volunteer to test any other version we are
>> interested in, my regular fetches are about 13 million URLs and take a
>> couple days to complete.
>>  
>> If anyone has a specific Hadoop jar they would like to share I don't
>> mind testing it, otherwise I can just build the "most popular" version
>> from source and replace that with my current one. For the record, I've
>> been using Hadoop 0.9.1 for the longest time without any problems on
>> these somewhat large crawls.
>>
>>  
>
> It's clear to me then that we should bring Nutch to 0.11.2 first anyway.
> Then, if we have time and if you are willing, we could test the 0.12 and
> if it's stable enough for your 13 mln crawl then it's likely it's good
> enough for the rest of us.
>
> If there are no dissenting votes, I'll apply the patch to bring in
> 0.11.2 some time tomorrow. I will also create a JIRA issue and attach
> the patches from that revision to Hadoop 0.12 so that folks may test them.
>
> Thanks for your comments!
>

Re: Issues pending before 0.9 release

chrismattmann
In reply to this post by Andrzej Białecki-2
Hi Guys,

> Blocker
> ========
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed

+1, thanks to Sami for closing it.

>
> * NUTCH-353 (pages that serverside forwards will be refetched every
> time) - this was partially fixed in NUTCH-273, but a more complete
> solution would require significant changes to LinkDb. As there are no
> patches implementing this, I left it open, but it's no longer as
> critical as it was before. I propose to move it to "Major" and address
> it in the next release.

+1

>
> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for now.

+1

>
> Critical
> ========
> * NUTCH-436 (Incorrect handling of relative paths when the embedded URL
> path is empty). There is no patch available yet. If someone could
> contribute a patch I'd like to see this fixed before the release.

Looks like Dennis is on this one

>
> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
> certainly not critical (as this is an optional new feature). I propose
> to change it to Major, and make a decision - do we want another plugin
> like parse-mp3 or parse-rtf, or not.

Let's hold off on this: it's not necessary for 0.9, and I don't think
there's been a bunch of traffic on the list identifying this as critical to
get into the sources for the release

>
> * NUTCH-381 (Ignore external link not work as expected) - I'll try to
> reproduce it, and if I find an easy fix I'd like to apply it before the
> release.

+1

>
> * NUTCH-277 (Fetcher dies because of "max. redirects") - I wasn't able
> to reproduce it. If there is no updated information on this I propose to
> close it with "Can't reproduce".

+1, I had to do something similar with NUTCH-258

>
> * NUTCH-167 (Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE">) -
> there's a patch which I tested in a limited production env. If there are
> no objections I'd like to apply it before the release.

+1

>
> Major
> =====
> There are 84 major issues, but some of them are either invalid, or
> should be "minor", or no longer apply and should be closed. Please
> review them if you can and provide some comments or recommendations if
> you think you have some new information.

I will spend some time going through JIRA today and see if there are any
issues that I can find that:

1. Have a patch already
2. Sound like something quick, easy, and not so far-reaching across the
entire Nutch API

>
>
> One decision also that we need to make is which version of Hadoop should
> be included in the release. Current trunk uses 0.10.1, I have a set of
> production-tested patches that use 0.11.2, and today the Hadoop team
> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
> before our release). The most conservative option is to stay with
> 0.10.1, but by the time people start using Nutch this will be a fairly
> old version already. I propose to upgrade to 0.11.2. We could use 0.12.1
> - but in this case with the expectation that we release less than stable
> version of Nutch to be soon followed by a minor stable release ...

I'd agree with the upgrade to 0.11.2, +1


Cheers,
  Chris

P.S. I am going to contact Piotr and coordinate with him: I'd like to be the
release manager for this Nutch release.




Re: Issues pending before 0.9 release

Dennis Kubes


Chris Mattmann wrote:

> Hi Guys,
>
>> Blocker
>> ========
>> * NUTCH-400 (Update & add missing license headers) - I believe this is
>> fixed and should be closed
>
> +1, thanks to Sami for closing it.
>
>> * NUTCH-353 (pages that serverside forwards will be refetched every
>> time) - this was partially fixed in NUTCH-273, but a more complete
>> solution would require significant changes to LinkDb. As there are no
>> patches implementing this, I left it open, but it's no longer as
>> critical as it was before. I propose to move it to "Major" and address
>> it in the next release.
>
> +1
>
>> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
>> propose to apply the fix provided by Sean Dean and close this issue for now.
>
> +1
>
>> Critical
>> ========
>> * NUTCH-436 (Incorrect handling of relative paths when the embedded URL
>> path is empty). There is no patch available yet. If someone could
>> contribute a patch I'd like to see this fixed before the release.
>
> Looks like Dennis is on this one
>
>> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
>> certainly not critical (as this is an optional new feature). I propose
>> to change it to Major, and make a decision - do we want another plugin
>> like parse-mp3 or parse-rtf, or not.
>
> Let's hold off on this: it's not necessary for 0.9, and I don't think
> there's been a bunch of traffic on the list identifying this as critical to
> get into the sources for the release
>
>> * NUTCH-381 (Ignore external link not work as expected) - I'll try to
>> reproduce it, and if I find an easy fix I'd like to apply it before the
>> release.
>
> +1
>
>> * NUTCH-277 (Fetcher dies because of "max. redirects") - I wasn't able
>> to reproduce it. If there is no updated information on this I propose to
>> close it with "Can't reproduce".
>
> +1, I had to do something similar with NUTCH-258
>
>> * NUTCH-167 (Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE">) -
>> there's a patch which I tested in a limited production env. If there are
>> no objections I'd like to apply it before the release.
>
> +1
>
>> Major
>> =====
>> There are 84 major issues, but some of them are either invalid, or
>> should be "minor", or no longer apply and should be closed. Please
>> review them if you can and provide some comments or recommendations if
>> you think you have some new information.
>
> I will spend some time going through JIRA today and see if there's any
> issues that I can find that:
>
> 1. Have a patch already
> 2. Sound like something quick, easy, and not so far-reaching across the
> entire Nutch API
>
>>
>> One decision also that we need to make is which version of Hadoop should
>> be included in the release. Current trunk uses 0.10.1, I have a set of
>> production-tested patches that use 0.11.2, and today the Hadoop team
>> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
>> before our release). The most conservative option is to stay with
>> 0.10.1, but by the time people start using Nutch this will be a fairly
>> old version already. I propose to upgrade to 0.11.2. We could use 0.12.1
>> - but in this case with the expectation that we release less than stable
>> version of Nutch to be soon followed by a minor stable release ...
>
> I'd agree with the upgrade to 0.11.2, +1
>
>
> Cheers,
>   Chris
>
> P.S. I am going to contact Piotr and coordinate with him: I'd like to be the
> release manager for this Nutch release.

I would like to help with this as well, even if it is just watching how
the process works this time.

Dennis
>
>
>

Re: Issues pending before 0.9 release

Andrzej Białecki-2
In reply to this post by chrismattmann
Chris Mattmann wrote:
> P.S. I am going to contact Piotr and coordinate with him: I'd like to be the
> release manager for this Nutch release.
>  

Everyone heard that? :) That's cool, thanks!

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Issues pending before 0.9 release

Sami Siren-2
In reply to this post by chrismattmann
>
>
>
> P.S. I am going to contact Piotr and coordinate with him: I'd like to be
> the release manager for this Nutch release.
>
>
>
It would be more beneficial to everybody if the discussions (related to
the release or Nutch) were held in public (hey, this is open source!).
The off-the-list stuff IMO smells.

--
 Sami Siren

Re: Issues pending before 0.9 release

Doug Cutting
Sami Siren wrote:
> It would be more beneficial to everybody if the discussions (related to
> the release or Nutch) were held in public (hey, this is open source!).
> The off-the-list stuff IMO smells.

+1  Folks sometimes wish to discuss project matters off-list to spare
others the boring details, but this is usually a bad idea.  All project
decisions should be made in public on this list.  Discussions relevant
to these decisions are also thus best made on this list, since they
explain the decision.  Private discussions are permissible to develop a
proposal, but that is usually better done on-list when possible, so that
others can get involved earlier.

(The one notable exception is that personnel issues are discussed on the
private PMC list.)

Doug

Re: Issues pending before 0.9 release

Andrzej Białecki-2
In reply to this post by Andrzej Białecki-2
Hi all,

I just committed Hadoop 0.12.1. Let's double-check that it works ok.
Here's the list of Critical/Blocker issues I mentioned before, and their
current status:

NUTCH-400 Fixed.
NUTCH-353 Moved to Major, fix after release.
NUTCH-233 Fixed.
NUTCH-436 Fixed.
NUTCH-427 Moved to Major, fix after release.
NUTCH-381 Won't fix - this is a configuration issue.
NUTCH-277 Cannot reproduce
NUTCH-167 Fixed.

Any other stuff we need to fix before the release?

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Issues pending before 0.9 release

Sami Siren-2
Andrzej Bialecki wrote:
> Hi all,
>
> I just committed Hadoop 0.12.1. Let's double-check that it works ok.
> Here's the list of Critical/Blocker issues I mentioned before, and their
> current status:
>
> Any other stuff we need to fix before the release?

I am satisfied except the broken bin/nutch.

--
 Sami Siren


Re: Issues pending before 0.9 release

Andrzej Białecki-2
Sami Siren wrote:

> Andrzej Bialecki wrote:
>> Hi all,
>>
>> I just committed Hadoop 0.12.1. Let's double-check that it works ok.
>> Here's the list of Critical/Blocker issues I mentioned before, and their
>> current status:
>>
>> Any other stuff we need to fix before the release?
>
> I am satisfied except the broken bin/nutch.

Fixed now - tested both under Cygwin and Fedora.

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Issues pending before 0.9 release

Dennis Kubes
I am good to go as well.

Dennis Kubes

Andrzej Bialecki wrote:

> Sami Siren wrote:
>> Andrzej Bialecki wrote:
>>> Hi all,
>>>
>>> I just committed Hadoop 0.12.1. Let's double-check that it works ok.
>>> Here's the list of Critical/Blocker issues I mentioned before, and their
>>> current status:
>>>
>>> Any other stuff we need to fix before the release?
>>
>> I am satisfied except the broken bin/nutch.
>
> Fixed now - tested both under Cygwin and Fedora.
>

Re: Issues pending before 0.9 release

rubdabadub
Hi:

Just wondering about NUTCH-61

http://issues.apache.org/jira/browse/Nutch-61

Will it make the 0.9 cut?

It would be nice if it did. It's probably too late.

Regards

On 3/21/07, Dennis Kubes <[hidden email]> wrote:

> I am good to go as well.
>
> Dennis Kubes
>
> Andrzej Bialecki wrote:
> > Sami Siren wrote:
> >> Andrzej Bialecki wrote:
> >>> Hi all,
> >>>
> >>> I just committed Hadoop 0.12.1. Let's double-check that it works ok.
> >>> Here's the list of Critical/Blocker issues I mentioned before, and their
> >>> current status:
> >>>
> >>> Any other stuff we need to fix before the release?
> >>
> >> I am satisfied except the broken bin/nutch.
> >
> > Fixed now - tested both under Cygwin and Fedora.
> >
>

Re: Issues pending before 0.9 release

Andrzej Białecki-2
In reply to this post by Dennis Kubes
Dennis Kubes wrote:
> I am good to go as well.

Hmm ... Test suite fails for me, with a cryptic message (cryptic because
the plugin test itself succeeds):

[...]
init:

init-plugin:

deps-jar:

compile:
      [echo] Compiling plugin: urlnormalizer-regex

compile-test:

jar:

deps-test:

init:

init-plugin:

compile:

jar:

deps-test:

deploy:

copy-generated-lib:

deploy:

copy-generated-lib:

test:
      [echo] Testing plugin: urlnormalizer-regex
     [junit] Running org.apache.nutch.net.urlnormalizer.regex.TestRegexURLNormalizer
     [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.016 sec
     [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.359 sec

BUILD FAILED
C:\disks\e\work\nutch\vanilla\build.xml:300: The following error occurred while executing this line:
C:\disks\e\work\nutch\vanilla\src\plugin\build.xml:99: The following error occurred while executing this line:
C:\disks\e\work\nutch\vanilla\src\plugin\build-plugin.xml:200: Tests failed!



--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Issues pending before 0.9 release

Sami Siren-2
In reply to this post by Andrzej Białecki-2
2007/3/21, Andrzej Bialecki <[hidden email]>:
>
> >> Any other stuff we need to fix before the release?
> >
> > I am satisfied except the broken bin/nutch.
>
> Fixed now - tested both under Cygwin and Fedora.
>
Thanks, I can confirm that it works now :)

--
 Sami Siren

Re: Issues pending before 0.9 release

Sami Siren-2
In reply to this post by Andrzej Białecki-2
for me it works:

...
BUILD SUCCESSFUL
Total time: 4 minutes 3 seconds

--
 Sami Siren

2007/3/21, Andrzej Bialecki <[hidden email]>:

>
> Dennis Kubes wrote:
> > I am good to go as well.
>
> Hmm ... Test suite fails for me, with a cryptic message (cryptic because
> the plugin test itself succeeds):
>
> [...]
> init:
>
> init-plugin:
>
> deps-jar:
>
> compile:
>       [echo] Compiling plugin: urlnormalizer-regex
>
> compile-test:
>
> jar:
>
> deps-test:
>
> init:
>
> init-plugin:
>
> compile:
>
> jar:
>
> deps-test:
>
> deploy:
>
> copy-generated-lib:
>
> deploy:
>
> copy-generated-lib:
>
> test:
>       [echo] Testing plugin: urlnormalizer-regex
>      [junit] Running org.apache.nutch.net.urlnormalizer.regex.TestRegexURLNormalizer
>      [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.016 sec
>      [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.359 sec
>
> BUILD FAILED
> C:\disks\e\work\nutch\vanilla\build.xml:300: The following error occurred while executing this line:
> C:\disks\e\work\nutch\vanilla\src\plugin\build.xml:99: The following error occurred while executing this line:
> C:\disks\e\work\nutch\vanilla\src\plugin\build-plugin.xml:200: Tests failed!
>
>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>   ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>

Re: Issues pending before 0.9 release

Andrzej Białecki-2
Sami Siren wrote:
> for me it works:
>
> ...
> BUILD SUCCESSFUL
> Total time: 4 minutes 3 seconds

I did a fresh checkout to an empty dir, rebuilt and it's still failing -
perhaps you have some uncommitted changes in your working copy ... ?


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
