Hadoop 0.11.2 vs. 0.12.1

Andrzej Białecki-2
Hi all,

After our discussion about which Hadoop release to use for the upcoming
Nutch release, I decided to ask around on the Hadoop mailing list. The
message was clear that we should go with 0.12.1 - see below:

Owen O'Malley wrote:

>
> On Mar 10, 2007, at 12:32 AM, Andrzej Bialecki wrote:
>
>>> I think the experience on big clusters at Yahoo! is that 0.12.1
>>> should be more stable than 0.11.2, but others can confirm that.
>>
>> Hm.. That's not the impression I have from JIRA and the mailing list.
>> My impression is that even though 0.12.1 is more robust in some
>> situations, the significant changes (checksum filesystem, speculative
>> execution, in memory sorting, improved map output handling, etc, etc)
>> made between these releases introduced many subtle bugs which only
>> now start coming into light.
>
> We never upgraded our main clusters to 11.2 because it never
> stabilized to our satisfaction, which is why I was proposing an 11.3.
> However, 12.1 is looking pretty good with the exception of a couple
> of bugs and we decided to hold out for 12.1. At this point, if I was
> going to 11, I'd want a lot of the fixes that have been done in between.

The 0.12.x release has speculative execution turned on by default, but I
remember that there were places in Nutch that would break when using
PhasedFileSystem (which is what Hadoop uses when run in that mode). I'm
afraid there might be other issues here as well - no one has tested Nutch
with 0.12 to be sure that it works OK.

On the other hand, I only tested 0.11.2 in a limited production env., so
there may be other bugs lurking there that Owen referred to, which show
up when you run larger jobs (or different jobs).

What do you think?

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Hadoop 0.11.2 vs. 0.12.1

Sean Dean-3
It looks like we might want to at least give it a try then, with the worst possible case of Nutch users having to keep speculative execution disabled if it causes grief again. If other problems arise, then we can just revert back to 0.11.2 which seems to be stable in terms of all the Nutch operations.
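
For anyone who wants to try the fallback mentioned above, here is a minimal sketch of generating a site-level override that turns speculative execution off. It assumes the property is named `mapred.speculative.execution` and that overrides live in `conf/hadoop-site.xml`; both should be checked against the `hadoop-default.xml` shipped with the release you actually run. The helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def site_override(name, value):
    """Build one <property> element for a Hadoop site config override."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def render_site_config(overrides):
    """Render a dict of overrides as a <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        root.append(site_override(name, value))
    return ET.tostring(root, encoding="unicode")


# Assumed property name; verify it against your release's hadoop-default.xml.
SPECULATIVE_OFF = {"mapred.speculative.execution": "false"}

if __name__ == "__main__":
    print(render_site_config(SPECULATIVE_OFF))
```

The printed fragment would be merged by hand into the existing site config rather than overwriting it.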



Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes
> It looks like we might want to at least give it a try then, with the worst
> possible case of Nutch users having to keep speculative execution disabled
> if it causes grief again. If other problems arise, then we can just revert
> back to 0.11.2 which seems to be stable in terms of all the Nutch
> operations.

I agree there may be subtle bugs.

I can do, say, a full dmoz crawl (~5M pages) with Nutch trunk and Hadoop
12.1 on a small cluster of 5 machines, if this would help.  We have already
done some crawls > 100K urls with 11.2 without problems.  I say let's test
it, and if there aren't any significant issues then let's go with 12.1 if
the Hadoop team thinks it will be more stable.

One question though, are there any concerns about upgrading clusters as
opposed to new fetches?

Dennis Kubes


Re: Hadoop 0.11.2 vs. 0.12.1

Andrzej Białecki-2
Dennis Kubes wrote:
> I agree there may be subtle bugs.
>
> I can do say a full dmoz crawl (~5M pages) with nutch trunk and hadoop
> 12.1 on a small cluster of 5 machines if this would help?  We have already
>  

Certainly, that would be most welcome.


> done some crawls > 100K urls with 11.2 without problems.  I say let's test
> it and if there aren't any significant issues then let's go with 12.1 if
> the hadoop team thinks it will be more stable.
>  

0.12.1 is not out the door yet. I can create a patch that uses the
latest Hadoop trunk binaries, so that we could test it.


> One question though, are there any concerns about upgrading clusters as
> opposed to new fetches?
>  

Theoretically, there shouldn't be, but this is an uncharted area ...
until someone tries it we won't know for sure. :-/

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes


Andrzej Bialecki wrote:
> Dennis Kubes wrote:
>> I agree there may be subtle bugs.
>>
>> I can do say a full dmoz crawl (~5M pages) with nutch trunk and hadoop
>> 12.1 on a small cluster of 5 machines if this would help?  We have
>> already
>>  
>
> Certainly, that would be most welcome.

I will start that up today.

>
>
>> done some crawls > 100K urls with 11.2 without problems.  I say let's
>> test
>> it and if there aren't any significant issues then let's go with 12.1 if
>> the hadoop team thinks it will be more stable.
>>  
>
> 0.12.1 is not out the door yet. I can create a patch that uses the
> latest Hadoop trunk binaries, so that we could test it.

I can just pull it down from source.  Let me know if that isn't what we
want.

Re: Hadoop 0.11.2 vs. 0.12.1

Andrzej Białecki-2
Dennis Kubes wrote:

>
>
> Andrzej Bialecki wrote:
>> Dennis Kubes wrote:
>>> I agree there may be subtle bugs.
>>>
>>> I can do say a full dmoz crawl (~5M pages) with nutch trunk and hadoop
>>> 12.1 on a small cluster of 5 machines if this would help?  We have
>>> already
>>>  
>>
>> Certainly, that would be most welcome.
>
> I will start that up today.

Thanks!

>>
>> 0.12.1 is not out the door yet. I can create a patch that uses the
>> latest Hadoop trunk binaries, so that we could test it.
>
> I can just pull it down from source.  Let me know if that isn't what
> we want.

Great, please do.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes
The crawl for 1M pages completed successfully.  There was an issue with
doing a copyToLocal, but that has already been filed as a HADOOP bug and
the patch will be included in 0.12.x.

Statistics for CrawlDb: crawldb
TOTAL urls:         10839170
retry 0:            10816148
retry 1:            23022

min score:          0.0090
avg score:          0.173
max score:          2119.167

status 1 (db_unfetched):        9899275
status 2 (db_fetched):          667354
status 3 (db_gone):             11195
status 4 (db_redir_temp):       219507
status 5 (db_redir_perm):       41839

Dennis Kubes
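
As a quick sanity check on the numbers above, the sketch below confirms that the per-status and per-retry counts each add up to the reported total, and works out what share of the CrawlDb each status represents (roughly 6% fetched, 91% unfetched). The figures are copied from the statistics in this message; the dictionary layout and function names are only illustrative.

```python
# Counts copied from the CrawlDb statistics posted above.
TOTAL_URLS = 10839170

RETRY_COUNTS = {0: 10816148, 1: 23022}

STATUS_COUNTS = {
    "db_unfetched": 9899275,
    "db_fetched": 667354,
    "db_gone": 11195,
    "db_redir_temp": 219507,
    "db_redir_perm": 41839,
}


def is_consistent(total, *breakdowns):
    """Return True if every breakdown sums exactly to the total."""
    return all(sum(b.values()) == total for b in breakdowns)


def shares(counts, total):
    """Map each key to its fraction of the total."""
    return {key: value / total for key, value in counts.items()}


if __name__ == "__main__":
    print("consistent:", is_consistent(TOTAL_URLS, RETRY_COUNTS, STATUS_COUNTS))
    for status, share in shares(STATUS_COUNTS, TOTAL_URLS).items():
        print(f"{status:15s} {share:6.2%}")
```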


Re: Hadoop 0.11.2 vs. 0.12.1

Marc Boucher-4
Dennis,

I'm curious what kind of hardware your 5 system cluster uses? CPU, RAM, HD
etc.

And I was wondering if anyone has tested a cluster using servers with
Intel's Quad Core Xeon X3210 processors? If so what type of performance
boost have you noticed over a Dual Core system?

Thanks
Marc Boucher, aTerra
--
Personal Blog: http://www.nano2sol.com


Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes


Marc Boucher wrote:
> Dennis,
>
> I'm curious what kind of hardware your 5 system cluster uses? CPU, RAM, HD
> etc.
>

This is a small development cluster that we use.  It has 1 master and 4
slaves.  All are Core 2 Duo 2.4 GHz with 4 GB RAM and 2x500 GB SATA hard
drives on Intel boards, running a stripped-down version of Fedora Core 6.

Our production system consists of 1U Supermicro servers with the same
specs, except running a single 750 GB SATA drive.


> And I was wondering if anyone has tested a cluster using servers with
> Intel's Quad Core Xeon X3210 processors? If so what type of performance
> boost have you noticed over a Dual Core system?

We are in the initial stages of experimenting with a 1U Supermicro
system that can hold > 1 TB of space and 16 cores (2 boards with dual
quad-core Xeons) and requires a 900 W power supply.  A monster box in a
very small case.  I don't have any benchmarks as of yet, but I will keep
the list informed of our progress.

Dennis Kubes


Re: Hadoop 0.11.2 vs. 0.12.1

Marc Boucher-4
Dennis,

Thanks for the info. I'm in the process of setting up a small cluster myself
with 1 master and 4 slaves all running the Quad Core but with half as much
RAM. Does doubling the RAM to 4GB make much of a difference?

Thanks
Marc

--
Personal Blog: http://www.nano2sol.com

Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes


Marc Boucher wrote:
> Dennis,
>
> Thanks for the info. I'm in the process of setting up a small cluster
> myself
> with 1 master and 4 slaves all running the Quad Core but with half as much
> RAM. Does doubling the RAM to 4GB make much of a difference?

We have some jobs that take a lot of RAM, so I have the childopts set to
1024M.  For standard fetching I don't know how much difference it would
make.

Dennis Kubes
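
To put the 2 GB versus 4 GB question in rough numbers, here is a back-of-the-envelope sketch of how many 1024M child JVMs fit on a node. It assumes the "childopts" above refers to the `mapred.child.java.opts` property with a value like `-Xmx1024m`, and it reserves a hypothetical 1024 MB for the OS and the Hadoop daemons; neither figure comes from this thread beyond the 1024M heap itself. A JVM also uses memory beyond its heap, so treat the result as an upper bound.

```python
# Assumed value of mapred.child.java.opts; only the 1024M heap is from the thread.
CHILD_OPTS = "-Xmx1024m"
CHILD_HEAP_MB = 1024

# Hypothetical allowance for the OS plus the Hadoop daemons on each node.
RESERVED_MB = 1024


def max_child_tasks(node_ram_mb, child_heap_mb=CHILD_HEAP_MB, reserved_mb=RESERVED_MB):
    """Upper bound on concurrent child JVMs whose heaps fit in physical RAM."""
    usable_mb = node_ram_mb - reserved_mb
    return max(usable_mb // child_heap_mb, 0)


if __name__ == "__main__":
    for ram_mb in (2048, 4096):
        print(f"{ram_mb} MB node: up to {max_child_tasks(ram_mb)} child task(s)")
```

Under these assumptions a 2 GB node has room for one such task at a time and a 4 GB node for three, which matters mainly for the memory-hungry jobs rather than for plain fetching.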


Re: Hadoop 0.11.2 vs. 0.12.1

Andrzej Białecki-2
In reply to this post by Dennis Kubes
Dennis Kubes wrote:
> The crawl for 1M pages completed successfully.  There was an issue
> with doing a copyToLocal but that has already been filed as a HADOOP
> bug and the patch will be included in 0.12.x
>


That's very good news, Dennis - thanks for taking the time to do this,
your test gives me more confidence in 0.12.1 than I could muster reading
the reports on the Hadoop mailing list .. ;)

Could you perhaps create a JIRA issue and attach the patches from the
current trunk/ to your 0.12.1-based version? As soon as 0.12.1 is out
the door we can upgrade, and then finally wrap up our release.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes


Andrzej Bialecki wrote:

> Dennis Kubes wrote:
>> The crawl for 1M pages completed successfully.  There was an issue
>> with doing a copyToLocal but that has already been filed as a HADOOP
>> bug and the patch will be included in 0.12.x
>>
>
>
> That's very good news, Dennis - thanks for taking the time to do this,
> your test gives me more confidence in 0.12.1 than I could muster reading
> the reports on the Hadoop mailing list .. ;)

I agree with your assessment on the Hadoop list that a little more time
to stabilize 0.12.1 would be a good thing.  Running this test made me
feel a little better too.

>
> Could you perhaps create a JIRA issue and attach the patches from the
> current trunk/ to your 0.12.1-based version? As soon as 0.12.1 is out
> the door we can upgrade, and then finally wrap up our release.

Do you want me to create a JIRA issue and attach all patches from the
current trunk?  I just need a little more guidance on what I need to do.

Dennis Kubes

Re: Hadoop 0.11.2 vs. 0.12.1

Andrzej Białecki-2
Dennis Kubes wrote:
>>
>> Could you perhaps create a JIRA issue and attach the patches from the
>> current trunk/ to your 0.12.1-based version? As soon as 0.12.1 is out
>> the door we can upgrade, and then finally wrap up our release.
>
> Do you want me to create a JIRA issue and attach all patches from the
> current trunk.  I just need a little more guidance on what I need to do.

Exactly. I.e. do an svn diff from your current working copy, and attach
to this issue - then at least when we bring in 0.12.1 we will know which
parts need changing, even if the patch itself doesn't apply super cleanly.
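
The step described above can be sketched as a small script that runs `svn diff` on a working copy and captures the output for attaching to the issue. The working-copy path and patch file name in the usage comment are placeholders, and the `run` parameter exists only so the function can be exercised without Subversion installed.

```python
import subprocess
from pathlib import Path


def svn_diff_command(working_copy):
    """Command line that prints local modifications as a unified diff."""
    return ["svn", "diff", str(working_copy)]


def capture_diff(working_copy, run=subprocess.run):
    """Run svn diff and return its output as text; raises if svn fails."""
    result = run(
        svn_diff_command(working_copy),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def write_patch(working_copy, patch_path, run=subprocess.run):
    """Save the diff to patch_path and return the number of lines written."""
    diff_text = capture_diff(working_copy, run=run)
    Path(patch_path).write_text(diff_text)
    return len(diff_text.splitlines())


# Example usage (placeholder paths, requires the svn client on PATH):
#   write_patch("nutch-trunk", "hadoop-0.12.1-upgrade.patch")
```

An empty patch file simply means the working copy has no local changes.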

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Hadoop 0.11.2 vs. 0.12.1

Dennis Kubes


Andrzej Bialecki wrote:

> Dennis Kubes wrote:
>>>
>>> Could you perhaps create a JIRA issue and attach the patches from the
>>> current trunk/ to your 0.12.1-based version? As soon as 0.12.1 is out
>>> the door we can upgrade, and then finally wrap up our release.
>>
>> Do you want me to create a JIRA issue and attach all patches from the
>> current trunk.  I just need a little more guidance on what I need to do.
>
> Exactly. I.e. do an svn diff from your current working copy, and attach
> to this issue - then at least when we bring in 0.12.1 we will know which
> parts need changing, even if the patch itself doesn't apply super cleanly.
>

I was just a little confused because, at least from the IDE
perspective, the hadoop-0.12.1-dev-core.jar didn't break anything.  I
have created NUTCH-459 and attached the new jar.  I am assuming that is
what we need.

Dennis Kubes

Re: Hadoop 0.11.2 vs. 0.12.1

Andrzej Białecki-2
Dennis Kubes wrote:
> I was just a little confused because, at least from the
> IDE perspective, the hadoop-0.12.1-dev-core.jar didn't break anything.
> I have created NUTCH-459 and attached the new jar.  I am assuming that
> is what we need.

Ah, I didn't know that :) yeah, in that case creating the JIRA issue
might seem pointless, indeed ... but it's there now, so if we discover
any changes that need to be made we can attach them to this issue.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com