extremely slow reduce jobs

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

extremely slow reduce jobs

Joydeep Sen Sarma
I have a fairly simple job with a map, a local combiner and a reduce.
The combiner and the reduce do the equivalent of a group_concat (mysql).
 
 
I have horrible performance in the reduce stage:
- the map jobs are done
- all the reduce jobs claim they are copying data - but the copy rate is
abysmal (0.5MBps)
  - checked the network topology - everything's on GigE and on same
switch. (80 machine cluster)
  - seeing 50+ MBps bandwidth between any pair using scp
- when I look at the machines where reduce is running - vmstat says 0%
cpu util.
 
A sample reducetask log is below. Job conf: 64 way reduce. I specified
the map tasks to the same number - but hadoop is anyway creating 386 map
tasks.
 
Anyone has some quick hints on what could be going wrong?
 
Thanks,
 
Joydeep
 
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)

 

Reply | Threaded
Open this post in threaded view
|

Re: extremely slow reduce jobs

Nguyen Manh Tien
I met the save problem:
 -but the copy rate is only (0.1MBps)
 -10MBps bandwidth between using scp

May be the bandwidth affect the copy rate.
Can any one help us.

2007/8/4, Joydeep Sen Sarma <[hidden email]>:

>
> I have a fairly simple job with a map, a local combiner and a reduce.
> The combiner and the reduce do the equivalent of a group_concat (mysql).
>
>
> I have horrible performance in the reduce stage:
> - the map jobs are done
> - all the reduce jobs claim they are copying data - but the copy rate is
> abysmal (0.5MBps)
>   - checked the network topology - everything's on GigE and on same
> switch. (80 machine cluster)
>   - seeing 50+ MBps bandwidth between any pair using scp
> - when I look at the machines where reduce is running - vmstat says 0%
> cpu util.
>
> A sample reducetask log is below. Job conf: 64 way reduce. I specified
> the map tasks to the same number - but hadoop is anyway creating 386 map
> tasks.
>
> Anyone has some quick hints on what could be going wrong?
>
> Thanks,
>
> Joydeep
>
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
> outputs from previous failures
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
> 2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
> dup hosts)
> 2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
> task_0169_r_000010_0 Need 1 map output(s)
>
>
>
>
Reply | Threaded
Open this post in threaded view
|

RE: extremely slow reduce jobs

Stu Hood-2
In reply to this post by Joydeep Sen Sarma


Hi,

I have noticed the same thing. My cluster is only 2 machines, but I've noticed "Scheduled 0 of 1 known outputs (1 slow hosts and 0 dup hosts)" over and over in userlog/task*_r_*/syslog on the namenode machine.

I just found this thread which explains the problem:
[http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/%3C03af01c75b01$0c3e1a80$2301a8c0@...%3E] http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/%3C03af01c75b01$0c3e1a80$2301a8c0@...%3E

Thanks,
Stu


-----Original Message-----
From: Joydeep Sen Sarma <[hidden email]>
Sent: Fri, August 3, 2007 3:17 pm
To: [hidden email]
Subject: extremely slow reduce jobs

I have a fairly simple job with a map, a local combiner and a reduce.
The combiner and the reduce do the equivalent of a group_concat (mysql).
 
 
I have horrible performance in the reduce stage:
- the map jobs are done
- all the reduce jobs claim they are copying data - but the copy rate is
abysmal (0.5MBps)
  - checked the network topology - everything's on GigE and on same
switch. (80 machine cluster)
  - seeing 50+ MBps bandwidth between any pair using scp
- when I look at the machines where reduce is running - vmstat says 0%
cpu util.
 
A sample reducetask log is below. Job conf: 64 way reduce. I specified
the map tasks to the same number - but hadoop is anyway creating 386 map
tasks.
 
Anyone has some quick hints on what could be going wrong?
 
Thanks,
 
Joydeep
 
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)

 

Reply | Threaded
Open this post in threaded view
|

RE: extremely slow reduce jobs

Joydeep Sen Sarma
Thanks! This makes a lot of sense - I do see a lot of 'slow' hosts in
the logs.

Is it normal for hosts to be classified as 'slow'? (Given that there's
ample bandwidth and no errors - I don't know why a peer be classified as
such in the first place).

-----Original Message-----
From: Stu Hood [mailto:[hidden email]]
Sent: Tuesday, August 07, 2007 2:58 PM
To: [hidden email]
Subject: RE: extremely slow reduce jobs



Hi,

I have noticed the same thing. My cluster is only 2 machines, but I've
noticed "Scheduled 0 of 1 known outputs (1 slow hosts and 0 dup hosts)"
over and over in userlog/task*_r_*/syslog on the namenode machine.

I just found this thread which explains the problem:
[http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/
%3C03af01c75b01$0c3e1a80$[hidden email]%3E]
<a href="http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/%">http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/%
3C03af01c75b01$0c3e1a80$[hidden email]%3E

Thanks,
Stu


-----Original Message-----
From: Joydeep Sen Sarma <[hidden email]>
Sent: Fri, August 3, 2007 3:17 pm
To: [hidden email]
Subject: extremely slow reduce jobs

I have a fairly simple job with a map, a local combiner and a reduce.
The combiner and the reduce do the equivalent of a group_concat (mysql).
 
 
I have horrible performance in the reduce stage:
- the map jobs are done
- all the reduce jobs claim they are copying data - but the copy rate is
abysmal (0.5MBps)
  - checked the network topology - everything's on GigE and on same
switch. (80 machine cluster)
  - seeing 50+ MBps bandwidth between any pair using scp
- when I look at the machines where reduce is running - vmstat says 0%
cpu util.
 
A sample reducetask log is below. Job conf: 64 way reduce. I specified
the map tasks to the same number - but hadoop is anyway creating 386 map
tasks.
 
Anyone has some quick hints on what could be going wrong?
 
Thanks,
 
Joydeep
 
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)

 

Reply | Threaded
Open this post in threaded view
|

RE: extremely slow reduce jobs

Stu Hood-2
In reply to this post by Joydeep Sen Sarma


Actually, it turns out that the reason one of my datanodes was "slow" was because it had not responded at all. The datanode had failed to start because of a namespace mismatch (since I've been running on hadoop-trunk). After reformatting the datanode the slow host messages have disappeared.

I still have another issue that I need to resolve, but I'll send a separate subject to the list about it.

Stu


-----Original Message-----
From: Joydeep Sen Sarma <[hidden email]>
Sent: Tue, August 7, 2007 8:41 pm
To: [hidden email], [hidden email]
Subject: RE: extremely slow reduce jobs

Thanks! This makes a lot of sense - I do see a lot of 'slow' hosts in
the logs.

Is it normal for hosts to be classified as 'slow'? (Given that there's
ample bandwidth and no errors - I don't know why a peer be classified as
such in the first place).

-----Original Message-----
From: Stu Hood [mailto:[hidden email]]
Sent: Tuesday, August 07, 2007 2:58 PM
To: [hidden email]
Subject: RE: extremely slow reduce jobs



Hi,

I have noticed the same thing. My cluster is only 2 machines, but I've
noticed "Scheduled 0 of 1 known outputs (1 slow hosts and 0 dup hosts)"
over and over in userlog/task*_r_*/syslog on the namenode machine.

I just found this thread which explains the problem:
[http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/
%3C03af01c75b01$0c3e1a80$[hidden email]%3E]
<a href="http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/%">http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200702.mbox/%
3C03af01c75b01$0c3e1a80$[hidden email]%3E

Thanks,
Stu


-----Original Message-----
From: Joydeep Sen Sarma
Sent: Fri, August 3, 2007 3:17 pm
To: [hidden email]
Subject: extremely slow reduce jobs

I have a fairly simple job with a map, a local combiner and a reduce.
The combiner and the reduce do the equivalent of a group_concat (mysql).
 
 
I have horrible performance in the reduce stage:
- the map jobs are done
- all the reduce jobs claim they are copying data - but the copy rate is
abysmal (0.5MBps)
  - checked the network topology - everything's on GigE and on same
switch. (80 machine cluster)
  - seeing 50+ MBps bandwidth between any pair using scp
- when I look at the machines where reduce is running - vmstat says 0%
cpu util.
 
A sample reducetask log is below. Job conf: 64 way reduce. I specified
the map tasks to the same number - but hadoop is anyway creating 386 map
tasks.
 
Anyone has some quick hints on what could be going wrong?
 
Thanks,
 
Joydeep
 
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
outputs from previous failures
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
dup hosts)
2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
task_0169_r_000010_0 Need 1 map output(s)

 

Reply | Threaded
Open this post in threaded view
|

Re: extremely slow reduce jobs

Arun C Murthy-2
In reply to this post by Joydeep Sen Sarma
On Fri, Aug 03, 2007 at 12:17:37PM -0700, Joydeep Sen Sarma wrote:

>I have a fairly simple job with a map, a local combiner and a reduce.
>The combiner and the reduce do the equivalent of a group_concat (mysql).
>
>
>I have horrible performance in the reduce stage:
>- the map jobs are done
>- all the reduce jobs claim they are copying data - but the copy rate is
>abysmal (0.5MBps)
>  - checked the network topology - everything's on GigE and on same
>switch. (80 machine cluster)
>  - seeing 50+ MBps bandwidth between any pair using scp
>- when I look at the machines where reduce is running - vmstat says 0%
>cpu util.
>
>A sample reducetask log is below. Job conf: 64 way reduce. I specified
>the map tasks to the same number - but hadoop is anyway creating 386 map
>tasks.
>

The no. of maps is only a hint to the JobTracker, to truly control the no. of maps you need to write your own input-split:
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/InputSplit.html

>Anyone has some quick hints on what could be going wrong?
>

Couple of things: I have no silver bullet, but the *slow hosts* is one clue: there were a couple of failures when trying to fetch map-outputs; do you see any exceptions in your reduce task's syslog? (in logs/userlogs/${reduce_taskid}/syslog/part-*)

Pertinent piece of information: there are some bugs (upto and including 0.14.0 release) w.r.t fetch-failures leading to hung reduces. Please look at http://issues.apache.org/jira/browse/HADOOP-1158 for more details...

Hope that helps, apologies for the late response.

Arun

>Thanks,
>
>Joydeep
>
>2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
>2007-08-03 12:06:54,408 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
>dup hosts)
>2007-08-03 12:06:59,409 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Need 1 map output(s)
>2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
>outputs from previous failures
>2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
>2007-08-03 12:06:59,410 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
>dup hosts)
>2007-08-03 12:07:04,411 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Need 1 map output(s)
>2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
>outputs from previous failures
>2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
>2007-08-03 12:07:04,412 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
>dup hosts)
>2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Need 1 map output(s)
>2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
>outputs from previous failures
>2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
>2007-08-03 12:07:09,413 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
>dup hosts)
>2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Need 1 map output(s)
>2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
>outputs from previous failures
>2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
>2007-08-03 12:07:14,415 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
>dup hosts)
>2007-08-03 12:07:19,417 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Need 1 map output(s)
>2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 0 new map outputs from tasktracker and 0 map
>outputs from previous failures
>2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Got 2 known map output location(s); scheduling...
>2007-08-03 12:07:19,418 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Scheduled 0 of 2 known outputs (2 slow hosts and 0
>dup hosts)
>2007-08-03 12:07:24,419 INFO org.apache.hadoop.mapred.ReduceTask:
>task_0169_r_000010_0 Need 1 map output(s)
>
>
>