Hadoop 0.15.0 - Reporter issue w/ timing out

Derek Gottfrid-2
I have a very simple map task that gets a filename, reads and
decompresses that file, and stores the decompressed file away.
As it is reading and writing, it calls Reporter.incrCounter with the
number of bytes decompressed and written. It works, and I can see the
counter chugging along, but then, even though the process is still
calling incrCounter, the counter stops updating and the task gets timed
out and killed. This used to work but doesn't in 0.15.0. Interestingly,
if I add a reporter.setStatus() call in there, the task stays alive and
doesn't get killed.

Is this a bug? Am I doing something wrong?

thanks,
derek

RE: Hadoop 0.15.0 - Reporter issue w/ timing out

Devaraj Das
There has been a change in the way progress reporting is done since
0.14. The application now has to send the status explicitly
(incrCounter doesn't send any status). Even if the application hasn't made
any progress, it is okay to call setStatus with the earlier status.
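
A minimal sketch of that pattern, for illustration only. To stay self-contained it does not use Hadoop classes: StatusReporter is a hypothetical stand-in that mirrors just the two calls discussed here (incrCounter and setStatus), and the counter name, status text, and buffer size are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StatusLoopSketch {
    /** Hypothetical stand-in for the two reporter calls discussed in the thread. */
    interface StatusReporter {
        void incrCounter(String counter, long amount);
        void setStatus(String status);
    }

    /** Copies in to out; after each buffer it bumps a counter AND resends a status. */
    static long copyWithStatus(InputStream in, OutputStream out,
                               StatusReporter reporter, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
            // Per the explanation above, the counter alone does not send a status...
            reporter.incrCounter("BYTES_WRITTEN", n);
            // ...so the status is sent explicitly alongside it.
            reporter.setStatus("written " + total + " bytes");
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Reporter that just prints what it receives.
        StatusReporter console = new StatusReporter() {
            @Override public void incrCounter(String counter, long amount) {
                System.out.println("counter " + counter + " += " + amount);
            }
            @Override public void setStatus(String status) {
                System.out.println("status: " + status);
            }
        };
        InputStream in = new ByteArrayInputStream(new byte[10]);
        OutputStream out = new ByteArrayOutputStream();
        copyWithStatus(in, out, console, 4);
    }
}
```

In a real task the status call could be throttled (for example, once every few megabytes) rather than sent after every buffer.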


Re: Hadoop 0.15.0 - Reporter issue w/ timing out

Doug Cutting
Devaraj Das wrote:
> There has been a change in the way progress reporting is
> done since 0.14. The application has to explicitly send the status
> (incrCounter doesn't send any status). Even if the application hasn't made
> any progress, it is okay to call setStatus with the earlier status.

Should we consider this a bug?  Incrementing a counter is a sign of
activity that should perhaps count towards keeping a task alive.
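
Until the framework behaves that way, an application could approximate it on its own side. The sketch below is an assumption-laden illustration, not Hadoop code: StatusReporter is a hypothetical stand-in for the reporter, and ResendingReporter is a made-up wrapper that resends the last status whenever a counter is incremented, relying on the point above that calling setStatus with the earlier status is okay.

```java
import java.util.ArrayList;
import java.util.List;

class CounterAsActivitySketch {
    /** Hypothetical stand-in for the two reporter calls discussed in the thread. */
    interface StatusReporter {
        void incrCounter(String counter, long amount);
        void setStatus(String status);
    }

    /** Wrapper (hypothetical): every counter increment also resends the last status. */
    static final class ResendingReporter implements StatusReporter {
        private final StatusReporter delegate;
        private String lastStatus = "";

        ResendingReporter(StatusReporter delegate) {
            this.delegate = delegate;
        }

        @Override
        public void incrCounter(String counter, long amount) {
            delegate.incrCounter(counter, amount);
            // Treat the increment as a sign of activity: resend the earlier status.
            delegate.setStatus(lastStatus);
        }

        @Override
        public void setStatus(String status) {
            lastStatus = status;
            delegate.setStatus(status);
        }
    }

    public static void main(String[] args) {
        // Underlying reporter that records every call it receives.
        final List<String> log = new ArrayList<>();
        StatusReporter recording = new StatusReporter() {
            @Override public void incrCounter(String counter, long amount) {
                log.add("incrCounter(" + counter + ", " + amount + ")");
            }
            @Override public void setStatus(String status) {
                log.add("setStatus(" + status + ")");
            }
        };
        StatusReporter reporter = new ResendingReporter(recording);
        reporter.setStatus("decompressing");
        reporter.incrCounter("BYTES_WRITTEN", 4096);
        for (String entry : log) {
            System.out.println(entry);
        }
    }
}
```

The trade-off is that a task stuck in a loop that still increments counters would never time out, which may be part of why the two signals were separated.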

Doug

RE: Hadoop 0.15.0 - Reporter issue w/ timing out

Joydeep Sen Sarma
Did anyone consider the impact of making such a change on existing
applications? I'm curious how it didn't fail any regression test, given
that the pattern reported as broken is so common.

(I suffer from upgradephobia and this doesn't help)


Re: Hadoop 0.15.0 - Reporter issue w/ timing out

Derek Gottfrid-2
In reply to this post by Doug Cutting
I favor considering this a bug. It is easy enough to rework my code
but it seems like odd behaviour.


RE: Hadoop 0.15.0 - Reporter issue w/ timing out

Devaraj Das
In reply to this post by Doug Cutting
Actually, in the previous approach, progress reporting used to happen from a
separate thread in tasks. The issues
https://issues.apache.org/jira/browse/HADOOP-1431 and
https://issues.apache.org/jira/browse/HADOOP-1462 changed this behavior.

But, yes, I agree that incrCounter should be indicative of progress.
