getting job status

getting job status

Eugeny N Dzhurinsky
Hello there!

Could somebody please explain whether it is possible to get statistics for a
given job? For instance, how many data tuples have been processed so far, and
how many still need to be processed to complete the job?

This presumes the job knows the required statistics and can deliver them when
queried by Hadoop or something else.

--
Eugene N Dzhurinsky

Re: getting job status

Arun C Murthy-2
Eugeny N Dzhurinsky wrote:
> Hello there!
>
> Could somebody please explain is it possible to get some statistics for the
> certain job? For instance, get some numbers of how many data tuples were
> processed yet, and how many tuples needs to be processed to complete the job?
>

http://lucene.apache.org/hadoop/mapred_tutorial.html#Job+Control

Specifically:
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobClient.html#submitJob(org.apache.hadoop.mapred.JobConf)
and
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/RunningJob.html

Arun

> This presumes the job knows the required statistics and can deliver it when
> queried by Hadoop/something else.
>

Re: getting job status

Arun C Murthy-2
Arun C Murthy wrote:

> Eugeny N Dzhurinsky wrote:
>
>> Hello there!
>>
>> Could somebody please explain is it possible to get some statistics
>> for the
>> certain job? For instance, get some numbers of how many data tuples were
>> processed yet, and how many tuples needs to be processed to complete
>> the job?
>>
>
> http://lucene.apache.org/hadoop/mapred_tutorial.html#Job+Control
>
> Specifically:
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobClient.html#submitJob(org.apache.hadoop.mapred.JobConf)
>
> and
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/RunningJob.html 
>

To clarify: RunningJob gives you the average progress of the maps and of the
reduces (each separately), but not task-level details.

Arun
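
The job-level progress Arun describes could be polled roughly as follows. This is only a sketch: JobProgress is a hypothetical stand-in interface that mirrors the mapProgress(), reduceProgress() and isComplete() methods of org.apache.hadoop.mapred.RunningJob, so that the example compiles without Hadoop on the classpath. The helper names (fixed, describe, pollUntilComplete) are likewise made up for illustration. In real code you would pass the RunningJob returned by JobClient.submitJob instead.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.function.Consumer;

// Hypothetical stand-in for the three org.apache.hadoop.mapred.RunningJob
// methods used below, so the sketch compiles without Hadoop on the classpath.
interface JobProgress {
    float mapProgress() throws IOException;    // average map progress, 0.0 to 1.0
    float reduceProgress() throws IOException; // average reduce progress, 0.0 to 1.0
    boolean isComplete() throws IOException;
}

class JobProgressSketch {
    // Fixed-value implementation, used only to demonstrate the helpers.
    static JobProgress fixed(final float map, final float reduce, final boolean complete) {
        return new JobProgress() {
            public float mapProgress() { return map; }
            public float reduceProgress() { return reduce; }
            public boolean isComplete() { return complete; }
        };
    }

    // Formats the two job-level averages as whole percentages.
    static String describe(JobProgress job) throws IOException {
        return String.format(Locale.ROOT, "map %.0f%% reduce %.0f%%",
                job.mapProgress() * 100, job.reduceProgress() * 100);
    }

    // Reports progress until the job completes; returns the number of
    // polls made while the job was still running.
    static int pollUntilComplete(JobProgress job, long pauseMillis, Consumer<String> sink)
            throws IOException, InterruptedException {
        int polls = 0;
        while (!job.isComplete()) {
            sink.accept(describe(job));
            polls++;
            Thread.sleep(pauseMillis);
        }
        sink.accept(describe(job)); // report the final state once
        return polls;
    }

    public static void main(String[] args) throws Exception {
        pollUntilComplete(fixed(1.0f, 1.0f, true), 1000L, System.out::println);
    }
}
```

Note that this only yields two averages per job; as stated above, per-task details are not available this way.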

RE: getting job status

Devaraj Das
In reply to this post by Eugeny N Dzhurinsky
There are job counters that hold the record information by default. You
can submit a job and then get its counters. Have a look at
org.apache.hadoop.mapred.JobClient.runJob. The web UI for the job also
displays them.
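
Counters of this kind report how many records have been processed, not how many remain, so answering the second half of the original question needs a total from somewhere else. A minimal sketch of that arithmetic follows, assuming the processed count has already been read from the job's counters and that the caller knows (or can estimate) the total. The class, the method names and the sample numbers are all hypothetical.

```java
class RecordProgressSketch {
    // Records still to process; never negative, even if the counter
    // overshoots the estimated total.
    static long remaining(long processed, long expectedTotal) {
        return Math.max(0L, expectedTotal - processed);
    }

    // Fraction done, capped at 1.0; 0.0 when the total is unknown (<= 0).
    static double fractionDone(long processed, long expectedTotal) {
        if (expectedTotal <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) processed / expectedTotal);
    }

    public static void main(String[] args) {
        long processed = 2500L; // hypothetical value read from a record counter
        long total = 10000L;    // hypothetical total known to the caller
        System.out.println(remaining(processed, total) + " records left, "
                + Math.round(fractionDone(processed, total) * 100) + "% done");
    }
}
```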

> -----Original Message-----
> From: Eugeny N Dzhurinsky [mailto:[hidden email]]
> Sent: Monday, November 19, 2007 3:34 PM
> To: [hidden email]
> Subject: getting job status
>
> Hello there!
>
> Could somebody please explain is it possible to get some
> statistics for the certain job? For instance, get some
> numbers of how many data tuples were processed yet, and how
> many tuples needs to be processed to complete the job?
>
> This presumes the job knows the required statistics and can
> deliver it when queried by Hadoop/something else.
>
> --
> Eugene N Dzhurinsky
>