Commented: (HADOOP-492) Global counters

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475172 ]

David Bowen commented on HADOOP-492:
------------------------------------


Thanks Andrzej and Doug for your comments.  There is one vote for changing the class name to Counters, and none against, so unless anyone else wants to argue about it, I will switch to Counters.

Re Doug's comments:

1. Reporter: I was thinking only about source compatibility, which is actually improved by making this an abstract class so that code like this:

Reporter reporter = new Reporter() {
    // definitions of abstract methods
};

will still work because the new method (incrCounter(String name)) is not abstract.  However, that matters little: users have no reason to implement Reporter, so they are not affected.  The downside of the change is that it breaks binary compatibility, so users would need to recompile their applications, and there isn't a good enough reason to require that.  So I will change it back to an interface.

2. I will remove the method that I commented out.
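
To illustrate the source-compatibility point in item 1, here is a minimal sketch. It assumes a simplified stand-in for Reporter with a single abstract method (setStatus is used here only for illustration); it is not the actual Hadoop class, and ReporterDemo and legacyReporter are hypothetical names.

```java
// Simplified stand-in for the Reporter discussed above; NOT the real Hadoop class.
abstract class Reporter {
    // Pre-existing abstract method (signature assumed for illustration).
    public abstract void setStatus(String status);

    // Newly added method. Because it has a body, existing subclasses
    // compile unchanged and simply inherit this no-op.
    public void incrCounter(String name) {
        // no-op by default
    }
}

class ReporterDemo {
    // Code written before incrCounter existed: it defines only the abstract method.
    static Reporter legacyReporter(final StringBuilder log) {
        return new Reporter() {
            @Override
            public void setStatus(String status) {
                log.append(status);
            }
        };
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Reporter reporter = legacyReporter(log);
        reporter.setStatus("running");
        reporter.incrCounter("records"); // inherited no-op, no override needed
        System.out.println(log);
    }
}
```

If Reporter were instead an interface with incrCounter declared on it, the anonymous class above would no longer compile until it implemented the new method.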


> Global counters
> ---------------
>
>                 Key: HADOOP-492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-492
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: arkady borkovsky
>         Assigned To: David Bowen
>         Attachments: counters1.patch
>
>
> It would be nice to have map / reduce job keep aggregated counts for arbitrary events occurring in its tasks -- the number of records processed, the number of exceptions of a specific type, the number of sentences in passive voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending <name, value> pairs to the jobtracker (in some implementations such messages are piggy-backed on the heartbeats), so that the job tracker stores all the latest values from each task and aggregates them on request.  It should also make the aggregated values available at the job end.  The value for a task would be flushed when the task fails.
> #491 and #490 may be related to this one.
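
The reporting scheme in the quoted description could be sketched roughly as below. This is only an illustration under stated assumptions: CounterAggregator and its methods are hypothetical names rather than Hadoop's API, each task is assumed to report cumulative values, and "flushed" is read as "discarded" when a task fails.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical job-tracker-side store; names are illustrative, not Hadoop's API.
class CounterAggregator {
    // taskId -> (counter name -> latest value reported by that task)
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    // Called when a task reports (e.g. with a heartbeat).
    // Assumes values are cumulative, so the new report replaces the old one.
    public synchronized void report(String taskId, Map<String, Long> counters) {
        latestByTask.put(taskId, new HashMap<>(counters));
    }

    // Called when a task fails; its counts are dropped (our reading of "flushed").
    public synchronized void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    // Sums the latest values across all tasks, computed on request.
    public synchronized Map<String, Long> aggregate() {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> counters : latestByTask.values()) {
            for (Map.Entry<String, Long> e : counters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Storing only the latest per-task values keeps a failed task's contribution easy to remove, at the cost of re-summing on every request.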
