I want to measure the time spent reading from and writing to HDFS and feeding data to the mapper/reducer, as opposed to the actual map/reduce time, for the WordCount example. I have enabled HTrace with Zipkin, and I now have a large number of execution times for the underlying function calls (too many to post here).
How can I make sense of the traces I see in Zipkin to get the information I need? Which function calls mark the boundaries of the measurements I am after, i.e. the points just before and after the mapper and the reducer run?
I have an LXC-based cluster running Hadoop 2.6.0 (1 namenode + 3 datanodes).
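
To make the goal concrete, this is roughly the aggregation I would like to run once I know which spans belong to which phase. It is only a sketch: it assumes the spans can be exported as dictionaries with a `name` and a `duration` field, and the prefixes in `IO_PREFIXES` are placeholders I made up for illustration, not span names I have confirmed HTrace emits.

```python
from collections import defaultdict

# Hypothetical span-name prefixes marking HDFS I/O; the real names are
# exactly what I am asking about, so treat these as placeholders.
IO_PREFIXES = ("DFSInputStream", "DFSOutputStream")


def split_time(spans, io_prefixes=IO_PREFIXES):
    """Sum span durations into an 'io' bucket and an 'other' bucket.

    Each span is assumed to be a dict with 'name' and 'duration' keys.
    Nested spans are not de-duplicated, so a parent and its child are
    both counted.
    """
    totals = defaultdict(int)
    for span in spans:
        bucket = "io" if span["name"].startswith(io_prefixes) else "other"
        totals[bucket] += span["duration"]
    return dict(totals)


# Made-up sample data, only to show the shape of the input.
spans = [
    {"name": "DFSInputStream#read", "duration": 120},
    {"name": "map", "duration": 300},
    {"name": "DFSOutputStream#write", "duration": 80},
]
print(split_time(spans))
```

What I am missing is the right set of prefixes (or function calls) to put in place of the placeholders, and whether summing durations like this is valid given how the spans nest.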