It adds a new API, IOStatisticsSource, which lets any class act as a source of
a static or dynamic IOStatistics set of counters/gauges/min/max/mean stats.
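
To illustrate the idea, here is a minimal self-contained sketch. It does not use the interfaces in this patch: SketchStatsSource, CountingStream and StatsProbe are hypothetical stand-ins, the counter names are only illustrative, and only counters are modelled (no gauges/min/max/mean). It shows a stream that publishes its own statistics and a caller that probes any object for them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a statistics source; NOT the interface in the patch. */
interface SketchStatsSource {
  /** Snapshot of counter name to value at the time of the call. */
  Map<String, Long> counters();
}

/** Input stream wrapper which counts read() calls and bytes returned. */
final class CountingStream extends InputStream implements SketchStatsSource {
  private final InputStream in;
  private final AtomicLong reads = new AtomicLong();
  private final AtomicLong bytes = new AtomicLong();

  CountingStream(InputStream in) {
    this.in = in;
  }

  @Override
  public int read() throws IOException {
    int b = in.read();
    reads.incrementAndGet();
    if (b >= 0) {
      // -1 signals end of stream, so only count real bytes
      bytes.incrementAndGet();
    }
    return b;
  }

  @Override
  public Map<String, Long> counters() {
    // Illustrative counter names, chosen for this sketch only.
    Map<String, Long> m = new TreeMap<>();
    m.put("stream_read_operations", reads.get());
    m.put("stream_read_bytes", bytes.get());
    return m;
  }
}

final class StatsProbe {
  /** Return the counters of any object which is a source, else an empty map. */
  static Map<String, Long> retrieve(Object o) {
    return (o instanceof SketchStatsSource)
        ? ((SketchStatsSource) o).counters()
        : new TreeMap<>();
  }

  /** Read a three-byte stream to the end and return its statistics. */
  static Map<String, Long> demo() {
    try (CountingStream s =
             new CountingStream(new ByteArrayInputStream(new byte[] {1, 2, 3}))) {
      while (s.read() >= 0) {
        // drain the stream
      }
      return retrieve(s);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The point of the probe is that application code does not need to know the concrete stream class: it asks whatever object it was handed, and gets an empty result if that object does not publish statistics.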
The intent is to allow applications to collect statistics on streams,
iterators, and other classes they use to interact with filesystems/remote
stores, and so get detailed statistics on the number of operations, latencies etc.
There's support for logging these results, as well as for aggregating them.
This is how applications can aggregate results and then propagate them back
to the AM/job driver/query engine.
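
A rough sketch of what aggregation could look like, assuming each task reports plain maps of counters and maximums back to the driver. StatsAggregator and the statistic names are hypothetical and are not the classes in this patch; the merge rules (sum the counters, keep the largest maximum) are an assumption about what a sensible aggregate is.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical driver-side aggregator; NOT the implementation in the patch. */
final class StatsAggregator {
  private final Map<String, Long> counters = new TreeMap<>();
  private final Map<String, Long> maximums = new TreeMap<>();

  /** Merge one task's statistics: counters are summed, maximums keep the largest. */
  void aggregate(Map<String, Long> taskCounters, Map<String, Long> taskMaximums) {
    taskCounters.forEach((k, v) -> counters.merge(k, v, Long::sum));
    taskMaximums.forEach((k, v) -> maximums.merge(k, v, Long::max));
  }

  Map<String, Long> counters() {
    return Collections.unmodifiableMap(counters);
  }

  Map<String, Long> maximums() {
    return Collections.unmodifiableMap(maximums);
  }

  /** Single-line form suitable for a log message. */
  @Override
  public String toString() {
    return "counters=" + counters + "; maximums=" + maximums;
  }

  /** Aggregate the (made-up) results of two tasks. */
  static StatsAggregator demo() {
    Map<String, Long> c1 = new TreeMap<>();
    c1.put("get_requests", 10L);
    c1.put("get_retries", 1L);
    Map<String, Long> m1 = new TreeMap<>();
    m1.put("list_latency_ms", 120L);

    Map<String, Long> c2 = new TreeMap<>();
    c2.put("get_requests", 5L);
    c2.put("get_retries", 2L);
    Map<String, Long> m2 = new TreeMap<>();
    m2.put("list_latency_ms", 340L);

    StatsAggregator agg = new StatsAggregator();
    agg.aggregate(c1, m1);
    agg.aggregate(c2, m2);
    return agg;
  }
}
```

Each task would send its maps back with its results, and the AM/job driver/query engine would call aggregate() once per task and log the total at the end of the job.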
We already have PRs using this for S3A and ABFS on input streams, and in
S3A we also count LIST performance, which clients can pick up provided they
use the listStatusIterator, listFiles etc calls which return RemoteIterator.
I know it's a lot of code, but it's split into interface and
implementation. The public interface is for applications; the
implementation is what we are using internally, and we will tune it as
we adopt it more.
I have been working on this on and off for months, and yes it has grown.
But now that we are supporting more complex storage systems, the existing
tracking of long/short reads isn't informative enough. I want to know how
many GET requests failed and had to be retried, how often the DELETE calls
were throttled, and what the real latency of list operations is over
Please take a look. As a new API it's unlikely to cause any regressions;
the main things to worry about are "is that API the one applications can
use" and "has Steve got something fundamentally wrong in his implementation".