Re: [VOTE]: Support for RBF data locality Solution

Xiaoqiao He
Thanks everyone for discussing and voting on this issue.
In total, Approach A received 6 +1s (including my own +1).
To summarize the solution that was voted for:

   - Add an extra optional field for the client hostname to
   RpcHeader#RpcRequestHeaderProto.
   - The Router sets RpcRequestHeader#clientHostname when necessary.
   - The Namenode uses clientHostname when #getRemoteAddress is invoked, if
   RpcRequestHeader#clientHostname is set; otherwise it keeps the current logic.
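
As a rough illustration of the fallback in the last bullet, here is a minimal
Java sketch. It is not Hadoop code: RequestHeader, forwardFromRouter and
resolveClientAddress are hypothetical names invented for this example, and the
real change would live in the protobuf-generated RpcRequestHeaderProto and in
the IPC Server.

```java
import java.util.Optional;

public class ClientHostnameSketch {

    /** Hypothetical stand-in for the RPC request header with the new optional field. */
    static final class RequestHeader {
        private final Optional<String> clientHostname;

        RequestHeader(String clientHostname) {
            // A null or empty value is treated as "field not set".
            this.clientHostname =
                Optional.ofNullable(clientHostname).filter(s -> !s.isEmpty());
        }

        Optional<String> getClientHostname() {
            return clientHostname;
        }
    }

    /** Router side (assumed behavior): record the real client host before forwarding. */
    static RequestHeader forwardFromRouter(String realClientHost) {
        return new RequestHeader(realClientHost);
    }

    /**
     * Namenode side (assumed behavior): prefer the hostname carried in the
     * header, otherwise fall back to the peer address of the connection,
     * which is the current logic.
     */
    static String resolveClientAddress(RequestHeader header, String connectionPeer) {
        return header.getClientHostname().orElse(connectionPeer);
    }

    public static void main(String[] args) {
        // Request forwarded by a Router: the header carries the real client host.
        RequestHeader viaRouter = forwardFromRouter("client-host-1");
        System.out.println(resolveClientAddress(viaRouter, "router-host-1"));

        // Request sent directly by a client: the field is unset, so the
        // connection peer is used.
        RequestHeader direct = new RequestHeader(null);
        System.out.println(resolveClientAddress(direct, "client-host-2"));
    }
}
```

Because the field is optional, a client that never sets it takes the second
path, which matches the "otherwise keeps the current logic" case.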

I will create a new issue to push this feature forward.
Thanks all again.

On Fri, Apr 12, 2019 at 7:31 PM Vinayakumar B <[hidden email]> wrote:

> +1 for approach A.
>
> On Thu, 11 Apr 2019, 12:23 pm Akira Ajisaka, <[hidden email]> wrote:
>
>> Approach A looks good to me.
>>
>> Thanks,
>> Akira
>>
>> On Thu, Apr 11, 2019 at 2:30 PM Xiaoqiao He <[hidden email]> wrote:
>> >
>> > Hi folks,
>> >
>> > The current implementation of RBF is not aware of data locality,
>> > since the NameNode cannot get the real client hostname by invoking
>> > Server#getRemoteAddress when an RPC request is forwarded by the Router
>> > to the NameNode. This leads to several problems, for instance:
>> >
>> >    - a. The client may have to do a remote read instead of a local read,
>> >    so Short-Circuit cannot be used in most cases.
>> >    - b. The block placement policy cannot run as expected based on the
>> >    defined rack awareness, so local-node writes are lost.
>> >
>> > Several solutions to the data locality issue came out of the
>> > discussion. Some of them change the RPC protocol, so we look forward to
>> > further suggestions and votes. HDFS-13248 is tracking the issue.
>> >
>> >    - Approach A: Change the IPC/RPC layer protocol
>> >    (IpcConnectionContextProto or RpcHeader#RpcRequestHeaderProto) and add
>> >    an extra field for the client hostname. The new field is optional; in
>> >    general it is only set by the Router and parsed by the Namenode. This
>> >    approach is compatible, and the Client needs no changes.
>> >    - Approach B: Change ClientProtocol and add extra
>> >    create/append/getBlockLocations interfaces with an additional
>> >    parameter for the client hostname. As in Approach A, it is set by the
>> >    Router and parsed by the Namenode, and it is also compatible.
>> >    - Approach C: Solve write and read locality separately on top of the
>> >    current interfaces, with no changes. For writes, pass the client
>> >    hostname as one of the favored nodes for addBlocks; for reads, reorder
>> >    the targets at the Router after the Namenode returns the result to
>> >    the Router.
>> >
>> > As discussed and evaluated in HDFS-13248, we prefer to change the
>> > IPC/RPC layer protocol to support RBF data locality. We welcome more
>> > suggestions, votes, or any other feedback to push this feature forward.
>> > Thanks.
>> >
>> > Best Regards,
>> > Hexiaoqiao
>> >
>> > References
>> > [1] https://issues.apache.org/jira/browse/HDFS-13248
>> > [2] https://issues.apache.org/jira/browse/HDFS-10467
>> > [3] https://issues.apache.org/jira/browse/HDFS-12615