Introduce Apache Kerby to Hadoop

Introduce Apache Kerby to Hadoop

Zheng, Kai
Hi folks,

I'd like to bring Apache Kerby [1], a sub-project of the Apache Directory project, to the community's attention and propose introducing it to Hadoop.

Apache Kerby is a Kerberos-centric project that aims to provide the first Java Kerberos library with both client and server support. The relevant features include:
- Full Kerberos encryption types, aligned with both MIT KDC and MS AD;
- Client APIs for logging in via password, credential cache, keytab file, etc.;
- Utilities for generating, operating on, and inspecting keytab and credential cache files;
- A simple KDC server that borrows some ideas from Hadoop MiniKDC and can be used in tests, with minimal overhead in external dependencies;
- A brand-new token mechanism (experimental), with which a JWT token can be exchanged for a TGT or service ticket;
- Anonymous PKINIT support (experimental), making Kerby the first Java library to support this major Kerberos extension.

The project stands alone and depends only on the JRE, for ease of use. It has made its first release (1.0.0-RC1), and a second release (RC2) is upcoming.


As an initial step, this proposal suggests using Apache Kerby to upgrade the existing ApacheDS-related code for Kerberos support. The advantages:

1. The kerby-kerb library is all that is needed; it is pure Java, SLF4J is its only dependency, and the whole library is rather small;

2. The library includes a SimpleKDC for test usage, which borrows the MiniKDC idea and implements all the functionality existing in MiniKDC. We did a POC that rewrote MiniKDC using Kerby SimpleKDC, and it works fine;

3. Full Kerberos encryption types (many of which are not available in the JRE but are supported by major Kerberos vendors), plus more functionality such as credential cache support;

4. Perhaps the biggest concern: Hadoop MiniKDC and others depend on the old Kerberos implementation in the Directory Server project, but that implementation is no longer maintained. The Directory project plans to replace it with Kerby; MiniKDC can use Kerby directly to simplify the dependencies;

5. Extensively tested with all kinds of unit tests, and already in use for some time (e.g., at PSU), even in production environments;

6. Actively developed, so it can be fixed and released in time if necessary, separately and independently from other components in the Apache Directory project. By actively developing Apache Kerby and now applying it to Hadoop, we hope to make Kerberos deployment, troubleshooting, and further enhancement much easier.
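To make point 2 concrete, a minimal sketch of a MiniKDC-style test setup on top of Kerby's SimpleKdcServer. This is illustrative only: it assumes the kerby-kerb-simplekdc artifact on the classpath, and the method names (setKdcRealm, createPrincipal, exportPrincipal, etc.) should be checked against the Kerby release you actually use.

```java
import java.io.File;
import org.apache.kerby.kerberos.kerb.server.SimpleKdcServer;

public class KerbyMiniKdcSketch {
    public static void main(String[] args) throws Exception {
        // Stand up an in-process KDC for tests, much as MiniKDC does today.
        SimpleKdcServer kdc = new SimpleKdcServer();
        kdc.setKdcRealm("EXAMPLE.COM");
        kdc.setKdcHost("localhost");
        kdc.setAllowUdp(false);   // TCP only keeps the test setup simple
        kdc.init();
        kdc.start();

        // Provision a test principal and export its keytab for the client side.
        File keytab = new File("target/test.keytab");
        kdc.createPrincipal("hdfs/localhost@EXAMPLE.COM", "secret");
        kdc.exportPrincipal("hdfs/localhost@EXAMPLE.COM", keytab);

        // ... run the Kerberized test against this KDC's settings ...

        kdc.stop();
    }
}
```

This mirrors what the POC rewrite of MiniKDC on SimpleKDC looks like at its core: start, provision principals, hand out keytabs, stop.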



I hope this is a good beginning, and that eventually Apache Kerby can benefit other projects in the ecosystem as well.



This Kerberos-related work is actually a long-term effort led by Weihua Jiang at Intel, and it has been kindly encouraged by Andrew Purtell, Steve Loughran, Gangumalla Uma, Andrew Wang, and others; many thanks for their great discussions and input in the past.



Your feedback is very welcome. Thanks in advance.



[1] https://github.com/apache/directory-kerby



Regards,

Kai

Re: Introduce Apache Kerby to Hadoop

Steve Loughran-3


I've discussed this offline with Kai, as part of the "let's fix Kerberos" project. Not only is it a better Kerberos engine; we can also do more diagnostics, get better algorithms, and ultimately get better APIs for doing Kerberos and SASL. The latter would dramatically reduce the cost of wire-encrypting IPC.

For now, I'd like to see basic steps: upgrading MiniKDC to Kerby and seeing how it works.

Long term, I'd like Hadoop 3 to be Kerby-ized.



Re: Introduce Apache Kerby to Hadoop

larry mccay-2
Replacing MiniKDC with kerby certainly makes sense.

Kerby-izing Hadoop 3 needs to be defined carefully.
As much of a JWT proponent as I am, I don't know that taking up
non-standard features such as the JWT token would necessarily serve us well.
If we are talking about client-side-only uptake in Hadoop 3, as a better,
more diagnosable client library, that completely makes sense.

Better algorithms and APIs would require server-side compliance as well,
no?
These decisions would need to align with deployment use cases that want to go
directly to AD/MIT.
Perhaps it just means careful configuration of algorithms to match the
server side in those cases.

+1 on the baby step of replacing MiniKDC, as this is really just alignment
with the Directory project roadmap anyway.
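On the configuration side, matching algorithms between client and server is typically done by constraining the permitted enctypes in krb5.conf. A hedged illustration (realm and enctype names are placeholders; use whatever your AD/MIT KDCs actually support):

```ini
[libdefaults]
    default_realm = EXAMPLE.COM
    # Only offer enctypes the AD/MIT KDC side actually supports,
    # so client and server negotiate a common algorithm.
    default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
    default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
    permitted_enctypes   = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
```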


RE: Introduce Apache Kerby to Hadoop

Zheng, Kai
In reply to this post by Steve Loughran-3
Thanks for the confirmation and further input, Steve.

>> the latter would dramatically reduce the cost of wire-encrypting IPC.
Yes, optimizing Hadoop IPC/RPC encryption is another opportunity Kerby can help with. It's possible because we can hook Chimera or AES-NI into the Kerberos layer by leveraging the Kerberos library. As may be noted, HADOOP-12725 is ongoing for this aspect; there may be good results and further updates on it soon.

>> For now, I'd like to see basic steps -upgrading minkdc to krypto, see how it works.
Yes, starting with this initial step of upgrading MiniKDC to use Kerby is the right thing to do. After some interaction with the Kerby project, we may have more ideas about how to proceed on the follow-ups.

>> Long term, I'd like Hadoop 3 to be Kerby-ized
This sounds great! With the necessary support from the community, like feedback and patch reviews, we can speed up the related work.

Regards,
Kai


RE: Introduce Apache Kerby to Hadoop

Zheng, Kai
In reply to this post by larry mccay-2
Thanks, Larry, for your thoughts and input.

>> Replacing MiniKDC with kerby certainly makes sense.
Thanks.

>> Kerby-izing Hadoop 3 needs to be defined carefully.
Fully agree. We're still working to bring the relevant Kerberos support to the ideal state, either within the Kerby project or outside it. When appropriate, we can think about the next steps, come up with a design, and discuss it then. Maybe we can discuss these inputs separately once the initial things are done?

Regards,
Kai


Re: Introduce Apache Kerby to Hadoop

Andrew Purtell-3
In reply to this post by Zheng, Kai
I get excited thinking about the prospect of better performance with the auth-conf QoP. HBase RPC is an increasingly distant fork, but still close enough to Hadoop in that respect. Our bulk data transfer protocol isn't a separate thing as in HDFS, which avoids a SASL-wrapped implementation, so we really suffer when auth-conf is negotiated. You'll see the same impact wherever there is a high frequency of NameNode RPC calls or the like. Throughput drops 3-4x, or worse.
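For context on the auth-conf cost: the protection level is negotiated through the standard SASL QOP property, so the overhead Andrew describes is decided at that layer. A small JDK-only sketch of how a client states its QoP preference (only the standard javax.security.sasl property is shown; Hadoop's own plumbing around it is not):

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

public class QopSketch {
    public static void main(String[] args) {
        // The QOP property lists acceptable protection levels in preference order:
        // "auth"      = authentication only,
        // "auth-int"  = auth + per-message integrity,
        // "auth-conf" = auth + per-message confidentiality (the costly one).
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, "auth-conf,auth-int,auth");

        // Sasl.QOP is the standard key "javax.security.sasl.qop"; these props
        // would be passed to Sasl.createSaslClient(...) during negotiation.
        System.out.println(Sasl.QOP + " -> " + props.get(Sasl.QOP));
    }
}
```

When "auth-conf" is the level both sides agree on, every RPC payload gets wrapped (encrypted), which is where the 3-4x throughput drop comes from.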


Re: Introduce Apache Kerby to Hadoop

Haohui Mai-3
Have we evaluated gRPC? A robust RPC layer requires significant effort; migrating
to gRPC could save us a lot of headaches.

Haohui
> >>
> >> Regards,
> >>
> >> Kai
> >
>

RE: Introduce Apache Kerby to Hadoop

Zheng, Kai
In reply to this post by Andrew Purtell-3
Thanks, Andrew, for the update on the HBase side!

>> Throughput drops 3-4x, or worse.
Hopefully we can avoid much of the encryption overhead; we're prototyping a solution for exactly that.

Regards,
Kai

-----Original Message-----
From: Andrew Purtell [mailto:[hidden email]]
Sent: Saturday, February 27, 2016 5:35 PM
To: [hidden email]
Subject: Re: Introduce Apache Kerby to Hadoop

I get excited thinking about the prospect of better performance with auth-conf QoP. HBase RPC is an increasingly distant fork, but still close enough to Hadoop in this respect. Our bulk data transfer protocol isn't a separate thing as in HDFS, where being separate avoids a SASL-wrapped implementation, so we really suffer when auth-conf is negotiated. You'll see the same impact wherever there is a high frequency of NameNode RPC calls or similar. Throughput drops 3-4x, or worse.
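To make the QoP discussion concrete, here is a minimal JDK-only sketch of how Hadoop's `hadoop.rpc.protection` settings translate into the SASL QoP property that gets negotiated per connection; the mapping mirrors Hadoop's `SaslRpcServer.QualityOfProtection` values, but treat the class and method names below as illustrative, not Hadoop's actual code:

```java
import javax.security.sasl.Sasl;
import java.util.HashMap;
import java.util.Map;

// Sketch: mapping hadoop.rpc.protection values to the SASL QoP strings that
// the JDK SASL layer negotiates. "privacy" -> "auth-conf" is what triggers
// per-message encryption and the throughput drop discussed in the thread.
public class QopMapping {
    static final Map<String, String> HADOOP_TO_SASL = new HashMap<>();
    static {
        HADOOP_TO_SASL.put("authentication", "auth");  // authentication only
        HADOOP_TO_SASL.put("integrity", "auth-int");   // auth + integrity checks
        HADOOP_TO_SASL.put("privacy", "auth-conf");    // auth + wire encryption
    }

    // Build the property map handed to Sasl.createSaslClient/createSaslServer.
    static Map<String, String> saslProps(String hadoopProtection) {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, HADOOP_TO_SASL.get(hadoopProtection));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(saslProps("privacy").get(Sasl.QOP)); // prints auth-conf
    }
}
```

The point of the optimization work is that once "auth-conf" is negotiated, every RPC payload goes through the mechanism's wrap/unwrap, which is where an accelerated Kerberos crypto implementation could help.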

> On Feb 22, 2016, at 4:56 PM, Zheng, Kai <[hidden email]> wrote:
>
> Thanks for the confirmation and further input, Steve.
>
>>> the latter would dramatically reduce the cost of wire-encrypting IPC.
> Yes, optimizing Hadoop IPC/RPC encryption is another opportunity Kerby can help with. It's possible because we can hook Chimera or AES-NI support into the Kerberos layer by leveraging the Kerberos library. As may be noted, HADOOP-12725 is ongoing for this aspect; there may be good results and further updates on it soon.
>
>>> For now, I'd like to see basic steps - upgrading MiniKDC to Kerby, see how it works.
> Yes, starting with the initial step of upgrading MiniKDC to use Kerby is the right thing to do. After some interaction with the Kerby project, we may have more ideas on how to proceed with the follow-ups.
>
>>> Long term, I'd like Hadoop 3 to be Kerby-ized
> This sounds great! With the necessary support from the community, such as feedback and patch reviews, we can speed up the related work.
>
> Regards,
> Kai
>
> -----Original Message-----
> From: Steve Loughran [mailto:[hidden email]]
> Sent: Monday, February 22, 2016 6:51 PM
> To: [hidden email]
> Subject: Re: Introduce Apache Kerby to Hadoop
>
>
>
> I've discussed this offline with Kai, as part of the "let's fix kerberos" project. Not only is it a better Kerberos engine; we can also do more diagnostics, get better algorithms, and ultimately get better APIs for doing Kerberos and SASL. The latter would dramatically reduce the cost of wire-encrypting IPC.
>
> For now, I'd like to see basic steps - upgrading MiniKDC to Kerby, see how it works.
>
> Long term, I'd like Hadoop 3 to be Kerby-ized
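As a sketch of what the MiniKDC-to-Kerby upgrade discussed here could look like, the flow below uses Kerby's `SimpleKdcServer` (requires the `kerby-kerb-simplekdc` artifact; class and method names reflect the 1.0.0-RC1 era API and should be treated as assumptions rather than a fixed contract):

```java
import java.io.File;
import org.apache.kerby.kerberos.kerb.server.SimpleKdcServer;

// Sketch of a Kerby-backed MiniKDC replacement: start an in-process KDC,
// provision a test principal, export its keytab, and shut down.
public class KerbyMiniKdc {
    public static void main(String[] args) throws Exception {
        SimpleKdcServer kdc = new SimpleKdcServer();
        kdc.setKdcRealm("EXAMPLE.COM");
        kdc.setWorkDir(new File("target/kdc"));  // where KDC config/backend files go
        kdc.init();
        kdc.start();
        try {
            // Provision a principal and export its keytab, as tests do with MiniKDC.
            kdc.createPrincipal("hdfs/localhost@EXAMPLE.COM");
            kdc.exportPrincipal("hdfs/localhost@EXAMPLE.COM",
                                new File("target/hdfs.keytab"));
        } finally {
            kdc.stop();
        }
    }
}
```

Because the kerby-kerb library depends only on SLF4J, a MiniKDC built on it drops the large ApacheDS dependency tree from test classpaths, which was one of the proposal's main selling points.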

RE: Introduce Apache Kerby to Hadoop

Zheng, Kai
In reply to this post by Haohui Mai-3
Hi Haohui,

I'm glad to learn about GRPC, and it sounds cool. I think suggesting that Hadoop IPC/RPC upgrade to GRPC is a good proposal.

We haven't evaluated GRPC for the question of RPC encryption optimization because it's another story. It doesn't overlap with the optimization work: even if we used GRPC, the RPC protocol messages would still need to go through the SASL/GSSAPI/Kerberos stack. What's desired here is not to re-implement any RPC layer, or the stack itself, but to optimize the stack, possibly by implementing and plugging in a new SASL or GSSAPI mechanism. Hope this clarification helps. Thanks.
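The "plug in a new SASL mechanism" idea can be illustrated with the JDK's own extension point: SASL mechanisms are discovered through the `java.security.Provider` SPI, so an optimized mechanism can be registered without touching the RPC layer at all. A minimal JDK-only sketch follows; the provider name, mechanism name, and factory class are hypothetical, not anything Hadoop or Kerby ships:

```java
import java.security.Provider;
import java.security.Security;

// Sketch: registering a hypothetical accelerated SASL mechanism via the
// Provider SPI. The JDK's Sasl.createSaslClient(...) would then find the
// mechanism "FAST-GSSAPI" through this provider. The factory class named
// below would have to implement javax.security.sasl.SaslClientFactory.
public class OptimizedSaslProvider extends Provider {
    public OptimizedSaslProvider() {
        super("OptimizedSasl", 1.0, "Hypothetical accelerated SASL mechanism");
        // Key format is "SaslClientFactory.<MECHANISM>" -> factory class name.
        put("SaslClientFactory.FAST-GSSAPI", "com.example.FastGssapiClientFactory");
    }

    public static void main(String[] args) {
        Security.addProvider(new OptimizedSaslProvider());
        System.out.println(Security.getProvider("OptimizedSasl") != null); // true
    }
}
```

This is the hook that would let an AES-NI-accelerated wrap/unwrap (the Chimera/HADOOP-12725 direction mentioned above) slot in underneath the existing SASL negotiation.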

Regards,
Kai

-----Original Message-----
From: Haohui Mai [mailto:[hidden email]]
Sent: Sunday, February 28, 2016 3:02 AM
To: [hidden email]
Subject: Re: Introduce Apache Kerby to Hadoop

Have we evaluated GRPC? A robust RPC layer requires significant effort; migrating to GRPC could save us a lot of headache.

Haohui

Re: Introduce Apache Kerby to Hadoop

Steve Loughran-3
In reply to this post by Haohui Mai-3

> On 27 Feb 2016, at 19:02, Haohui Mai <[hidden email]> wrote:
>
> Have we evaluated GRPC? A robust RPC requires significant effort. Migrating
> to GRPC can save ourselves a lot of headache.
>

That's the google protobuf 3 based GRPC? More specifically, protobufVersion = '3.0.0-beta-2'?

That's the successor to the protobuf.jar whose Alejandro-choreographed cross-project upgrade caused the "great protobuf upgrade of 2013"? That's the protobuf library where some of us have seriously considered forking so that we could have a version of protobuf that would link across Java classes generated with older versions?

We have enough problems working with released versions of protobuf that break across minor point releases, and with guava JARs that are a recurrent source of cross-version compatibility pain.


I would rather stab myself in the leg with a fork —repeatedly— than adopt something based on a beta-release of a google artifact as critical path of the Hadoop RPC chain.

While google are pretty obsessive about wire format compatibility across languages and versions, we just can't trust google to maintain binary compatibility, primarily due to a build process which clean-builds everything from scratch. They don't have the same problem of trying to nudge things up across a loosely coupled set of projects, including those that still have requirements of JAR-sharing compatibility with older Hadoop versions. Indeed, for those projects, being backwards compatible with Hadoop 1.x (no protobuf) is easier than working with Hadoop 2.205, purely due to that protobuf difference.


Even when protobuf 3.0 finally ships, we should hold back from adopting it for its current role until 3.1 comes out, so we can assess google's compatibility policy in the 3.x line.




Re: Introduce Apache Kerby to Hadoop

Haohui Mai-3
Handling Kerberos would be similar to what we have done for WebHDFS now. Kerby would still be in the picture, but things would be much simpler.

If protobuf is a concern, why not shade it into hadoop-common? The generated binaries might not be compatible, but the wire format is.
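The shading suggested here is typically done with the Maven Shade Plugin's class relocation; a hedged sketch of the relevant pom fragment follows (coordinates and relocation patterns are illustrative, not a tested Hadoop build change):

```xml
<!-- Sketch: relocate protobuf classes inside a shaded jar so that downstream
     projects never see Hadoop's copy of protobuf on their classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.apache.hadoop.shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites the bytecode references as well as the packages, which is why shaded binaries stop being link-compatible with unshaded ones even though the serialized wire format is untouched.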

