Re: [DISCUSS]: re-enable listing of secrets in S3x URIs

Ravi Prakash-3
Thanks for starting the discussion Steve.

This is a prickly issue and unfortunately we are hostages of past
decisions. Thanks a lot for attacking the problem in the first place and
sticking with it.

In my experience we have found a lot of places where AWS secrets were logged
for everyone to see. I'm not sure allowing people to do that is the right
thing to do in the long term. We have to bite the bullet sometime. Perhaps
we should do that in trunk (3.0.0)? To unbreak clients of Hadoop-2.x we can
go with Vinayakumar's proposal, but only in branch-2. Of course, technically
we have hadoop-2.8.0 already out with this, but I agree we can put the fix
in 2.8.2.
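
For anyone following along, the round trip Steve describes below can be
illustrated with plain java.net.URI. This is only a minimal sketch, not the
actual Hadoop Path/S3 code: the class name, the stripUserInfo helper and
the example credentials are all made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SecretRoundTrip {

    // Hypothetical helper (not a Hadoop API): rebuild the URI without its
    // user-info part, mimicking what stripping the secret amounts to.
    static URI stripUserInfo(URI uri) throws URISyntaxException {
        return new URI(uri.getScheme(), null, uri.getHost(), uri.getPort(),
                uri.getPath(), uri.getQuery(), uri.getFragment());
    }

    public static void main(String[] args) throws URISyntaxException {
        // Made-up credentials embedded in the URI as id:secret.
        URI original =
                new URI("s3a://AKIDEXAMPLE:SECRETEXAMPLE@bucket/data/part-0000");

        // "Path -> String": the marshalled form no longer carries the secret.
        String marshalled = stripUserInfo(original).toString();

        // "String -> Path": nothing is left to authenticate with.
        URI restored = new URI(marshalled);

        System.out.println(marshalled);
        System.out.println(restored.getUserInfo());
    }
}
```

The same property that keeps the secret out of the logs is what breaks
anything that passes the path around as a string.
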

$0.02
Ravi

On Wed, Aug 2, 2017 at 5:52 AM, Steve Loughran <[hidden email]>
wrote:

>
>
> HADOOP-3733<https://issues.apache.org/jira/browse/HADOOP-3733> stripped
> out the user:password secret from the s3, s3a, s3n URLs on security
> grounds: everything logged Path entries without ever considering that they
> contained secret credentials.
>
> But that turns out to break things, as noted in HADOOP-14439: you can
> no longer go Path -> String -> Path without authentication details being
> lost, and of course, guess how paths are often marshalled around? As
> strings (after all, they weren't serializable until recently).
>
> Vinayakumar has proposed a patch that goes back to retaining the secrets,
> at least enough for distcp:
>
> https://issues.apache.org/jira/browse/HADOOP-3733?focusedCommentId=16110297&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16110297
>
> I think I'm going to go with this, once I get the tests & testing to go
> with it, and if it's enough to work with Spark too, targeting 2.8.2 if
> it's not too late.
>
> If there's a risk, it's that if someone puts secrets into s3 URIs, the
> secrets are more likely to be logged. But even with the current code,
> there's no way to guarantee that the secrets will never be logged. The
> danger comes from having id:secret credentials in the URI, something people
> will be told off for doing.
>
>
>