Re: [DISCUSS]: re-enable listing of secrets in S3x URIs
Thanks for starting the discussion Steve.
This is a prickly issue and unfortunately we are hostages to past
decisions. Thanks a lot for attacking the problem in the first place and
sticking with it.
In my experience we have found a lot of places where AWS secrets were logged
for everyone to see. I'm not sure allowing people to do that is the right
thing to do in the long term. We have to bite the bullet sometime. Perhaps
we should do that in trunk (3.0.0)? To unbreak clients of Hadoop-2.x we can
go with Vinayakumar's proposal but only in branch-2. Of course, technically
we have hadoop-2.8.0 already out with this, but I agree we can put the fix
On Wed, Aug 2, 2017 at 5:52 AM, Steve Loughran <[hidden email]>
> HADOOP-3733<https://issues.apache.org/jira/browse/HADOOP-3733> stripped
> out the user:password secret from the s3, s3a, and s3n URLs on security
> grounds: everything logged Path entries without ever considering that they
> contained secret credentials.
> But that turns out to break things, as noted in HADOOP-14439: you can
> no longer go Path -> String -> Path without authentication details being
> lost, and of course, guess how paths are often marshalled around? As
> strings (after all, they weren't serializable until recently).
> Vinayakumar has proposed a patch that goes back to retaining the secrets,
> at least enough for distcp
> https://issues.apache.org/jira/browse/HADOOP-3733?focusedCommentId=16110297&page=com.atlassian.jira.plugin.system.
> I think I'm going to go with this, once I get the tests & testing to go
> with, and if it's enough to work with spark too .. targeting 2.8.2 if it's
> not too late.
> If there's a risk, it's that if someone puts secrets into s3 URIs, the
> secrets are more likely to be logged. But even with the current code,
> there's no way to guarantee that the secrets will never be logged. The
> danger comes from having id:secret credentials in the URI, something people
> will be told off for doing.
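
For anyone following along, here is a rough sketch of the Path -> String -> Path
round trip described above. It uses plain java.net.URI rather than Hadoop's own
Path class, so it only approximates the behaviour and is not the actual
HADOOP-3733 code. The class name, the stripUserInfo helper, the bucket name and
the credentials are all made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SecretStrippingSketch {

    /** Hypothetical helper: rebuild the URI without its user:password part. */
    static URI stripUserInfo(URI uri) throws URISyntaxException {
        return new URI(uri.getScheme(), null, uri.getHost(), uri.getPort(),
                uri.getPath(), uri.getQuery(), uri.getFragment());
    }

    public static void main(String[] args) throws URISyntaxException {
        // Placeholder credentials, not real ones.
        URI original = new URI("s3a://EXAMPLEID:EXAMPLESECRET@bucket/data/file.csv");

        // Marshal to a String with the secrets stripped, as a logging-safe
        // toString() would do.
        String marshalled = stripUserInfo(original).toString();

        // Unmarshal on the receiving side: the credentials cannot come back.
        URI restored = new URI(marshalled);

        System.out.println("marshalled: " + marshalled);
        System.out.println("restored user info: " + restored.getUserInfo());
    }
}
```

Once the string form drops the user info, nothing downstream can recover it, so
any job that passes paths around as strings loses the credentials it started
with.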