[VOTE] Release Apache Hadoop 2.8.0 (RC2)

Junping Du-2
Hi all,
     With several important fixes merged last week, I've created a new release candidate (RC2) for Apache Hadoop 2.8.0.

     This is the next minor release after 2.7.0, which was released more than a year ago. It comprises 2,919 fixes, improvements, and new features. Most of these commits are being released for the first time in branch-2.

     More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release

     Please note that RC0 and RC1 were not put to a public vote because significant issues were found just after their RC tags were published.

     The RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC2

     The RC tag in git is: release-2.8.0-RC2

     The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1056

     Please try the release and vote; the vote will run for the usual 5 days, ending on 03/20/2017 PDT.

Thanks,

Junping

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Andrew Wang
Hi Junping,

I noticed this possible blocker float by my inbox today. It had an affects
version but no target version set:

https://issues.apache.org/jira/browse/HDFS-11431

Thoughts? Seems like the hadoop-hdfs-client artifact doesn't work right now.

Best,
Andrew



Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Junping Du-2
Thanks Andrew for reporting the issue. This JIRA was off my radar as it didn't specify any target version before.

From my understanding, this issue is related to our previous improvement of separating the client and server jars in HDFS-6200. If we use the new "client" jar in an NN HA deployment, we will hit the reported issue.

I can see two options here:

- Make no change in 2.8.0. If users hit the issue when they deploy an HA cluster with the new client jar, they add back the hdfs jar, just as things worked previously.

- Make the change now in 2.8.0, either moving ConfiguredFailoverProxyProvider to the client jar or adding a dependency from the client jar on the server jar. There will be some debate about which fix is better, especially since ConfiguredFailoverProxyProvider still has some server-side dependencies.

I would prefer the first option, given:

- The time needed for a fix is unpredictable, as there is still discussion on how to fix this issue. Our 2.8.0 release, which has already been deferred several times for more serious issues, shouldn't be an endless journey.

- We have a workaround for this improvement, and no regression happens due to this issue. People can still use the hdfs jar in the old way. The worst case is that an HDFS improvement doesn't work in some cases, which shouldn't block the whole release.

I think we should let the vote keep going unless someone has more concerns that I may have missed.



Thanks,


Junping



Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Steve Loughran-3

> On 15 Mar 2017, at 00:36, Junping Du <[hidden email]> wrote:
>
> Thanks Andrew for reporting the issue. This JIRA was off my radar as it didn't specify any target version before.
>
> From my understanding, this issue is related to our previous improvement of separating the client and server jars in HDFS-6200. If we use the new "client" jar in an NN HA deployment, we will hit the reported issue.
>
> I can see two options here:
>
> - Make no change in 2.8.0. If users hit the issue when they deploy an HA cluster with the new client jar, they add back the hdfs jar, just as things worked previously.
>
> - Make the change now in 2.8.0, either moving ConfiguredFailoverProxyProvider to the client jar or adding a dependency from the client jar on the server jar. There will be some debate about which fix is better, especially since ConfiguredFailoverProxyProvider still has some server-side dependencies.
>
> I would prefer the first option, given:
>
> - The time needed for a fix is unpredictable, as there is still discussion on how to fix this issue. Our 2.8.0 release, which has already been deferred several times for more serious issues, shouldn't be an endless journey.
>
> - We have a workaround for this improvement, and no regression happens due to this issue. People can still use the hdfs jar in the old way. The worst case is that an HDFS improvement doesn't work in some cases, which shouldn't block the whole release.
>
> I think we should let the vote keep going unless someone has more concerns that I may have missed.

Getting it out the door with this in the release notes, and a plan for 2.8.1, would be ideal.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Steve Loughran-3
In reply to this post by Junping Du-2

> On 14 Mar 2017, at 08:41, Junping Du <[hidden email]> wrote:
>
> Hi all,
>     With several important fixes merged last week, I've created a new release candidate (RC2) for Apache Hadoop 2.8.0.
>
>     This is the next minor release after 2.7.0, which was released more than a year ago. It comprises 2,919 fixes, improvements, and new features. Most of these commits are being released for the first time in branch-2.
>
>      More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>      Please note that RC0 and RC1 were not put to a public vote because significant issues were found just after their RC tags were published.
>
>      The RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC2
>
>      The RC tag in git is: release-2.8.0-RC2

Given that tags are so easy to move, we need to rely on one or more of:
- the commit ID,
- the tag being signed.

Junping: what is the commit ID for the release?

>
>      The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1056
>

Thanks, I'll play with these downstream, as well as checking out and trying to build on Windows.

>      Please try the release and vote; the vote will run for the usual 5 days, ending on 03/20/2017 PDT.
>
> Thanks,
>
> Junping



Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Junping Du-2
The latest commit on RC2 is: e51312e8e106efb2ebd4844eecacb51026fac8b7.
Btw, I think tags are immutable, aren't they?

Thanks,

Junping

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Andrew Wang
In reply to this post by Junping Du-2
Hi Junping, inline,

> From my understanding, this issue is related to our previous improvement
> of separating the client and server jars in HDFS-6200. If we use the new
> "client" jar in an NN HA deployment, we will hit the reported issue.
>
From my read of the poms, hadoop-client depends on hadoop-hdfs-client to
pull in HDFS-related code. It doesn't have its own dependency on
hadoop-hdfs. So I think this affects users of the hadoop-client artifact,
which has existed for a long time.

Essentially all of our customer deployments run with NN HA, so this would
affect a lot of users.

> I can see two options here:
>
> - Make no change in 2.8.0. If users hit the issue when they deploy an HA
> cluster with the new client jar, they add back the hdfs jar, just as
> things worked previously.
>
> - Make the change now in 2.8.0, either moving
> ConfiguredFailoverProxyProvider to the client jar or adding a dependency
> from the client jar on the server jar. There will be some debate about
> which fix is better, especially since ConfiguredFailoverProxyProvider
> still has some server-side dependencies.
>
>
> I would prefer the first option, given:
>
> - The time needed for a fix is unpredictable, as there is still discussion
> on how to fix this issue. Our 2.8.0 release, which has already been
> deferred several times for more serious issues, shouldn't be an endless
> journey.
>
Looks like we have a patch being actively revved and reviewed to fix this
by making hadoop-hdfs-client depend on hadoop-hdfs. Thanks to Steven and
Steve for working on this.

Steve proposed doing a proper split in a later JIRA.

> - We have a workaround for this improvement, and no regression happens due
> to this issue. People can still use the hdfs jar in the old way. The worst
> case is that an HDFS improvement doesn't work in some cases, which
> shouldn't block the whole release.
>
Based on the above, I think there is a regression for users of the
hadoop-client artifact.

If it actually only affects users of hadoop-hdfs-client, then I agree we
can document it as a Known Issue and fix it later.

Best,
Andrew
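
For anyone who wants to check whether a particular client classpath is affected, a minimal Python sketch follows. It is an illustration only, not something posted in this thread. It assumes the provider's fully qualified class name is org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider; the helper name jars_providing and whatever jar paths you pass in are hypothetical.

```python
import zipfile

# Assumption: fully qualified name of the class discussed in HDFS-11431.
PROVIDER = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"


def jars_providing(class_name, jar_paths):
    """Return the jars from jar_paths that contain class_name.

    A jar is a zip archive, and a class a.b.C is stored as the
    entry a/b/C.class, so a plain name lookup is enough.
    """
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                found.append(path)
    return found
```

Calling jars_providing(PROVIDER, ...) with the jars a hadoop-client user actually ends up with would be one way to tell the two cases apart: an empty result would suggest the hadoop-hdfs jar has to be added back, which is the workaround Junping describes.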

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Eric Badger
In reply to this post by Junping Du-2
All on macOS Sierra:

- Verified signatures
  - Minor note: Junping, I had a hard time finding your key. I grabbed the keys for hadoop from http://home.apache.org/keys/group/hadoop.asc and you had a key there, but it wasn't the one that you signed this commit with. Then with some help from Jason I found the correct key at https://dist.apache.org/repos/dist/release/hadoop/common/KEYS. So it would be nice if those were in sync.
- Compiled from source
- Deployed pseudo-distributed cluster
- Ran some sample MR jobs

+1 (non-binding)

Thanks,

Eric



Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Josh Elser
In reply to this post by Junping Du-2
A tag is immutable, but you (or someone else) could remove the tag you
pushed and re-push a new one. That's why the commit id is important --
it ensures that everyone else knows the exact commit being voted on.
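
As an illustration of that point (a sketch, not something posted in the thread), a voter could compare whatever the tag resolves to locally against the commit ID announced on the list. The Python below shells out to git rev-parse; the function names are hypothetical, and it assumes git is installed and the repository is checked out at the given path.

```python
import subprocess


def resolve_tag(tag, repo="."):
    """Return the commit ID that `tag` currently points at in `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-parse", "--verify", tag + "^{commit}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def is_voted_commit(resolved, announced):
    """True only if both values are the same full 40-character commit ID."""
    resolved = resolved.strip().lower()
    announced = announced.strip().lower()
    # Insist on the full ID: an abbreviated one is not a reliable reference.
    return len(announced) == 40 and resolved == announced
```

For this RC the check would be is_voted_commit(resolve_tag("release-2.8.0-RC2"), "e51312e8e106efb2ebd4844eecacb51026fac8b7"). If someone deleted and re-pushed the tag at a different commit, the comparison would fail.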


Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Junping Du-2
In reply to this post by Andrew Wang
bq. From my read of the poms, hadoop-client depends on hadoop-hdfs-client to pull in HDFS-related code. It doesn't have its own dependency on hadoop-hdfs. So I think this affects users of the hadoop-client artifact, which has existed for a long time.

I missed that. Thanks for the reminder! From my quick check of https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client/2.7.3, it looks like 669 artifacts from other projects depend on it.


I think we should withdraw the current RC bits. Please stop the verification & vote.

I will kick off another RC as soon as HDFS-11431 is fixed.


Thanks,


Junping



Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Junping Du-2
In reply to this post by Eric Badger
Hi Eric,
     Thanks for your verification work! Regarding your question about the RM's key: our hadoop wiki page (https://wiki.apache.org/hadoop/HowToRelease) states that we use https://dist.apache.org/repos/dist/release/hadoop/common/KEYS. Also, for hadoop users, our release page (http://hadoop.apache.org/releases.html) points to the same key file location. So I hope this is not too confusing for developers and users in the hadoop community.
     However, from my offline check with Owen, it sounds like http://home.apache.org/keys/group/hadoop.asc is traditional for Apache projects and convenient to use. I have already updated the related key on my Apache ID, which should sync there automatically. We'd better document it in our hadoop wiki page as well.

Thanks,

Junping

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Andrew Wang
On Wed, Mar 15, 2017 at 5:42 PM, Junping Du <[hidden email]> wrote:

> Hi Eric,
>      Thanks for your verification work! Regarding your question about the
> RM's key: our hadoop wiki page
> (https://wiki.apache.org/hadoop/HowToRelease) states that we use
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS. Also, for
> hadoop users, our release page (http://hadoop.apache.org/releases.html)
> points to the same key file location. So I hope this is not too confusing
> for developers and users in the hadoop community.
>      However, from my offline check with Owen, it sounds like
> http://home.apache.org/keys/group/hadoop.asc is traditional for Apache
> projects and convenient to use. I have already updated the related key on
> my Apache ID, which should sync there automatically. We'd better document
> it in our hadoop wiki page as well.
>

I actually asked INFRA about this when I was adding my key. A little more
backstory:
We used to have a README in dist saying to add your key on id.apache.org,
then to export the hadoop group's keys to generate dist's KEYS file.

INFRA told me this is a Bad Thing, since the KEYS file should be append
only. This way, users can still verify a release even if an RM leaves the
hadoop group or changes their key on id.apache.org.

So, I deleted the old README instructions. The dist KEYS file is the
canonical (and only) place to look for an RM's keys. Based on Junping's
examination, it sounds like our docs already reflect this. I'd rather not
complicate matters by also discussing the hadoop group's keys.

Best,
Andrew
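
To make the "append only" rule concrete, here is a small Python sketch. It only illustrates the property and is not anything INFRA or the Hadoop release process is known to run. The helper name is hypothetical, and the two arguments are assumed to be the full text of an older and a newer copy of the KEYS file.

```python
def is_append_only(old_text, new_text):
    """True if new_text keeps every line of old_text, in order, at its start.

    Lines may be added after the old content, but nothing that was
    already published may be changed or dropped.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return new_lines[:len(old_lines)] == old_lines
```

Regenerating KEYS from the current hadoop group export could fail this check whenever an RM has left the group or replaced their key, which is the situation the rule is meant to guard against.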

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Steve Loughran-3
In reply to this post by Josh Elser

> On 15 Mar 2017, at 23:04, Josh Elser <[hidden email]> wrote:
>
> A tag is immutable, but you (or someone else) could remove the tag you pushed and re-push a new one. That's why the commit id is important -- it ensures that everyone else knows the exact commit being voted on.
>

There's tag signing too: "git tag --sign". We can/should use that for authenticating tags, saying that "the release is tag 2.8.x signed by me".

> Junping Du wrote:
>> The latest commit on RC2 is: e51312e8e106efb2ebd4844eecacb51026fac8b7.
>> btw, I think tags are immutable. Isn't it?
>>
>> Thanks,
>>
>> Junping
>> ________________________________________
>> From: Steve Loughran
>> Sent: Wednesday, March 15, 2017 12:30 PM
>> To: Junping Du
>> Cc: [hidden email]
>> Subject: Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)
>>
>>> On 14 Mar 2017, at 08:41, Junping Du<[hidden email]>  wrote:
>>>
>>> Hi all,
>>>     With several important fixes get merged last week, I've created a new release candidate (RC2) for Apache Hadoop 2.8.0.
>>>
>>>     This is the next minor release to follow up 2.7.0 which has been released for more than 1 year. It comprises 2,919 fixes, improvements, and new features. Most of these commits are released for the first time in branch-2.
>>>
>>>      More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>>>
>>>      Please note that RC0 and RC1 are not voted public because significant issues are found just after RC tag getting published.
>>>
>>>      The RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC2
>>>
>>>      The RC tag in git is: release-2.8.0-RC2
>>
>> given tags are so easy to move, we need to be relying on one or more of:
>> -the commit ID,
>> -the tag being signed
>>
>> Junping: what is the commit Id for the release?
>>
>>>      The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1056
>>>
>>
>> thanks, I'll play with these downstream, as well as checking out and trying to build on windows
>>
>>>      Please try the release and vote; the vote will run for the usual 5 days, ending on 03/20/2017 PDT time.
>>>
>>> Thanks,
>>>
>>> Junping
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Kuhu Shukla
In reply to this post by Junping Du-2
+1 (non-binding)

- Downloaded source.
- Verified signatures.
- Compiled the source.
- Ran sample jobs like MR sleep on pseudo distributed cluster. (Mac OS)

Thanks Junping and others!

Regards,
Kuhu

On Wednesday, March 15, 2017, 7:25:46 PM CDT, Junping Du <[hidden email]> wrote:

bq. From my read of the poms, hadoop-client depends on hadoop-hdfs-client to pull in HDFS-related code. It doesn't have its own dependency on hadoop-hdfs. So I think this affects users of the hadoop-client artifact, which has existed for a long time.

I may have missed that. Thanks for the reminder! From my quick check: https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client/2.7.3?, it looks like 669 artifacts from other projects depend on it.


I think we should withdraw the current RC bits. Please stop the verification & vote.

I will kick off another RC as soon as HDFS-11431 gets fixed.


Thanks,


Junping


________________________________
From: Andrew Wang <[hidden email]>
Sent: Wednesday, March 15, 2017 2:04 PM
To: Junping Du
Cc: [hidden email]; [hidden email]; [hidden email]; [hidden email]
Subject: Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Hi Junping, inline,


> From my understanding, this issue is related to our previous improvements with separating client and server jars in HDFS-6200. If we use the new "client" jar in an NN HA deployment, then we will hit the issue reported.

From my read of the poms, hadoop-client depends on hadoop-hdfs-client to pull in HDFS-related code. It doesn't have its own dependency on hadoop-hdfs. So I think this affects users of the hadoop-client artifact, which has existed for a long time.

Essentially all of our customer deployments run with NN HA, so this would affect a lot of users.

> I can see two options here:
>
> - Without any change in 2.8.0, if users hit the issue when they deploy an HA cluster using the new client jar, they add back the hdfs jar, just as things worked previously.
>
> - Make the change now in 2.8.0, either moving ConfiguredFailoverProxyProvider to the client jar or adding a dependency between the client jar and the server jar. There will be some debate about which fix is better, especially as ConfiguredFailoverProxyProvider still has some server-side dependencies.
>
>
> I would prefer the first option, given:
>
> - The time to fix the issue is unpredictable, as there is still discussion on how to fix it. Our 2.8.0 release, which has already been deferred several times for more serious issues, shouldn't be an endless journey.

Looks like we have a patch being actively revved and reviewed to fix this by making hadoop-hdfs-client depend on hadoop-hdfs. Thanks to Steven and Steve for working on this.

Steve proposed doing a proper split in a later JIRA.

> - We have a workaround for this improvement, and no regression happens due to this issue. People can still use the hdfs jar in the old way. The worst case is that the HDFS improvement doesn't work in some cases - that shouldn't block the whole release.

Based on the above, I think there is a regression for users of the hadoop-client artifact.

If it actually only affects users of hadoop-hdfs-client, then I agree we can document it as a Known Issue and fix it later.

Best,
Andrew

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Steve Loughran-3
In reply to this post by Junping Du-2

> On 16 Mar 2017, at 00:25, Junping Du <[hidden email]> wrote:
>
> bq. From my read of the poms, hadoop-client depends on hadoop-hdfs-client to pull in HDFS-related code. It doesn't have its own dependency on hadoop-hdfs. So I think this affects users of the hadoop-client artifact, which has existed for a long time.
>
> I could miss that. Thanks for reminding! From my quick check: https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client/2.7.3?, it sounds like 669 artifacts from other projects were depending on it.
>
>
> I think we should withdraw the current RC bits. Please stop the verification & vote.
>
> I will kick off another RC immediately when HDFS-11431 get fixed.

is done. hadoop-hdfs without any server-side dependencies is now a hadoop-client dependency.

Release notes:

The hadoop-client POM now includes a leaner hdfs-client, stripping out all the transitive dependencies on JARs only needed for the Hadoop HDFS daemon itself. The specific jars now excluded are: leveldbjni-all, jetty-util, commons-daemon, xercesImpl, netty and servlet-api.

This should make downstream projects' dependent JARs smaller, and avoid version conflict problems with the specific JARs now excluded.

Applications may encounter build problems if they depended on these JARs but didn't explicitly include them. There are two fixes for this:

* explicitly include the JARs, stating which version of them you want.
* add a dependency on hadoop-hdfs. For Hadoop 2.8+, this will add the missing dependencies. For builds against older versions of Hadoop, this will be harmless, as hadoop-hdfs and all its dependencies are already pulled in by the hadoop-client POM.
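
As a rough aid for the first fix, the sketch below reports which of the now-excluded JARs an application uses without declaring them itself. The artifact names come from the release note above; the function name and the two input lists are hypothetical, and how you collect those lists from your own build is left open.

```python
# JARs the release note lists as no longer pulled in via hadoop-client.
EXCLUDED_FROM_HADOOP_CLIENT = {
    "leveldbjni-all",
    "jetty-util",
    "commons-daemon",
    "xercesImpl",
    "netty",
    "servlet-api",
}


def jars_to_declare(used_artifacts, declared_artifacts):
    """Return excluded JARs that the app uses but does not declare itself.

    used_artifacts: artifact names the application code actually needs.
    declared_artifacts: artifact names listed explicitly in the app's build.
    """
    needed = set(used_artifacts) & EXCLUDED_FROM_HADOOP_CLIENT
    return sorted(needed - set(declared_artifacts))
```

Anything this returns was previously arriving only as a transitive dependency, so it needs either an explicit dependency with a chosen version or the hadoop-hdfs dependency described in the second fix.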





Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

Junping Du-2
Thanks Steve. That's awesome! I will kick off a new RC soon.
Shall we reopen HDFS-6200 given the issues here? Listing it in the 2.8.0 release notes could confuse people, as it doesn't work in an HA deployment.

Thanks,

Junping
________________________________________
From: Steve Loughran
Sent: Thursday, March 16, 2017 7:27 AM
To: Junping Du
Cc: [hidden email]; [hidden email]; [hidden email]; [hidden email]
Subject: Re: [VOTE] Release Apache Hadoop 2.8.0 (RC2)

> On 16 Mar 2017, at 00:25, Junping Du <[hidden email]> wrote:
>
> bq. From my read of the poms, hadoop-client depends on hadoop-hdfs-client to pull in HDFS-related code. It doesn't have its own dependency on hadoop-hdfs. So I think this affects users of the hadoop-client artifact, which has existed for a long time.
>
> I could miss that. Thanks for reminding! From my quick check: https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client/2.7.3?, it sounds like 669 artifacts from other projects were depending on it.
>
>
> I think we should withdraw the current RC bits. Please stop the verification & vote.
>
> I will kick off another RC immediately when HDFS-11431 get fixed.

is done. hadoop-hdfs without any server-side dependencies is now a hadoop-client dependency.

Release notes:

The hadoop-client POM now includes a leaner hdfs-client, stripping out all the transitive dependencies on JARs only needed for the Hadoop HDFS daemon itself. The specific jars now excluded are: leveldbjni-all, jetty-util, commons-daemon, xercesImpl, netty and servlet-api.

This should make downstream projects dependent JARs smaller, and avoid version conflict problems with the specific JARs now excluded.

Applications may encounter build problems if they did depend on these JARs, and which didn't explicitly include them. There are two fixes for this

* explicitly include the JARs, stating which version of them you want.
* add a dependency on hadoop-hdfs. For Hadoop 2.8+, this will add the missing dependencies. For builds against older versions of Hadoop, this will be harmless, as hadoop-hdfs and all its dependencies are already pulled in by the hadoop-client POM.





[VOTE] Release Apache Hadoop 2.8.0 (RC3)

Junping Du-2
In reply to this post by Junping Du-2
Hi all,
     With the fix for HDFS-11431 in, I've created a new release candidate (RC3) for Apache Hadoop 2.8.0.

     This is the next minor release after 2.7.0, which was released more than a year ago. It comprises 2,900+ fixes, improvements, and new features. Most of these commits are released for the first time in branch-2.

      More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release

      New RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC3

      The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 91f2b7a13d1e97be65db92ddabc627cc29ac0009

      The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1057

      Please try the release and vote; the vote will run for the usual 5 days, ending on 03/22/2017 PDT.

Thanks,

Junping

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

Eric Payne-3
+1
Thanks, Junping, for your efforts to get this release out.
I downloaded and built the source, and I did the following manual testing on a 2-node pseudo cluster:
- Streaming job
- Inter-queue (cross-queue) preemption--verified that only the expected amount of preemption occurred.
- Intra-queue (in-queue) preemption with higher priority apps preempting lower ones.
- Limited node label testing.
- Yarn distributed shell, both with and without keeping containers across AM restart.
- Killing apps from Application UI




________________________________
From: Junping Du <[hidden email]>
To: "[hidden email]" <[hidden email]>; "[hidden email]" <[hidden email]>; "[hidden email]" <[hidden email]>; "[hidden email]" <[hidden email]>
Sent: Friday, March 17, 2017 4:18 AM
Subject: [VOTE] Release Apache Hadoop 2.8.0 (RC3)



Hi all,
     With fix of HDFS-11431 get in, I've created a new release candidate (RC3) for Apache Hadoop 2.8.0.

     This is the next minor release to follow up 2.7.0 which has been released for more than 1 year. It comprises 2,900+ fixes, improvements, and new features. Most of these commits are released for the first time in branch-2.

      More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release

      New RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC3

      The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 91f2b7a13d1e97be65db92ddabc627cc29ac0009

      The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1057

      Please try the release and vote; the vote will run for the usual 5 days, ending on 03/22/2017 PDT time.


Thanks,

Junping


Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

Daniel Templeton
In reply to this post by Junping Du-2
Thanks for the new RC, Junping.  I built from source and tried it out on
a 2-node cluster with HA enabled.  I ran a pi job and some streaming
jobs.  I tested that localization and failover work correctly, and I
played a little with the YARN and HDFS web UIs.

I did encounter an old friend of mine, which is that if you submit a
streaming job with input that is only 1 block, you will nonetheless get
2 mappers that both process the same split. What's new this time is that
the second mapper was consistently failing on certain input sizes.  I
(re)verified that the issue also exists in 2.7.3, so it's not a
regression.  I'm pretty sure it's been there since at least 2.6.0.  I
filed MAPREDUCE-6864 for it.

Given that my issue was not a regression, I'm +1 on the RC.

Daniel

On 3/17/17 2:18 AM, Junping Du wrote:

> Hi all,
>       With fix of HDFS-11431 get in, I've created a new release candidate (RC3) for Apache Hadoop 2.8.0.
>
>       This is the next minor release to follow up 2.7.0 which has been released for more than 1 year. It comprises 2,900+ fixes, improvements, and new features. Most of these commits are released for the first time in branch-2.
>
>        More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>        New RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC3
>
>        The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 91f2b7a13d1e97be65db92ddabc627cc29ac0009
>
>        The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1057
>
>        Please try the release and vote; the vote will run for the usual 5 days, ending on 03/22/2017 PDT time.
>
> Thanks,
>
> Junping
>



Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

Mingliang Liu-2
In reply to this post by Junping Du-2
Thanks Junping for doing this.

+1 (non-binding)

0. Downloaded the src tar.gz file; checked the MD5 checksum
1. Built Hadoop from source successfully
2. Deployed a single node cluster and started the cluster successfully
3. Operated HDFS from the command line: ls, put, distcp, dfsadmin etc
4. Ran hadoop mapreduce examples: grep
5. Operated AWS S3 using the S3A scheme from the command line: ls, cat, distcp
6. Checked the HDFS service logs
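
For anyone repeating step 0, here is one way the MD5 check could be scripted. It is only a sketch: the function names are made up, and it assumes you pass in just the hex digits of the published checksum (any spacing or upper case is normalized away).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, published_hex):
    """Compare a file's MD5 against the published hex digits."""
    # Assumption: published_hex holds only the digest, possibly with
    # embedded whitespace or upper-case hex, and no file name prefix.
    normalized = "".join(published_hex.split()).lower()
    return md5_of_file(path) == normalized
```

MD5 only guards against a corrupted download; the signature check is what ties the artifact to the release manager.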

L

> On Mar 17, 2017, at 2:18 AM, Junping Du <[hidden email]> wrote:
>
> Hi all,
>     With fix of HDFS-11431 get in, I've created a new release candidate (RC3) for Apache Hadoop 2.8.0.
>
>     This is the next minor release to follow up 2.7.0 which has been released for more than 1 year. It comprises 2,900+ fixes, improvements, and new features. Most of these commits are released for the first time in branch-2.
>
>      More information about the 2.8.0 release plan can be found here: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>      New RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.0-RC3
>
>      The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 91f2b7a13d1e97be65db92ddabc627cc29ac0009
>
>      The maven artifacts are available via repository.apache.org at: https://repository.apache.org/content/repositories/orgapachehadoop-1057
>
>      Please try the release and vote; the vote will run for the usual 5 days, ending on 03/22/2017 PDT time.
>
> Thanks,
>
> Junping
