[VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

[VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Andrew Wang
Hi all,

With heartfelt thanks to many contributors, the RC0 for 3.0.0-alpha2 is
ready.

3.0.0-alpha2 is the second alpha in the planned 3.0.0 release line leading
up to a 3.0.0 GA. It comprises 857 fixes, improvements, and new features
since alpha1 was released on September 3rd, 2016.

More information about the 3.0.0 release plan can be found here:

https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release

The artifacts can be found here:

http://home.apache.org/~wang/3.0.0-alpha2-RC0/

This vote will run 5 days, ending on 01/25/2017 at 2 PM Pacific.

I ran basic validation with a local pseudo cluster and a Pi job. RAT output
was clean.

My +1 to start.

Thanks,
Andrew
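
A minimal sketch of the checksum side of that validation (the stand-in file and the `.sha256` name are illustrative; the signature check additionally needs the real artifact and the Hadoop KEYS file, so it is only shown as a comment):

```shell
# Illustrative only: a stand-in file plays the role of the release tarball.
printf 'release-bytes' > hadoop-3.0.0-alpha2.tar.gz

# The release manager publishes a checksum file next to the artifact...
sha256sum hadoop-3.0.0-alpha2.tar.gz > hadoop-3.0.0-alpha2.tar.gz.sha256

# ...and a voter re-computes and compares it after downloading.
sha256sum -c hadoop-3.0.0-alpha2.tar.gz.sha256 && echo "checksum OK"

# Signature check (needs the downloaded .asc and the imported Hadoop KEYS):
#   gpg --verify hadoop-3.0.0-alpha2.tar.gz.asc hadoop-3.0.0-alpha2.tar.gz
```

The same `-c` pattern applies to whichever digest files are published alongside the RC artifacts.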

Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

John Zhuge
Discovered 1 missing license using the script verify-license-notice.sh.

Missing LICENSE!
./../hadoop-dist/target/hadoop-3.0.0-alpha2/share/hadoop/client/hadoop-client-api-3.0.0-alpha2.jar
- Verified checksums and signatures of the tarballs
- Built source with Java 1.8.0_66 on Mac
- Deployed a pseudo cluster; passed the following sanity tests in both insecure and SSL mode:
  - basic dfs, distcp, and ACL commands
  - KMS and HttpFS tests
  - MapReduce wordcount example
  - balancer start/stop


John Zhuge
Software Engineer, Cloudera
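
For illustration, the essence of such a check (the actual verify-license-notice.sh may differ): a jar is a zip archive, so its entry listing is enough to spot a missing LICENSE.

```shell
# Hedged sketch: flag jars whose entry listing has no LICENSE file.
has_license() {           # reads a jar entry listing on stdin
  grep -qxE '(META-INF/)?LICENSE(\.txt)?'
}

# Hypothetical listing of an offending jar; in practice you would pipe
# in `jar tf some.jar` instead of this inlined example.
listing='META-INF/MANIFEST.MF
org/apache/hadoop/SomeClass.class'

if printf '%s\n' "$listing" | has_license; then
  echo "LICENSE present"
else
  echo "Missing LICENSE!"
fi
```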


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Anu Engineer
In reply to this post by Andrew Wang
Hi Andrew,

Thank you for all the hard work. I am really excited to see us making progress towards a 3.0 release.

+1 (Non-Binding)

1. Deployed the downloaded bits on a 4-node cluster with 1 NameNode and 3 DataNodes.
2. Verified normal HDFS operations such as create directory, create file, delete file, etc.
3. Ran MapReduce jobs: Pi and wordcount.
4. Verified that the Hadoop version command output is correct.

Thanks
Anu



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Marton Elek
In reply to this post by Andrew Wang
hadoop-3.0.0-alpha2.tar.gz is much smaller than hadoop-3.0.0-alpha1.tar.gz (246M vs. 316M).

The big difference is the generated source documentation:

find -name src-html
./hadoop-2.7.3/share/doc/hadoop/api/src-html
./hadoop-2.7.3/share/doc/hadoop/hadoop-hdfs-httpfs/apidocs/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/api/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-hdfs-httpfs/apidocs/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/api/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-project-dist/hadoop-hdfs-client/api/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-project-dist/hadoop-common/api/src-html
(./hadoop-3.0.0-alpha2 --> no match)

I am just wondering whether this is intentional, as I can't find any related JIRA or mail thread (maybe I missed it).

Regards,
Marton


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Akira Ajisaka-2
Hi Marton,

This is intentional; the generated source HTML pages were removed:
https://issues.apache.org/jira/browse/HADOOP-13688

Regards,
Akira


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Andrew Wang
In reply to this post by John Zhuge
Thanks for the validation, John. Do we feel the L&N (LICENSE/NOTICE) issue is enough to sink the RC?


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Allen Wittenauer-6
In reply to this post by Andrew Wang




> On Jan 20, 2017, at 2:36 PM, Andrew Wang <[hidden email]> wrote:
>
> http://home.apache.org/~wang/3.0.0-alpha2-RC0/

        There are quite a few JIRA issues that need release notes.



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Allen Wittenauer-6


One other thing, before I forget: I'm not sure the hadoop-client-minicluster jar is getting built properly. If you look inside, you'll find a real mishmash of things, including files and directories with the same names but different cases. This means it won't extract properly on OS X. (`jar xf` on that jar file literally stack-traces on my El Capitan machine. Neat!)

Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Andrew Wang
There are 5 JIRAs by my count that are missing release notes, which isn't that bad but could of course be improved. Four of those I had already checked earlier and forgot to follow up on; they were only very minorly incompatible (affecting private APIs) or mistakenly marked incompatible.

I'm not too concerned about the shaded minicluster since it's a new feature, this is an alpha, and we have an IT test against the shaded minicluster. Multiple entries with the same name are also allowed by the zip standard, so it's not clear there is a functionality bug.

Could I get some additional PMC input on this vote? The most critical issue in my mind is the missing LICENSE in that one new artifact. If we end up spinning a new RC, I'll also handle the missing release notes that Allen mentioned.

Thanks,
Andrew


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Chris Douglas-2
Thanks for all your work on this, Andrew. It's great to see the 3.x
series moving forward.

If you were willing to modify the release notes and add the LICENSE to
the jar, we don't need to reset the clock on the VOTE, IMO.

What's the issue with the minicluster jar [1]? I tried to reproduce,
but had no issues with 1.8.0_92-b14.

+1 Verified checksums, signature, built src tarball. -C

[1] hadoop-3.0.0-alpha2/share/hadoop/client/hadoop-client-minicluster-3.0.0-alpha2.jar



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Allen Wittenauer-6


FWIW, I wrote a new version of the verify-license-files tool and attached it to HADOOP-13374. This version actually verifies that the LICENSE and NOTICE files in jars and wars match the ones in the base of the (tarball) distribution.

ERROR: hadoop-client-api-3.0.0-alpha3-SNAPSHOT.jar: Missing a LICENSE file
ERROR: hadoop-client-api-3.0.0-alpha3-SNAPSHOT.jar: No valid NOTICE found

WARNING: hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar: Found 5 LICENSE files (0 were valid)
ERROR: hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar: No valid LICENSE found
WARNING: hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar: Found 3 NOTICE files (0 were valid)
ERROR: hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar: No valid NOTICE found

ERROR: hadoop-client-runtime-3.0.0-alpha3-SNAPSHOT.jar: No valid LICENSE found
ERROR: hadoop-client-runtime-3.0.0-alpha3-SNAPSHOT.jar: No valid NOTICE found

> What's the issue with the minicluster jar [1]? I tried to reproduce,
> but had no issues with 1.8.0_92-b14.

The minicluster jar is kind of weird on case-insensitive filesystems, like OS X's default HFS+.

$  jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
LICENSE.txt
license/
license/LICENSE
license/LICENSE.dom-documentation.txt
license/LICENSE.dom-software.txt
license/LICENSE.sax.txt
license/NOTICE
license/README.dom.txt
license/README.sax.txt
LICENSE
Grizzly_THIRDPARTYLICENSEREADME.txt



The problem here is that there is a 'license' directory and a file called 'LICENSE'. If this gets extracted with `jar xf`, it will fail; unzip can be made to extract it via an option like -o. To make matters worse, none of these license files match the one in the generated tarball. :(
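
One mechanical way to spot such conflicts from the listing alone: drop trailing slashes, lowercase everything, and report duplicates. A sketch (entry names pasted from the listing above; in practice you would pipe in `jar tf <jar>`):

```shell
# Find jar entries that would collide on a case-insensitive filesystem:
# strip trailing slashes (so 'license/' and 'LICENSE' compare equal),
# lowercase, sort, and print only the duplicated normalized names.
conflicts=$(printf '%s\n' \
    'LICENSE.txt' 'license/' 'license/LICENSE' 'license/NOTICE' \
    'LICENSE' 'Grizzly_THIRDPARTYLICENSEREADME.txt' |
  sed 's:/$::' | tr '[:upper:]' '[:lower:]' | sort | uniq -d)
echo "$conflicts"
```

Anything this prints is a name that cannot be extracted cleanly on HFS+.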




Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Marton Elek


I attached a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to add the missing META-INF/LICENSE.txt to the shaded jars.

Question: what should be done with the other LICENSE files in the minicluster jar? Can we just exclude them (from a legal point of view)?

Regards,
Marton


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Chris Douglas-2
In reply to this post by Allen Wittenauer-6
On Mon, Jan 23, 2017 at 9:32 PM, Allen Wittenauer
<[hidden email]> wrote:
> The problem here is that there is a 'license' directory and a file called 'LICENSE'.  If this gets extracted by jar via jar xf, it will fail.  unzip can be made to extract it via an option like -o.  To make matters worse, none of these license files match the one in the generated tarball. :(

Ah, got it. I didn't strip the trailing slash on directories. With that, it looks like the 'license' directory and the 'LICENSE' file are the only conflict?

I've not followed the development of packaging LICENSE/NOTICE in the
jar files. AFAIK, it's sufficient that we have the top-level
LICENSE/NOTICE in the tarball. Unless there's a LEGAL thread to the
contrary, it's OK as-is.

Again, I don't think we need to restart the clock on the RC vote if
the release notes and LICENSE/NOTICE were fixed, but it's Andrew's
time and I don't think any of these are blockers for the release. -C


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Eric Badger
In reply to this post by Marton Elek
+1 (non-binding)
- Verified signatures and md5
- Built from source
- Started a single-node cluster on my Mac
- Ran some sleep jobs

Eric

    On Tuesday, January 24, 2017 4:32 PM, Yufei Gu <[hidden email]> wrote:
 

 Hi Andrew,

Thanks for working on this.

+1 (Non-Binding)

1. Downloaded the binary and verified the md5.
2. Deployed it on a 3-node cluster with 1 ResourceManager and 2 NodeManagers.
3. Set YARN to use the Fair Scheduler.
4. Ran the Pi MapReduce job.
5. Verified that the Hadoop version command output is correct.

Best,

Yufei


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Yongjun Zhang-2
In reply to this post by Andrew Wang
Thanks very much, Andrew, for the work here!

+1 (binding).

- Downloaded both binary and src tarballs
- Verified md5 checksum and signature for both
- Built from the source tarball
- Deployed 2 pseudo clusters, one with the released tarball and the other
  with what I built from source, and did the following on both:
      - Ran basic HDFS operations, snapshots, and distcp jobs
      - Ran a pi job
      - Examined the HDFS and YARN web UIs

Best,

--Yongjun



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Marton Elek
Hi,

I also did a quick smoke test with the provided 3.0.0-alpha2 binaries:

TL;DR: It works well.

Environment:
 * 5 hosts, Docker-based Hadoop cluster, every component in a separate container (5 datanodes / 5 nodemanagers / ...)
 * Components:
   * HDFS/YARN cluster (upgraded from 2.7.3 to 3.0.0-alpha2 using the binary package under vote)
   * Zeppelin 0.6.2/0.7.0-RC2
   * Spark 2.0.2/2.1.0
   * HBase 1.2.4 + ZooKeeper
   * additional Docker containers for configuration management and monitoring
 * No HA, no Kerberos, no wire encryption

 * HDFS cluster upgraded successfully from 2.7.3 (with about 200G of data)
 * Imported 100G of data into HBase successfully
 * Started Spark jobs to process 1G of JSON from HDFS (using a Spark master/slave cluster). This worked even with Zeppelin 0.6.2 + Spark 2.0.2 (with the old Hadoop client included). Obviously the old version can't use the new YARN cluster, as the token file format has changed.
 * Upgraded my setup to Zeppelin 0.7.0-RC2 / Spark 2.1.0 (distribution without Hadoop) / Hadoop 3.0.0-alpha2. It also worked well: processed the same JSON files from HDFS with Spark jobs (from Zeppelin) using the YARN cluster (master: yarn, deploy-mode: cluster)
 * Started Spark jobs (with spark-submit, master: yarn) to count records from the HBase database: OK
 * Started the example MapReduce jobs from the distribution over YARN: OK, but only with specific configuration (see below)

So my overall impression is that it works very well (at least with my 'smalldata').

Some notes (none of them are blocking):

1. To run the example MapReduce jobs I had to define HADOOP_MAPRED_HOME on the command line:
./bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha2.jar pi -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 10 10

And in yarn-site.xml:

yarn.nodemanager.env-whitelist: JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,MAPRED_HOME_DIR

I don't know the exact reason for the change, but 2.7.3 was more user-friendly, as the examples could be run without specific configuration.

For the same reason I didn't start the HBase MapReduce jobs with the hbase command-line app. (There may be an option for hbase to define MAPRED_HOME_DIR as well, but by default I got a ClassNotFoundException for one of the MR classes.)
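
For reference, the whitelist above expressed as a yarn-site.xml property (a sketch; the value is copied verbatim from the setting quoted earlier in this message):

```xml
<!-- yarn-site.xml (sketch; value copied from the setting above) -->
<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,MAPRED_HOME_DIR</value>
</property>
```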

2. For the record: the logging and htrace classes are excluded from the shaded Hadoop client jar, so I added them manually one by one to Spark (the Spark 2.1.0 distribution without Hadoop):

RUN wget `cat url` -O spark.tar.gz && tar zxf spark.tar.gz && rm spark.tar.gz && mv spark* spark
RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-api-3.0.0-alpha2.jar /opt/spark/jars
RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.0.0-alpha2.jar /opt/spark/jars
ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar /opt/spark/jars
ADD https://repo1.maven.org/maven2/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar /opt/spark/jars
ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar /opt/spark/jars/
ADD https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar /opt/spark/jars

With these jar files, Spark 2.1.0 works well with the alpha2 versions of HDFS and YARN.

3. The message "Upgrade in progress. Not yet finalized." hasn't disappeared from the NameNode web UI, but the cluster works well.

Most probably I missed a step, but it's a little bit confusing.

(I checked the REST call; it is the JMX bean that reports the upgrade as not yet finalized. The code of the web page itself seems to be OK.)

Regards
Marton


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Kuhu Shukla
+1 (non-binding)
- Built from source
- Deployed on a pseudo-distributed cluster (Mac)
- Ran wordcount and sleep jobs

    On Wednesday, January 25, 2017 3:21 AM, Marton Elek <[hidden email]> wrote:
 

 Hi,

I also did a quick smoketest with the provided 3.0.0-alpha2 binaries:

TLDR; It works well

Environment:
 * 5 hosts, docker based hadoop cluster, every component in separated container (5 datanode/5 nodemanager/...)
 * Components are:
  * Hdfs/Yarn cluster (upgraded 2.7.3 to 3.0.0-alpha2 using the binary package for vote)
  * Zeppelin 0.6.2/0.7.0-RC2
  * Spark 2.0.2/2.1.0
  * HBase 1.2.4 + zookeeper
  * + additional docker containers for configuration management and monitoring
* No HA, no kerberos, no wire encryption

 * HDFS cluster upgraded successfully from 2.7.3 (with about 200G data)
 * Imported 100G data to HBase successfully
 * Started Spark jobs to process 1G json from HDFS (using spark-master/slave cluster). It worked even when I used the Zeppelin 0.6.2 + Spark 2.0.2 (with old hadoop client included). Obviously the old version can't use the new Yarn cluster as the token file format has been changed.
 * I upgraded my setup to use Zeppelin 0.7.0-RC2/Spark 2.1.0(distribution without hadoop)/hadoop 3.0.0-alpha2. It also worked well: processed the same json files from HDFS with spark jobs (from zeppelin) using yarn cluster (master: yarn deploy-mode: cluster)
 * Started spark jobs (with spark submit, master: yarn) to count records from the hbase database: OK
 * Started example Mapreduce jobs from distribution over yarn. It was OK but only with specific configuration (see bellow)

So my overall impression that it works very well (at least with my 'smalldata')

Some notes (none of them are blocking):

1. To run the example mapreduce jobs I defined HADOOP_MAPRED_HOME at command line:
./bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha2.jar pi -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 10 10

And in the yarn-site:

yarn.nodemanager.env-whitelist: JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,MAPRED_HOME_DIR
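For completeness, the same workaround can be expressed as cluster-wide configuration instead of per-job -D flags. A minimal sketch, assuming these two properties (taken verbatim from the command above) are placed in mapred-site.xml:

```xml
<!-- Sketch only: same env settings as the -D flags above, applied cluster-wide.
     These properties normally live in mapred-site.xml. -->
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}</value>
</property>
<property>
  <name>mapreduce.admin.user.env</name>
  <value>HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}</value>
</property>
```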

I don't know the exact reason for the change, but 2.7.3 was more user-friendly, as the examples could be run without specific configuration.

For the same reason I didn't start the hbase mapreduce job with the hbase command-line app (there could be some option for hbase to define MAPRED_HOME_DIR as well, but by default I got a ClassNotFoundException for one of the MR classes).

2. For the record: the logging and htrace classes are excluded from the shaded hadoop client jar, so I added them manually one by one to spark (spark 2.1.0 distribution without hadoop):

RUN wget `cat url` -O spark.tar.gz && tar zxf spark.tar.gz && rm spark.tar.gz && mv spark* spark
RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-api-3.0.0-alpha2.jar /opt/spark/jars
RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.0.0-alpha2.jar /opt/spark/jars
ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar /opt/spark/jars
ADD https://repo1.maven.org/maven2/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar /opt/spark/jars
ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar /opt/spark/jars/
ADD https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar /opt/spark/jars

With these jar files, spark 2.1.0 works well with the alpha2 versions of HDFS and YARN.

3. The message "Upgrade in progress. Not yet finalized." never disappeared from the namenode webui, but the cluster works well.

Most probably I missed a step, but it's a little bit confusing.

(I checked the REST call; it is the JMX bean that reports the upgrade as not yet finalized, and the code of the webpage seems to be OK.)
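To illustrate, the JMX check described above looks roughly like this. The bean and attribute names here are assumptions based on the NameNode's NameNodeInfo bean (verify against your build), and a stand-in payload replaces the live curl call:

```shell
# Hedged sketch of the JMX check. On a live cluster this would be roughly:
#   curl -s 'http://<namenode>:9870/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo'
# (bean/attribute names are assumptions). A classic, non-rolling HDFS upgrade
# is finalized with: hdfs dfsadmin -finalizeUpgrade
payload='{"beans":[{"name":"Hadoop:service=NameNode,name=NameNodeInfo","UpgradeFinalized":false}]}'
echo "$payload" | grep -o '"UpgradeFinalized":[a-z]*'   # prints "UpgradeFinalized":false
```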

Regards
Marton

On Jan 25, 2017, at 8:38 AM, Yongjun Zhang <[hidden email]> wrote:

Thanks Andrew much for the work here!

+1 (binding).

- Downloaded both binary and src tarballs
- Verified md5 checksum and signature for both
- Built from source tarball
- Deployed 2 pseudo clusters, one with the released tarball and the other
 with what I built from source, and did the following on both:
    - Run basic HDFS operations, snapshots and distcp jobs
    - Run pi job
    - Examined HDFS webui, YARN webui.
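For reference, the checksum half of the verification steps above can be sketched as follows. This is a minimal sketch using a stand-in file; for the real RC you would download the tarball plus its .md5 and .asc sidecars and also import the release KEYS into gpg for the signature check:

```shell
# Minimal sketch of release checksum verification, using a stand-in file.
echo "release payload" > hadoop-3.0.0-alpha2.tar.gz        # stand-in for the real tarball
md5sum hadoop-3.0.0-alpha2.tar.gz > hadoop-3.0.0-alpha2.tar.gz.md5
md5sum -c hadoop-3.0.0-alpha2.tar.gz.md5                   # prints "...: OK" on a match
# Signature check on the real artifacts (needs the release KEYS imported):
# gpg --verify hadoop-3.0.0-alpha2.tar.gz.asc hadoop-3.0.0-alpha2.tar.gz
```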

Best,

--Yongjun


On Tue, Jan 24, 2017 at 3:56 PM, Eric Badger <[hidden email]> wrote:

+1 (non-binding)
- Verified signatures and md5
- Built from source
- Started single-node cluster on my mac
- Ran some sleep jobs
Eric

  On Tuesday, January 24, 2017 4:32 PM, Yufei Gu <[hidden email]> wrote:


Hi Andrew,

Thanks for working on this.

+1 (Non-Binding)

1. Downloaded the binary and verified the md5.
2. Deployed it on 3 node cluster with 1 ResourceManager and 2 NodeManager.
3. Set YARN to use Fair Scheduler.
4. Ran the MapReduce Pi example job.
5. Verified Hadoop version command output is correct.

Best,

Yufei

On Tue, Jan 24, 2017 at 3:02 AM, Marton Elek <[hidden email]> wrote:

The minicluster jar is kind of weird on filesystems that don't support mixed
case, like OS X's default HFS+.

$ jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
LICENSE.txt
license/
license/LICENSE
license/LICENSE.dom-documentation.txt
license/LICENSE.dom-software.txt
license/LICENSE.sax.txt
license/NOTICE
license/README.dom.txt
license/README.sax.txt
LICENSE
Grizzly_THIRDPARTYLICENSEREADME.txt
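For what it's worth, the case collision in the listing above can be detected mechanically: strip trailing slashes, lowercase every entry, and look for duplicates. This sketch feeds in a few entries copied from the listing; on a real jar you would pipe in `jar tf <jar>` instead:

```shell
# Find entries that collide on a case-insensitive filesystem such as HFS+.
# Sample entries are taken from the jar listing above.
printf '%s\n' LICENSE.txt license/ license/LICENSE license/NOTICE LICENSE \
  | sed 's:/$::' | tr '[:upper:]' '[:lower:]' | sort | uniq -d   # prints "license"
```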


I added a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to
add the missing META-INF/LICENSE.txt to the shaded files.

Question: what should be done with the other LICENSE files in the
minicluster? Can we just exclude them (from a legal point of view)?

Regards,
Marton








   

Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Zhihai Xu-3
Thanks Andrew for creating the Hadoop 3.0.0-alpha2 RC0 release.
+1 (non-binding)

--Downloaded source and built from it.
--Deployed on a pseudo-distributed cluster.
--Ran sample MR jobs and tested basic HDFS operations.
--Did a sanity check for RM and NM UI.

Best,
zhihai

On Wed, Jan 25, 2017 at 8:07 AM, Kuhu Shukla <[hidden email]>
wrote:


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Xiao Chen
Thanks to Andrew and the community for working out the alpha2 RC!

+1 (non-binding)

   - Built the source tarball
   - Tested on a pseudo-distributed cluster; basic HDFS operations and a sample
   pi job over an HDFS encryption zone work.
   - Sanity checked NN and KMS webui
   - Sanity checked NN/DN/KMS logs.


-Xiao

On Wed, Jan 25, 2017 at 9:41 AM, Zhihai Xu <[hidden email]> wrote:


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

Karthik Kambatla-2
In reply to this post by Andrew Wang
Thanks for driving the alphas, Andrew. I don't see the need to restart the
vote and I feel it is okay to fix the minor issues before releasing.

+1 (binding). Downloaded source, stood up a pseudo-distributed cluster with
FairScheduler, ran example jobs, and played around with the UI.

Thanks
Karthik


On Fri, Jan 20, 2017 at 2:36 PM, Andrew Wang <[hidden email]>
wrote:
