[Vote] Merge branch-trunk-win to trunk

Suresh Srinivas-2
I posted a heads-up about merging branch-trunk-win to trunk on Feb 8th. I
am happy to announce that we are ready for the merge.

Here is a brief recap on the highlights of the work done:
- Command-line scripts for the Hadoop surface area
- Mapping the HDFS permissions model to Windows
- Abstracted and reconciled mismatches around differences in Path semantics
in Java and Windows
- Native Task Controller for Windows
- Implementation of a Block Placement Policy to support cloud environments,
more specifically Azure.
- Implementation of Hadoop native libraries for Windows (compression
codecs, native I/O)
- Fixes for several reliability issues, including race conditions,
intermittent test failures, and resource leaks.
- Several new unit test cases written for the above changes

Please find the details of the work in CHANGES.branch-trunk-win.txt -
Common changes<http://bit.ly/Xe7Ynv>, HDFS changes<http://bit.ly/13QOSo9>,
and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is the work
ported from branch-1-win to a branch based on trunk.

For details of the testing done, please see the thread -
http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562<
https://issues.apache.org/jira/browse/HADOOP-8562>.

This was a large undertaking that involved developing code and testing the
entire Hadoop stack, including scale tests. It was made possible only by
contributions from many folks in the community. The following people
contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha,
Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur
Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas
Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya
Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh
Srinivas and Sanjay Radia. Many others contributed as well by
providing feedback and comments on numerous JIRAs.

The vote will run for seven days and will end on March 5, 6:00PM PST.

Regards,
Suresh




On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
<[hidden email]> wrote:

> It is super exciting to look at the prospect of these changes being merged
> to trunk. Having Windows as one of the supported Hadoop platforms is a
> fantastic opportunity both for the Hadoop project and Microsoft customers.
>
> This work began around a year ago when a few of us started with a basic
> port of Hadoop to Windows. Since then, the Hadoop team at Microsoft has
> made significant progress in the following areas:
> (PS: Some of these items are already included in Suresh's email, but
> including again for completeness)
>
> - Command-line scripts for the Hadoop surface area
> - Mapping the HDFS permissions model to Windows
> - Abstracted and reconciled mismatches around differences in Path
> semantics in Java and Windows
> - Native Task Controller for Windows
> - Implementation of a Block Placement Policy to support cloud
> environments, more specifically Azure.
> - Implementation of Hadoop native libraries for Windows (compression
> codecs, native I/O)
> - Several reliability issues, including race-conditions, intermittent
> test failures, resource leaks.
> - Several new unit test cases written for the above changes
>
> In the process, we have closely engaged with the Apache open source
> community and have got great support and assistance from the community in
> terms of contributing fixes, code review comments and commits.
>
> In addition, the Hadoop team at Microsoft has also made good progress in
> other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many of
> these changes have already been committed to the respective trunks with
> help from various committers and contributors. It is great to see the
> commitment of the community to support multiple platforms, and we look
> forward to the day when a developer/customer is able to successfully deploy
> a complete solution stack based on Apache Hadoop releases.
>
> Next Steps:
>
> All of the above changes are part of the Windows Azure HDInsight and
> HDInsight Server products from Microsoft. We have successfully on-boarded
> several internal customers and have been running production workloads on
> Windows Azure HDInsight. Our vision is to create a big data platform based
> on Hadoop, and we are committed to helping make Hadoop a world-class
> solution that anyone can use to solve their biggest data challenges.
>
> As an immediate next step, we would like to have a discussion around how
> we can ensure that the quality of the mainline Hadoop branches on Windows
> is maintained. To this end, we would like to get to the state where we have
> pre-checkin validation gates and nightly test runs enabled on Windows. If
> you have any suggestions around this, please do send an email.  We are
> committed to helping sustain the long-term quality of Hadoop on both Linux
> and Windows.
>
> We sincerely thank the community for their contribution and support so
> far, and we hope to continue this close engagement in the future.
>
> -Microsoft HDInsight Team
>
>
> -----Original Message-----
> From: Suresh Srinivas [mailto:[hidden email]]
> Sent: Thursday, February 7, 2013 5:42 PM
> To: [hidden email]; [hidden email];
> [hidden email]; [hidden email]
> Subject: Heads up - merge branch-trunk-win to trunk
>
> The support for Hadoop on Windows was proposed in HADOOP-8079<
> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year ago. The
> goal was to make Hadoop natively integrated, full-featured, and performance
> and scalability tuned on Windows Server or Windows Azure.
> We are happy to announce that a lot of progress has been made in this
> regard.
>
> Initial work started in a feature branch, branch-1-win, based on branch-1.
> The details related to the work done in the branch can be seen in
> CHANGES.txt<
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup
> >.
> This work has been ported to a branch, branch-trunk-win, based on trunk.
> Merge patch for this is available on
> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>.
>
> Highlights of the work done so far:
> 1. Necessary changes in Hadoop to run natively on Windows. These changes
> handle differences in platforms related to path names, process/task
> management etc.
> 2. Addition of winutils tools for managing file permissions and ownership,
> user group mapping, hardlinks, symbolic links, chmod, disk utilization, and
> process/task management.
> 3. Added cmd scripts equivalent to existing shell scripts
> hadoop-daemon.sh, start and stop scripts.
> 4. Addition of a block placement policy implementation to support cloud
> environments, more specifically Azure.
>
> We are very close to wrapping up the work in branch-trunk-win and getting
> ready for a merge. Currently the merge patch is passing close to 100% of
> unit tests on Linux. Soon I will call for a vote to merge this branch into
> trunk.
>
> Next steps:
> 1. Call for vote to merge branch-trunk-win to trunk, when the work
> completes and precommit build is clean.
> 2. Start a discussion on adding Jenkins precommit builds on Windows and
> how to integrate them with the existing commit process.
>
> Let me know if you have any questions.
>
> Regards,
> Suresh
>
>


--
http://hortonworks.com/download/

RE: [Vote] Merge branch-trunk-win to trunk

Bikas Saha
+1

As someone who has been part of this effort from inception, I am glad that
we have reached this stable state in the project on both branches of
Hadoop.
It has been a great collaboration across teams and engineers and opens up
Hadoop to a whole new set of deployments and developers!

Bikas


Re: [Vote] Merge branch-trunk-win to trunk

Chris Nauroth
+1 (non-binding)

I've been testing and patching this branch for the past several months, and
I believe we've reached stability for a merge to trunk.  I want to point
out once again that the branch has been tested on Linux to build confidence
that regressions were not introduced on existing platforms.  I also want to
second the comments from Bikas about how wonderful the collaboration has
been!

Thank you,
--Chris



Re: [Vote] Merge branch-trunk-win to trunk

Robert Evans
In reply to this post by Suresh Srinivas-2
After this is merged in, is Windows still going to be a second-class
citizen that happens to work for more than just development, or is it a
fully supported platform where a breakage can block a release?
How do we as a community intend to keep Windows support from breaking?
We don't have any Jenkins slaves to run nightly tests that would
validate everything still compiles/runs.  This is not a blocker for me
because we often rely on individuals and groups to test Hadoop, but I do
think we need to have this discussion before we put it in.

--Bobby



Re: [Vote] Merge branch-trunk-win to trunk

Harsh J-2
I have a similar personal concern as Robert: does this bring about a
development process change? Do all new features need to work on
Windows as well to go into trunk (immediately or eventually; either
way this requires a new policy for all of us devs)? Not that anyone
would avoid doing that; I just ask because it could impact the time
and effort required for any major undertaking.

While most of the project today is cross-platform (via Java, etc.,
which thankfully removes path handling problems and the like),
performance improvements at least are moving to the native side these
days, which is where I see this having some impact. We've not been
perfectly successful in keeping the natives continuously working on
Solaris/etc. in the past, mainly due to the platform focus of the
majority (not all) of devs working on the project(s). Some form of
development policy here would ensure that Windows support for the
features we intend to ship does not diverge so much that we end up
having to maintain docs as well, detailing each task yet to be done
(these tend to grow if allowed).

It is also useful to note that in another OSS project, KDE, the
working Windows builds are usually 1-2 releases behind (for example,
the KDE release is currently at 4.10, but the last released Windows
port is still 4.8 today). KDE uses Qt, which is cross-platform by
itself, but there's still a port team and a ported release maintained
separately (but under the same org.) because the major development
happens on Linux. The same is true for *BSD as well. My own patches
there have at some point caused trouble because I did something that
I only tested on one platform, and a bit later things were revised to
support the other platforms where it was needlessly breaking.

Or if I am being too concerned about feature/performance divergence, let me know.

On Wed, Feb 27, 2013 at 9:47 PM, Robert Evans <[hidden email]> wrote:

> After this is merged in is Windows still going to be a second class
> citizen but happens to work for more than just development or is it a
> fully supported platform where if something breaks it can block a release?
>  How do we as a community intend to keep Windows support from breaking?
> We don't have any Jenkins slaves to be able to run nightly tests to
> validate everything still compiles/runs.  This is not a blocker for me
> because we often rely on individuals and groups to test Hadoop, but I do
> think we need to have this discussion before we put it in.
>
> --Bobby



--
Harsh J

Re: [Vote] Merge branch-trunk-win to trunk

Eli Collins
In reply to this post by Robert Evans
Bobby raises some good questions.  A related one: since most current
developers won't add Windows support for new features that are
platform specific, is it assumed that Windows development will lag,
or will people actively work on keeping Windows up with the latest?
And vice versa in case Windows support is implemented first.

Is there a jira for resolving the outstanding TODOs in the code base
(similar to HDFS-2148)?  Looks like this merge doesn't introduce many,
which is great (I just did a quick diff and grep).

Thanks,
Eli

On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]> wrote:

> After this is merged in is Windows still going to be a second class
> citizen but happens to work for more than just development or is it a
> fully supported platform where if something breaks it can block a release?
>  How do we as a community intend to keep Windows support from breaking?
> We don't have any Jenkins slaves to be able to run nightly tests to
> validate everything still compiles/runs.  This is not a blocker for me
> because we often rely on individuals and groups to test Hadoop, but I do
> think we need to have this discussion before we put it in.
>
> --Bobby

Re: [Vote] Merge branch-trunk-win to trunk

Arpit Agarwal
+1 non-binding.

I have extensively tested this on both Windows and Linux over the last few
months.

Thanks,
-Arpit

On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]> wrote:

> Bobby raises some good questions.  A related one, since most current
> developers won't add Windows support for new features that are
> platform specific is it assumed that Windows development will either
> lag or will people actively work on keeping Windows up with the
> latest?  And vice versa in case Windows support is implemented first.
>
> Is there a jira for resolving the outstanding TODOs in the code base
> (similar to HDFS-2148)?  Looks like this merge doesn't introduce many
> which is great (just did a quick diff and grep).
>
> Thanks,
> Eli

Re: [Vote] Merge branch-trunk-win to trunk

Suresh Srinivas-2
In reply to this post by Eli Collins
Thanks for raising good questions.

Currently the merge patch passes all the tests on Linux, hence
the proposal for merging the patch to trunk. But as Bobby, Harsh
and Eli pointed out, before declaring support for Windows, we need
to discuss the following:

1. Precommit and development process
Jenkins infrastructure for Windows builds will be made available.
Giri and Microsoft contributors have volunteered to help make
this happen.

With that, we need to decide how our precommit process should look.
My inclination is to wait for +1 from precommit builds on
both platforms to ensure no issues are introduced.
Thoughts?

2. Feature development impact
Some questions have been raised about whether new features would
need to be supported on both platforms. Yes. I do not see a
reason why features cannot work on both platforms, with
the exception of platform-specific optimizations. This is what Java
gives us.

3. Platform specific features/optimizations
As regards platform specific optimization, each platform can
evolve at its own pace and should not block progress of a
specific platform.

As indicated in my earlier email, there is a sizable number
of contributors to work on issues and support Hadoop on the Windows
platform. I am excited to see Hadoop reach the other large
part of the server market.

Eli, as you pointed out, the TODO items need to be addressed.
Also, we realized we still need to add information on how to
build on Windows in BUILDING.txt. We will address this ASAP.
Giri and Matt have some experience with this and should be able
to provide more information.



On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]> wrote:

> Bobby raises some good questions.  A related one, since most current
> developers won't add Windows support for new features that are
> platform specific is it assumed that Windows development will either
> lag or will people actively work on keeping Windows up with the
> latest?  And vice versa in case Windows support is implemented first.
>
> Is there a jira for resolving the outstanding TODOs in the code base
> (similar to HDFS-2148)?  Looks like this merge doesn't introduce many
> which is great (just did a quick diff and grep).
>
> Thanks,
> Eli
>
> On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]> wrote:
> > After this is merged in is Windows still going to be a second class
> > citizen but happens to work for more than just development or is it a
> > fully supported platform where if something breaks it can block a
> > release?
> > How do we as a community intend to keep Windows support from breaking?
> > We don't have any Jenkins slaves to be able to run nightly tests to
> > validate everything still compiles/runs.  This is not a blocker for me
> > because we often rely on individuals and groups to test Hadoop, but I do
> > think we need to have this discussion before we put it in.
> >
> > --Bobby
> >
> > On 2/26/13 4:55 PM, "Suresh Srinivas" <[hidden email]> wrote:
> >
> >>I had posted heads up about merging branch-trunk-win to trunk on Feb 8th.
> >>I
> >>am happy to announce that we are ready for the merge.
> >>
> >>Here is a brief recap on the highlights of the work done:
> >>- Command-line scripts for the Hadoop surface area
> >>- Mapping the HDFS permissions model to Windows
> >>- Abstracted and reconciled mismatches around differences in Path
> >>semantics
> >>in Java and Windows
> >>- Native Task Controller for Windows
> >>- Implementation of a Block Placement Policy to support cloud
> >>environments,
> >>more specifically Azure.
> >>- Implementation of Hadoop native libraries for Windows (compression
> >>codecs, native I/O)
> >>- Several reliability issues, including race-conditions, intermittent
> test
> >>failures, resource leaks.
> >>- Several new unit test cases written for the above changes
> >>
> >>Please find the details of the work in CHANGES.branch-trunk-win.txt -
> >>Common changes<http://bit.ly/Xe7Ynv>, HDFS changes<http://bit.ly/13QOSo9
> >,
> >>and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is the work
> >>ported from branch-1-win to a branch based on trunk.
> >>
> >>For details of the testing done, please see the thread -
> >>http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562<
> >>https://issues.apache.org/jira/browse/HADOOP-8562>.
> >>
> >>This was a large undertaking that involved developing code, testing the
> >>entire Hadoop stack, including scale tests. This is made possible only
> >>with
> >>the contribution from many many folks in the community. Following people
> >>contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha,
> >>Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
> Sumadhur
> >>Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas
> >>Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya
> >>Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze,
> Suresh
> >>Srinivas and Sanjay Radia. There are many others who contributed as well
> >>providing feedback and comments on numerous jiras.
> >>
> >>The vote will run for seven days and will end on March 5, 6:00PM PST.
> >>
> >>Regards,
> >>Suresh
> >>
> >>
> >>
> >>
> >>On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> >><[hidden email]>wrote:
> >>
> >>> It is super exciting to look at the prospect of these changes being
> >>>merged
> >>> to trunk. Having Windows as one of the supported Hadoop platforms is a
> >>> fantastic opportunity both for the Hadoop project and Microsoft
> >>>customers.
> >>>
> >>> This work began around a year back when a few of us started with a
> basic
> >>> port of Hadoop on Windows. Ever since, the Hadoop team in Microsoft
> have
> >>> made significant progress in the following areas:
> >>> (PS: Some of these items are already included in Suresh's email, but
> >>> including again for completeness)
> >>>
> >>> - Command-line scripts for the Hadoop surface area
> >>> - Mapping the HDFS permissions model to Windows
> >>> - Abstracted and reconciled mismatches around differences in Path
> >>> semantics in Java and Windows
> >>> - Native Task Controller for Windows
> >>> - Implementation of a Block Placement Policy to support cloud
> >>> environments, more specifically Azure.
> >>> - Implementation of Hadoop native libraries for Windows (compression
> >>> codecs, native I/O) - Several reliability issues, including
> >>> race-conditions, intermittent test failures, resource leaks.
> >>> - Several new unit test cases written for the above changes
> >>>
> >>> In the process, we have closely engaged with the Apache open source
> >>> community and have got great support and assistance from the community
> >>>in
> >>> terms of contributing fixes, code review comments and commits.
> >>>
> >>> In addition, the Hadoop team at Microsoft has also made good progress
> in
> >>> other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many
> >>>of
> >>> these changes have already been committed to the respective trunks with
> >>> help from various committers and contributors. It is great to see the
> >>> commitment of the community to support multiple platforms, and we look
> >>> forward to the day when a developer/customer is able to successfully
> >>>deploy
> >>> a complete solution stack based on Apache Hadoop releases.
> >>>
> >>> Next Steps:
> >>>
> >>> All of the above changes are part of the Windows Azure HDInsight and
> >>> HDInsight Server products from Microsoft. We have successfully
> >>>on-boarded
> >>> several internal customers and have been running production workloads
> on
> >>> Windows Azure HDInsight. Our vision is to create a big data platform
> >>>based
> >>> on Hadoop, and we are committed to helping make Hadoop a world-class
> >>> solution that anyone can use to solve their biggest data challenges.
> >>>
> >>> As an immediate next step, we would like to have a discussion around
> how
> >>> we can ensure that the quality of the mainline Hadoop branches on
> >>>Windows
> >>> is maintained. To this end, we would like to get to the state where we
> >>>have
> >>> pre-checkin validation gates and nightly test runs enabled on Windows.
> >>>If
> >>> you have any suggestions around this, please do send an email.  We are
> >>> committed to helping sustain the long-term quality of Hadoop on both
> >>>Linux
> >>> and Windows.
> >>>
> >>> We sincerely thank the community for their contribution and support so
> >>> far. And hope to continue having a close engagement in the future.
> >>>
> >>> -Microsoft HDInsight Team
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Suresh Srinivas [mailto:[hidden email]]
> >>> Sent: Thursday, February 7, 2013 5:42 PM
> >>> To: [hidden email]; [hidden email];
> >>> [hidden email]; [hidden email]
> >>> Subject: Heads up - merge branch-trunk-win to trunk
> >>>
> >>> The support for Hadoop on Windows was proposed in HADOOP-8079<
> >>> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year ago.
> >>>The
> >>> goal was to make Hadoop natively integrated, full-featured, and
> >>>performance
> >>> and scalability tuned on Windows Server or Windows Azure.
> >>> We are happy to announce that a lot of progress has been made in this
> >>> regard.
> >>>
> >>> Initial work started in a feature branch, branch-1-win, based on
> >>>branch-1.
> >>> The details related to the work done in the branch can be seen in
> >>> CHANGES.txt<
> >>>
> >>>
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.
> >>>branch-1-win.txt?view=markup
> >>> >.
> >>> This work has been ported to a branch, branch-trunk-win, based on
> trunk.
> >>> Merge patch for this is available on
> >>> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
> >>> .
> >>>
> >>> Highlights of the work done so far:
> >>> 1. Necessary changes in Hadoop to run natively on Windows. These
> changes
> >>> handle differences in platforms related to path names, process/task
> >>> management etc.
> >>> 2. Addition of winutils tools for managing file permissions and
> >>>ownership,
> >>> user group mapping, hardlinks, symbolic links, chmod, disk utilization,
> >>>and
> >>> process/task management.
> >>> 3. Added cmd scripts equivalent to existing shell scripts
> >>> hadoop-daemon.sh, start and stop scripts.
> >>> 4. Addition of block placement policy implemnation to support cloud
> >>> enviroment, more specifically Azure.
> >>>
> >>> We are very close to wrapping up the work in branch-trunk-win and
> >>>getting
> >>> ready for a merge. Currently the merge patch is passing close to 100%
> of
> >>> unit tests on Linux. Soon I will call for a vote to merge this branch
> >>>into
> >>> trunk.
> >>>
> >>> Next steps:
> >>> 1. Call for vote to merge branch-trunk-win to trunk, when the work
> >>> completes and precommit build is clean.
> >>> 2. Start a discussion on adding Jenkins precommit builds on windows and
> >>> how to integrate that with the existing commit process.
> >>>
> >>> Let me know if you have any questions.
> >>>
> >>> Regards,
> >>> Suresh
> >>>
> >>>
> >>
> >>
> >>--
> >>http://hortonworks.com/download/
> >
>



--
http://hortonworks.com/download/

Re: [Vote] Merge branch-trunk-win to trunk

Todd Lipcon
On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas <[hidden email]> wrote:

> With that we need to decide how our precommit process looks.
> My inclination is to wait for +1 from precommit builds on
> both the platforms to ensure no issues are introduced.
> Thoughts?
>
> 2. Feature development impact
> Some questions have been raised about would new features
> need to be supported on both the platforms. Yes. I do not see a
> reason why features cannot work on both the platforms, with
> the exception of platform specific optimizations. This what Java
> gives us.
>
>
I'm concerned about the above. Personally, I don't have access to any
Windows boxes with development tools, and I know nothing about developing
on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated,
for powerpoint :)

If I submit a patch and it gets -1 "tests failed" on the Windows slave, how
am I supposed to proceed?

I think a reasonable compromise would be that the tests should always
*build* on Windows before commit, and contributors should do their best to
look at the test logs for any Windows-specific failures. But, beyond
looking at the logs, a "-1 Tests failed on windows" should not block a
commit.

Those contributors who are interested in Windows being a first-class
platform should be responsible for watching the Windows builds and
debugging/fixing any regressions that might be Windows-specific.

I also think the KDE model that Harsh pointed out is an interesting one --
i.e. the idea that we would not merge Windows support to trunk, but rather
treat it as a "parallel code line" which lives in the ASF and has its own
builds and releases. The Windows team would periodically merge trunk->win
to pick up any new changes, and do a separate test/release process. I'm not
convinced this is the best idea, but it is worth discussing the pros and cons.

-Todd





--
Todd Lipcon
Software Engineer, Cloudera

RE: [Vote] Merge branch-trunk-win to trunk

Ivan Mitic
+1 (non-binding)

I am really glad to see this happening! As people already mentioned, this has been a great engineering effort involving many people!


Folks raised some valid concerns below and I thought it would be good to share my 2 cents. In my opinion, we don't have to solve all these problems right now. As we move forward with two platforms, we can start addressing one problem at a time and incrementally improve. In the first iteration, maintaining Hadoop on Windows could just be everyone making a best effort (making sure the Jenkins build succeeds at least). We already have people who are building/running trunk on Windows daily, so they would jump in and fix problems as needed (we've been doing this in branch-trunk-win for a while now). Although I see that problems could arise with platform-specific features/optimizations, I don't think these are frequent, so in most cases everything will just work.

Merging the two branches sooner rather than later does seem like the right thing to do if the ultimate goal is to have Hadoop on both platforms. Now that the port is complete, we will have people at Microsoft (and elsewhere) wanting to contribute features/improvements to the trunk branch. A separate branch would just make things more difficult and confusing for everyone :) Hope this makes sense.

>
> On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]> wrote:
>
> > Bobby raises some good questions.  A related one, since most current
> > developers won't add Windows support for new features that are
> > platform specific is it assumed that Windows development will either
> > lag or will people actively work on keeping Windows up with the
> > latest?  And vice versa in case Windows support is implemented first.
> >
> > Is there a jira for resolving the outstanding TODOs in the code base
> > (similar to HDFS-2148)?  Looks like this merge doesn't introduce
> > many which is great (just did a quick diff and grep).
> >
> > Thanks,
> > Eli
> >
> > On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]>
> wrote:
> > > After this is merged in is Windows still going to be a second
> > > class citizen but happens to work for more than just development
> > > or is it a fully supported platform where if something breaks it
> > > can block a
> > release?
> > >  How do we as a community intend to keep Windows support from breaking?
> > > We don't have any Jenkins slaves to be able to run nightly tests
> > > to validate everything still compiles/runs.  This is not a blocker
> > > for me because we often rely on individuals and groups to test
> > > Hadoop, but I
> do
> > > think we need to have this discussion before we put it in.
> > >
> > > --Bobby
> > >
> > > On 2/26/13 4:55 PM, "Suresh Srinivas" <[hidden email]> wrote:
> > >
> > >>I had posted heads up about merging branch-trunk-win to trunk on
> > >>Feb
> 8th.
> > >>I
> > >>am happy to announce that we are ready for the merge.
> > >>
> > >>Here is a brief recap on the highlights of the work done:
> > >>- Command-line scripts for the Hadoop surface area
> > >>- Mapping the HDFS permissions model to Windows
> > >>- Abstracted and reconciled mismatches around differences in Path
> > >>semantics in Java and Windows
> > >>- Native Task Controller for Windows
> > >>- Implementation of a Block Placement Policy to support cloud
> > >>environments, more specifically Azure.
> > >>- Implementation of Hadoop native libraries for Windows
> > >>(compression codecs, native I/O)
> > >>- Several reliability issues, including race-conditions,
> > >>intermittent
> > test
> > >>failures, resource leaks.
> > >>- Several new unit test cases written for the above changes
> > >>
> > >>Please find the details of the work in
> > >>CHANGES.branch-trunk-win.txt - Common
> > >>changes<http://bit.ly/Xe7Ynv>, HDFS changes<
> http://bit.ly/13QOSo9
> > >,
> > >>and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is
> > >>the
> work
> > >>ported from branch-1-win to a branch based on trunk.
> > >>
> > >>For details of the testing done, please see the thread -
> > >>http://bit.ly/WpavJ4. Merge patch for this is available on
> HADOOP-8562<
> > >>https://issues.apache.org/jira/browse/HADOOP-8562>.
> > >>
> > >>This was a large undertaking that involved developing code,
> > >>testing the entire Hadoop stack, including scale tests. This is
> > >>made possible only with the contribution from many many folks in
> > >>the community. Following
> people
> > >>contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil,
> > >>Bikas
> Saha,
> > >>Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
> > Sumadhur
> > >>Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao,
> Thejas
> > >>Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan,
> > >>Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo
> > >>Nicholas Sze,
> > Suresh
> > >>Srinivas and Sanjay Radia. There are many others who contributed
> > >>as
> well
> > >>providing feedback and comments on numerous jiras.
> > >>
> > >>The vote will run for seven days and will end on March 5, 6:00PM PST.
> > >>
> > >>Regards,
> > >>Suresh
> > >>
> > >>
> > >>
> > >>
> > >>On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> > >><[hidden email]>wrote:
> > >>
> > >>> It is super exciting to look at the prospect of these changes
> > >>>being merged  to trunk. Having Windows as one of the supported
> > >>>Hadoop platforms is
> a
> > >>> fantastic opportunity both for the Hadoop project and Microsoft
> > >>>customers.
> > >>>
> > >>> This work began around a year back when a few of us started with
> > >>> a
> > basic
> > >>> port of Hadoop on Windows. Ever since, the Hadoop team in
> > >>> Microsoft
> > have
> > >>> made significant progress in the following areas:
> > >>> (PS: Some of these items are already included in Suresh's email,
> > >>> but including again for completeness)
> > >>>
> > >>> - Command-line scripts for the Hadoop surface area
> > >>> - Mapping the HDFS permissions model to Windows
> > >>> - Abstracted and reconciled mismatches around differences in
> > >>> Path semantics in Java and Windows
> > >>> - Native Task Controller for Windows
> > >>> - Implementation of a Block Placement Policy to support cloud
> > >>> environments, more specifically Azure.
> > >>> - Implementation of Hadoop native libraries for Windows
> > >>> (compression codecs, native I/O) - Several reliability issues,
> > >>> including race-conditions, intermittent test failures, resource leaks.
> > >>> - Several new unit test cases written for the above changes
> > >>>
> > >>> In the process, we have closely engaged with the Apache open
> > >>> source community and have got great support and assistance from
> > >>> the
> community
> > >>>in
> > >>> terms of contributing fixes, code review comments and commits.
> > >>>
> > >>> In addition, the Hadoop team at Microsoft has also made good
> > >>> progress
> > in
> > >>> other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase.
> Many
> > >>>of
> > >>> these changes have already been committed to the respective
> > >>>trunks
> with
> > >>> help from various committers and contributors. It is great to
> > >>> see the commitment of the community to support multiple
> > >>> platforms, and we
> look
> > >>> forward to the day when a developer/customer is able to
> > >>>successfully deploy  a complete solution stack based on Apache
> > >>>Hadoop releases.
> > >>>
> > >>> Next Steps:
> > >>>
> > >>> All of the above changes are part of the Windows Azure HDInsight
> > >>>and  HDInsight Server products from Microsoft. We have
> > >>>successfully on-boarded  several internal customers and have been
> > >>>running production workloads
> > on
> > >>> Windows Azure HDInsight. Our vision is to create a big data
> > >>>platform based  on Hadoop, and we are committed to helping make
> > >>>Hadoop a world-class  solution that anyone can use to solve their
> > >>>biggest data challenges.
> > >>>
> > >>> As an immediate next step, we would like to have a discussion
> > >>> around
> > how
> > >>> we can ensure that the quality of the mainline Hadoop branches
> > >>>on Windows  is maintained. To this end, we would like to get to
> > >>>the state where
> we
> > >>>have
> > >>> pre-checkin validation gates and nightly test runs enabled on
> Windows.
> > >>>If
> > >>> you have any suggestions around this, please do send an email.  
> > >>>We
> are
> > >>> committed to helping sustain the long-term quality of Hadoop on
> > >>>both Linux  and Windows.
> > >>>
> > >>> We sincerely thank the community for their contribution and
> > >>> support
> so
> > >>> far. And hope to continue having a close engagement in the future.
> > >>>
> > >>> -Microsoft HDInsight Team
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: Suresh Srinivas [mailto:[hidden email]]
> > >>> Sent: Thursday, February 7, 2013 5:42 PM
> > >>> To: [hidden email]; [hidden email];
> > >>> [hidden email]; [hidden email]
> > >>> Subject: Heads up - merge branch-trunk-win to trunk
> > >>>
> > >>> The support for Hadoop on Windows was proposed in HADOOP-8079<
> > >>> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year
> ago.
> > >>>The
> > >>> goal was to make Hadoop natively integrated, full-featured, and
> > >>>performance  and scalability tuned on Windows Server or Windows
> > >>>Azure.
> > >>> We are happy to announce that a lot of progress has been made in
> > >>>this  regard.
> > >>>
> > >>> Initial work started in a feature branch, branch-1-win, based on
> > >>>branch-1.
> > >>> The details related to the work done in the branch can be seen
> > >>> in CHANGES.txt
> > >>> <http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup>.
> > >>> This work has been ported to a branch, branch-trunk-win, based
> > >>> on
> > trunk.
> > >>> Merge patch for this is available on
> > >>> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
> > >>> .
> > >>>
> > >>> Highlights of the work done so far:
> > >>> 1. Necessary changes in Hadoop to run natively on Windows. These
> > changes
> > >>> handle differences in platforms related to path names,
> > >>>process/task  management etc.
> > >>> 2. Addition of winutils tools for managing file permissions and
> > >>>ownership,  user group mapping, hardlinks, symbolic links, chmod,
> > >>>disk
> utilization,
> > >>>and
> > >>> process/task management.
> > >>> 3. Added cmd scripts equivalent to existing shell scripts  
> > >>>hadoop-daemon.sh, start and stop scripts.
> > >>> 4. Addition of block placement policy implementation to support
> > >>> cloud environment, more specifically Azure.
> > >>>
> > >>> We are very close to wrapping up the work in branch-trunk-win
> > >>>and getting  ready for a merge. Currently the merge patch is
> > >>>passing close to 100%
> > of
> > >>> unit tests on Linux. Soon I will call for a vote to merge this
> > >>>branch into  trunk.
> > >>>
> > >>> Next steps:
> > >>> 1. Call for vote to merge branch-trunk-win to trunk, when the
> > >>> work completes and precommit build is clean.
> > >>> 2. Start a discussion on adding Jenkins precommit builds on
> > >>> windows
> and
> > >>> how to integrate that with the existing commit process.
> > >>>
> > >>> Let me know if you have any questions.
> > >>>
> > >>> Regards,
> > >>> Suresh
> > >>>
> > >>>
> > >>
> > >>
> > >>--
> > >>http://hortonworks.com/download/
> > >
> >
>
>
>
> --
> http://hortonworks.com/download/
>



--
Todd Lipcon
Software Engineer, Cloudera

RE: [Vote] Merge branch-trunk-win to trunk

John Gordon
+1 (non-binding)

I want to share my vote of confidence in this community.  If motivated to do so, this community can keep this project cross-platform and continue to rapidly innovate without breaking a sweat.

The day we started working on this, I saw the foundations of greatness in the quality and volume of dev tests, the code itself, and the Apache values themselves.

1.) Hadoop's unit tests and their frameworks are very well thought out and the consideration and energy that went into their design is worthy of praise.  The MiniCluster abstractions utilize very few resources and put all the processes into one JVM for easy debugging.  It is very easy to select specific tests from the full suite to reproduce an issue reported in another environment - like the Jenkins build server or another contributor's environment.  
2.) This community has done an excellent job of incorporating well-placed log messages to make it easy to post mortem troubleshoot most failures.  The logs are very useful, and it is extremely rare that troubleshooting a failure requires debugging a live repro.
3.) Hadoop is written primarily in Java, a cross-platform language that provides its own platform in the form of the JVM to insulate most of the code from the specifics of the OS layer.
4.) CoPDoC - The right priorities, and well stated.


Thank you,

John

-----Original Message-----
From: Ivan Mitic [mailto:[hidden email]]
Sent: Wednesday, February 27, 2013 6:32 PM
To: [hidden email]; [hidden email]
Cc: [hidden email]; [hidden email]
Subject: RE: [Vote] Merge branch-trunk-win to trunk

+1 (non-binding)

I am really glad to see this happening! As people already mentioned, this has been a great engineering effort involving many people!


Folks raised some valid concerns below and I thought it would be good to share my 2 cents. In my opinion, we don't have to solve all these problems right now. As we move forward with two platforms, we can start addressing one problem at a time and incrementally improve. In the first iteration, maintaining Hadoop on Windows could be just everyone trying to do their best effort (make sure Jenkins build succeeds at least). We already have people who are building/running trunk on Windows daily, so they would jump in and fix problems as needed (we've been doing this in branch-trunk-win for a while now). Although I see that the problems could arise with platform specific features/optimizations, I don't think these are frequent, so in most cases everything will just work. Merging the two branches sooner rather than later does seem like the right thing to do if the ultimate goal is to have Hadoop on both platforms. Now that the port has completed, we will have people in Microsoft (and elsewhere) wanting to contribute features/improvements to the trunk branch. A separate branch would just make things more difficult and confusing for everyone :) Hope this makes sense.

-----Original Message-----
From: Todd Lipcon [mailto:[hidden email]]
Sent: Wednesday, February 27, 2013 3:43 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [Vote] Merge branch-trunk-win to trunk

On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas <[hidden email]>wrote:

> With that we need to decide how our precommit process looks.
> My inclination is to wait for +1 from precommit builds on both the
> platforms to ensure no issues are introduced.
> Thoughts?
>
> 2. Feature development impact
> Some questions have been raised about whether new features would need to be
> supported on both the platforms. Yes. I do not see a reason why
> features cannot work on both the platforms, with the exception of
> platform specific optimizations. This is what Java gives us.
>
>
I'm concerned about the above. Personally, I don't have access to any Windows boxes with development tools, and I know nothing about developing on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated, for powerpoint :)

If I submit a patch and it gets -1 "tests failed" on the Windows slave, how am I supposed to proceed?

I think a reasonable compromise would be that the tests should always
*build* on Windows before commit, and contributors should do their best to look at the test logs for any Windows-specific failures. But, beyond looking at the logs, a "-1 Tests failed on windows" should not block a commit.

Those contributors who are interested in Windows being a first-class platform should be responsible for watching the Windows builds and debugging/fixing any regressions that might be Windows-specific.

I also think the KDE model that Harsh pointed out is an interesting one -- i.e. the idea that we would not merge Windows support to trunk, but rather treat it as a "parallel code line" which lives in the ASF and has its own builds and releases. The Windows team would periodically merge trunk->win to pick up any new changes, and do a separate test/release process. I'm not convinced this is the best idea, but it is worth discussing the pros and cons.

-Todd


>
> On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]> wrote:
>
> > Bobby raises some good questions.  A related one, since most current
> > developers won't add Windows support for new features that are
> > platform specific is it assumed that Windows development will either
> > lag or will people actively work on keeping Windows up with the
> > latest?  And vice versa in case Windows support is implemented first.
> >
> > Is there a jira for resolving the outstanding TODOs in the code base
> > (similar to HDFS-2148)?  Looks like this merge doesn't introduce
> > many which is great (just did a quick diff and grep).
> >
> > Thanks,
> > Eli
> >
> > On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]>
> wrote:
> > > After this is merged in is Windows still going to be a second
> > > class citizen but happens to work for more than just development
> > > or is it a fully supported platform where if something breaks it
> > > can block a
> > release?
> > >  How do we as a community intend to keep Windows support from breaking?
> > > We don't have any Jenkins slaves to be able to run nightly tests
> > > to validate everything still compiles/runs.  This is not a blocker
> > > for me because we often rely on individuals and groups to test
> > > Hadoop, but I
> do
> > > think we need to have this discussion before we put it in.
> > >
> > > --Bobby
> > >
> > > On 2/26/13 4:55 PM, "Suresh Srinivas" <[hidden email]> wrote:
> > >
> > >>I had posted heads up about merging branch-trunk-win to trunk on
> > >>Feb
> 8th.
> > >>I
> > >>am happy to announce that we are ready for the merge.
> > >>
> > >>Here is a brief recap on the highlights of the work done:
> > >>- Command-line scripts for the Hadoop surface area
> > >>- Mapping the HDFS permissions model to Windows
> > >>- Abstracted and reconciled mismatches around differences in Path
> > >>semantics in Java and Windows
> > >>- Native Task Controller for Windows
> > >>- Implementation of a Block Placement Policy to support cloud
> > >>environments, more specifically Azure.
> > >>- Implementation of Hadoop native libraries for Windows
> > >>(compression codecs, native I/O)
> > >>- Several reliability issues, including race-conditions,
> > >>intermittent
> > test
> > >>failures, resource leaks.
> > >>- Several new unit test cases written for the above changes
> > >>
> > >>Please find the details of the work in
> > >>CHANGES.branch-trunk-win.txt - Common
> > >>changes<http://bit.ly/Xe7Ynv>, HDFS changes<
> http://bit.ly/13QOSo9
> > >,
> > >>and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is
> > >>the
> work
> > >>ported from branch-1-win to a branch based on trunk.
> > >>
> > >>For details of the testing done, please see the thread -
> > >>http://bit.ly/WpavJ4. Merge patch for this is available on
> HADOOP-8562<
> > >>https://issues.apache.org/jira/browse/HADOOP-8562>.
> > >>
> > >>This was a large undertaking that involved developing code,
> > >>testing the entire Hadoop stack, including scale tests. This is
> > >>made possible only with the contribution from many many folks in
> > >>the community. Following
> people
> > >>contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil,
> > >>Bikas
> Saha,
> > >>Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
> > Sumadhur
> > >>Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao,
> Thejas
> > >>Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan,
> > >>Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo
> > >>Nicholas Sze,
> > Suresh
> > >>Srinivas and Sanjay Radia. There are many others who contributed
> > >>as
> well
> > >>providing feedback and comments on numerous jiras.
> > >>
> > >>The vote will run for seven days and will end on March 5, 6:00PM PST.
> > >>
> > >>Regards,
> > >>Suresh
> > >>
> > >>
> > >>
> > >>
> > >>On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> > >><[hidden email]>wrote:
> > >>
> > >>> It is super exciting to look at the prospect of these changes
> > >>>being merged  to trunk. Having Windows as one of the supported
> > >>>Hadoop platforms is
> a
> > >>> fantastic opportunity both for the Hadoop project and Microsoft
> > >>>customers.
> > >>>
> > >>> This work began around a year back when a few of us started with
> > >>> a
> > basic
> > >>> port of Hadoop on Windows. Ever since, the Hadoop team in
> > >>> Microsoft
> > have
> > >>> made significant progress in the following areas:
> > >>> (PS: Some of these items are already included in Suresh's email,
> > >>> but including again for completeness)
> > >>>
> > >>> - Command-line scripts for the Hadoop surface area
> > >>> - Mapping the HDFS permissions model to Windows
> > >>> - Abstracted and reconciled mismatches around differences in
> > >>> Path semantics in Java and Windows
> > >>> - Native Task Controller for Windows
> > >>> - Implementation of a Block Placement Policy to support cloud
> > >>> environments, more specifically Azure.
> > >>> - Implementation of Hadoop native libraries for Windows
> > >>> (compression codecs, native I/O) - Several reliability issues,
> > >>> including race-conditions, intermittent test failures, resource leaks.
> > >>> - Several new unit test cases written for the above changes
> > >>>
> > >>> In the process, we have closely engaged with the Apache open
> > >>> source community and have got great support and assistance from
> > >>> the
> community
> > >>>in
> > >>> terms of contributing fixes, code review comments and commits.
> > >>>
> > >>> In addition, the Hadoop team at Microsoft has also made good
> > >>> progress
> > in
> > >>> other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase.
> Many
> > >>>of
> > >>> these changes have already been committed to the respective
> > >>>trunks
> with
> > >>> help from various committers and contributors. It is great to
> > >>> see the commitment of the community to support multiple
> > >>> platforms, and we
> look
> > >>> forward to the day when a developer/customer is able to
> > >>>successfully deploy  a complete solution stack based on Apache
> > >>>Hadoop releases.
> > >>>
> > >>> Next Steps:
> > >>>
> > >>> All of the above changes are part of the Windows Azure HDInsight
> > >>>and  HDInsight Server products from Microsoft. We have
> > >>>successfully on-boarded  several internal customers and have been
> > >>>running production workloads
> > on
> > >>> Windows Azure HDInsight. Our vision is to create a big data
> > >>>platform based  on Hadoop, and we are committed to helping make
> > >>>Hadoop a world-class  solution that anyone can use to solve their
> > >>>biggest data challenges.
> > >>>
> > >>> As an immediate next step, we would like to have a discussion
> > >>> around
> > how
> > >>> we can ensure that the quality of the mainline Hadoop branches
> > >>>on Windows  is maintained. To this end, we would like to get to
> > >>>the state where
> we
> > >>>have
> > >>> pre-checkin validation gates and nightly test runs enabled on
> Windows.
> > >>>If
> > >>> you have any suggestions around this, please do send an email.  
> > >>>We
> are
> > >>> committed to helping sustain the long-term quality of Hadoop on
> > >>>both Linux  and Windows.
> > >>>
> > >>> We sincerely thank the community for their contribution and
> > >>> support
> so
> > >>> far. And hope to continue having a close engagement in the future.
> > >>>
> > >>> -Microsoft HDInsight Team
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: Suresh Srinivas [mailto:[hidden email]]
> > >>> Sent: Thursday, February 7, 2013 5:42 PM
> > >>> To: [hidden email]; [hidden email];
> > >>> [hidden email]; [hidden email]
> > >>> Subject: Heads up - merge branch-trunk-win to trunk
> > >>>
> > >>> The support for Hadoop on Windows was proposed in HADOOP-8079<
> > >>> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year
> ago.
> > >>>The
> > >>> goal was to make Hadoop natively integrated, full-featured, and
> > >>>performance  and scalability tuned on Windows Server or Windows
> > >>>Azure.
> > >>> We are happy to announce that a lot of progress has been made in
> > >>>this  regard.
> > >>>
> > >>> Initial work started in a feature branch, branch-1-win, based on
> > >>>branch-1.
> > >>> The details related to the work done in the branch can be seen
> > >>> in CHANGES.txt
> > >>> <http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup>.
> > >>> This work has been ported to a branch, branch-trunk-win, based
> > >>> on
> > trunk.
> > >>> Merge patch for this is available on
> > >>> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
> > >>> .
> > >>>
> > >>> Highlights of the work done so far:
> > >>> 1. Necessary changes in Hadoop to run natively on Windows. These
> > changes
> > >>> handle differences in platforms related to path names,
> > >>>process/task  management etc.
> > >>> 2. Addition of winutils tools for managing file permissions and
> > >>>ownership,  user group mapping, hardlinks, symbolic links, chmod,
> > >>>disk
> utilization,
> > >>>and
> > >>> process/task management.
> > >>> 3. Added cmd scripts equivalent to existing shell scripts
> > >>>hadoop-daemon.sh, start and stop scripts.
> > >>> 4. Addition of block placement policy implementation to support
> > >>> cloud environment, more specifically Azure.
> > >>>
> > >>> We are very close to wrapping up the work in branch-trunk-win
> > >>>and getting  ready for a merge. Currently the merge patch is
> > >>>passing close to 100%
> > of
> > >>> unit tests on Linux. Soon I will call for a vote to merge this
> > >>>branch into  trunk.
> > >>>
> > >>> Next steps:
> > >>> 1. Call for vote to merge branch-trunk-win to trunk, when the
> > >>> work completes and precommit build is clean.
> > >>> 2. Start a discussion on adding Jenkins precommit builds on
> > >>> windows
> and
> > >>> how to integrate that with the existing commit process.
> > >>>
> > >>> Let me know if you have any questions.
> > >>>
> > >>> Regards,
> > >>> Suresh
> > >>>
> > >>>
> > >>
> > >>
> > >>--
> > >>http://hortonworks.com/download/
> > >
> >
>
>
>
> --
> http://hortonworks.com/download/
>



--
Todd Lipcon
Software Engineer, Cloudera




Re: [Vote] Merge branch-trunk-win to trunk

Chris Nauroth
I'd like to share a few anecdotes about cross-platform development,
hopefully to address some of the concerns about adding overhead to the
development process.  By reviewing past cases of cross-platform Linux vs.
Windows bugs, we can get a sense of how the development process could look
in the future.

HADOOP-9131: TestLocalFileSystem#testListStatusWithColons cannot run on
Windows.  As part of an earlier jira, HADOOP-8962, there was a new test
committed on trunk covering the case of a local file system interaction on
a file containing a ':'.  On Windows, ':' in a path has special meaning as
part of the drive specifier (i.e. C:), so this test cannot pass when
running on Windows.  In this kind of case, the cross-platform bug is
obvious, and the fix is obvious (assumeTrue(!Shell.WINDOWS)).  Ideally,
this would get fixed pre-commit after seeing a -1 from the Windows Jenkins
slave.
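
For illustration, the kind of guard mentioned above can be sketched without
any test framework.  This is only a sketch under stated assumptions:
isWindows is a hypothetical stand-in for Hadoop's Shell.WINDOWS flag (assumed
here to be derived from the os.name system property), and runUnlessWindows is
a hypothetical helper that mimics how JUnit's assumeTrue skips a test instead
of failing it.

```java
final class PlatformGuardSketch {

    /** Hypothetical stand-in for Shell.WINDOWS; assumes detection via the OS name. */
    static boolean isWindows(String osName) {
        return osName != null && osName.startsWith("Windows");
    }

    /**
     * Runs the test body only when the platform is not Windows.
     * Returns "SKIPPED" when the guard fires and "RAN" otherwise,
     * loosely mirroring the skip-not-fail behavior of assumeTrue.
     */
    static String runUnlessWindows(String osName, Runnable testBody) {
        if (isWindows(osName)) {
            return "SKIPPED"; // a ':' in a file name cannot be exercised here
        }
        testBody.run();
        return "RAN";
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        // The body is a placeholder for a test that touches a file named "a:b".
        String outcome = runUnlessWindows(osName, () -> { });
        System.out.println(osName + ": " + outcome);
    }
}
```

The point of skipping rather than failing is that a test which can never pass
on one platform stays out of that platform's failure count, so a -1 from the
Windows build keeps meaning that something actually regressed.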

HDFS-4274: BlockPoolSliceScanner does not close verification log during
shutdown.  This caused problems for MiniDFSCluster-based tests running on
Windows.  Failure to close the verification log meant that we didn't
release file locks, so the tests couldn't delete/recreate working
directories during teardown/setup.  Arguably, this was always a bug, and
running on Windows just exposed it because of its stricter rules about file
locking.  This is a more complex fix, but it doesn't require
platform-specific knowledge.  If some future patch accidentally regresses
this, then we'll likely see +1 from Linux Jenkins and -1 from Windows
Jenkins.  Ideally, it would get fixed pre-commit, because it doesn't
require Windows-specific knowledge.  There is also the matter of impact.
Re-breaking this would re-break many test suites on Windows.
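
As a rough illustration of the pattern behind this class of bug (not the
actual HDFS-4274 patch), the sketch below closes a log file before teardown
tries to delete the working directory.  The class, method, and file names are
hypothetical.  On Linux the delete would usually succeed even with the handle
still open, which is why this kind of leak tends to surface only on Windows.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class CloseBeforeTeardownSketch {

    /**
     * Writes a hypothetical verification log, closes it, then removes the
     * working directory. Returns true if the directory is gone afterwards.
     */
    static boolean writeLogThenTearDown() {
        try {
            Path workDir = Files.createTempDirectory("scanner-sketch");
            Path log = workDir.resolve("verification.log");

            // try-with-resources closes the writer, releasing the file handle
            // (and, on Windows, the lock that would block deletion).
            try (Writer writer = Files.newBufferedWriter(log, StandardCharsets.UTF_8)) {
                writer.write("block verified\n");
            }

            // Teardown: safe on every platform because the handle is closed.
            Files.delete(log);
            Files.delete(workDir);
            return Files.notExists(workDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("working directory removed: " + writeLogThenTearDown());
    }
}
```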

HADOOP-9232: JniBasedUnixGroupsMappingWithFallback fails on Windows with
UnsatisfiedLinkError.  This was introduced by HADOOP-8712, which switched
to JniBasedUnixGroupsMappingWithFallback as the default
hadoop.security.group.mapping, but did not provide a Windows implementation
of the JNI function.  In this case, there was a strong desire to get
HADOOP-8712 into a release, fixing it on Windows required native Windows
API knowledge, and Windows users had a simple workaround available by
changing their configs back to ShellBasedUnixGroupsMapping.  I think this
is the kind of situation where we could allow HADOOP-8712 to commit despite
-1 from Windows Jenkins, with fairly quick follow-up from an engineer with
the Windows expertise to fix it.

To summarize, I don't think it needs to differ greatly from our current
development process.  We're all responsible for breadth of understanding
and maintenance of the whole codebase, but we also rely on specific
individuals with deep expertise in particular areas for certain issues.
Sometimes we commit despite a -1 from Jenkins, based on the community's
judgment.

Virtualization greatly simplifies cross-platform development.  I use
VirtualBox on a Mac host and run VMs for Windows and Ubuntu with a shared
drive so that they can all see the same copy of the source code.  There are
plenty of variations on this depending on your preference, such as
offloading the VMs to a separate server or cloud service to free up local
RAM.  I'm planning on submitting BUILDING.txt changes later today that
fully describe how to build on Windows.  After some initial setup, it's
nearly identical to the mvn commands that you already use today.

Hope this helps,
--Chris


On Thu, Feb 28, 2013 at 3:25 AM, John Gordon <[hidden email]>wrote:

> +1 (non-binding)
>
> I want to share my vote of confidence in this community.  If motivated to
> do so, this community can keep this project cross-platform and continue to
> rapidly innovate without breaking a sweat.
>
> The day we started working on this, I saw the foundations of greatness in
> the quality and volume of dev tests, the code itself, and the Apache values
> themselves.
>
> 1.) Hadoop's unit tests and their frameworks are very well thought out and
> the consideration and energy that went into their design is worthy of
> praise.  The MiniCluster abstractions utilize very few resources and put
> all the processes into one JVM for easy debugging.  It is very easy to
> select specific tests from the full suite to reproduce an issue reported in
> another environment - like the Jenkins build server or another
> contributor's environment.
> 2.) This community has done an excellent job of incorporating well-placed
> log messages to make it easy to post mortem troubleshoot most failures.
>  The logs are very useful, and it is extremely rare that troubleshooting a
> failure requires debugging a live repro.
> 3.) Hadoop is written primarily in Java, a cross-platform language that
> provides its own platform in the form of the JVM to insulate most of the
> code from the specifics of the OS layer.
> 4.) CoPDoC - The right priorities, and well stated.
>
>
> Thank you,
>
> John
>
> -----Original Message-----
> From: Ivan Mitic [mailto:[hidden email]]
> Sent: Wednesday, February 27, 2013 6:32 PM
> To: [hidden email]; [hidden email]
> Cc: [hidden email]; [hidden email]
> Subject: RE: [Vote] Merge branch-trunk-win to trunk
>
> +1 (non-binding)
>
> I am really glad to see this happening! As people already mentioned, this
> has been a great engineering effort involving many people!
>
>
> Folks raised some valid concerns below and I thought it would be good to
> share my 2 cents. In my opinion, we don't have to solve all these problems
> right now. As we move forward with two platforms, we can start addressing
> one problem at a time and incrementally improve. In the first iteration,
> maintaining Hadoop on Windows could be just everyone trying to do their
> best effort (make sure Jenkins build succeeds at least). We already have
> people who are building/running trunk on Windows daily, so they would jump
> in and fix problems as needed (we've been doing this in branch-trunk-win
> for a while now). Although I see that the problems could arise with
> platform specific features/optimizations, I don't think these are frequent,
> so in most cases everything will just work. Merging the two branches sooner
> rather than later does seem like the right thing to do if the ultimate
> goal is to have Hadoop on both platforms. Now that the port has completed,
> we will have people in Microsoft (and elsewhere) wanting to contribute
> features/improvements to the trunk branch. A separate branch would just
> make things more difficult and confusing for everyone :) Hope this makes
> sense.
>
> -----Original Message-----
> From: Todd Lipcon [mailto:[hidden email]]
> Sent: Wednesday, February 27, 2013 3:43 PM
> To: [hidden email]
> Cc: [hidden email]; [hidden email];
> [hidden email]
> Subject: Re: [Vote] Merge branch-trunk-win to trunk
>
> On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas <[hidden email]
> >wrote:
>
> > With that we need to decide how our precommit process looks.
> > My inclination is to wait for +1 from precommit builds on both the
> > platforms to ensure no issues are introduced.
> > Thoughts?
> >
> > 2. Feature development impact
> > Some questions have been raised about whether new features would need to be
> > supported on both the platforms. Yes. I do not see a reason why
> > features cannot work on both the platforms, with the exception of
> > platform specific optimizations. This is what Java gives us.
> >
> >
> I'm concerned about the above. Personally, I don't have access to any
> Windows boxes with development tools, and I know nothing about developing
> on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated,
> for powerpoint :)
>
> If I submit a patch and it gets -1 "tests failed" on the Windows slave,
> how am I supposed to proceed?
>
> I think a reasonable compromise would be that the tests should always
> *build* on Windows before commit, and contributors should do their best to
> look at the test logs for any Windows-specific failures. But, beyond
> looking at the logs, a "-1 Tests failed on windows" should not block a
> commit.
>
> Those contributors who are interested in Windows being a first-class
> platform should be responsible for watching the Windows builds and
> debugging/fixing any regressions that might be Windows-specific.
>
> I also think the KDE model that Harsh pointed out is an interesting one --
> i.e. the idea that we would not merge Windows support to trunk, but rather
> treat it as a "parallel code line" which lives in the ASF and has its own
> builds and releases. The windows team would periodically merge trunk->win
> to pick up any new changes, and do a separate test/release process. I'm not
> convinced this is the best idea, but worth discussion of pros and cons.
>
> -Todd
>
>
> >
> > On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]> wrote:
> >
> > > Bobby raises some good questions.  A related one, since most current
> > > developers won't add Windows support for new features that are
> > > platform specific is it assumed that Windows development will either
> > > lag or will people actively work on keeping Windows up with the
> > > latest?  And vice versa in case Windows support is implemented first.
> > >
> > > Is there a jira for resolving the outstanding TODOs in the code base
> > > (similar to HDFS-2148)?  Looks like this merge doesn't introduce
> > > many which is great (just did a quick diff and grep).
> > >
> > > Thanks,
> > > Eli
> > >
> > > On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]>
> > wrote:
> > > > After this is merged in is Windows still going to be a second
> > > > class citizen but happens to work for more than just development
> > > > or is it a fully supported platform where if something breaks it
> > > > can block a
> > > release?
> > > >  How do we as a community intend to keep Windows support from
> breaking?
> > > > We don't have any Jenkins slaves to be able to run nightly tests
> > > > to validate everything still compiles/runs.  This is not a blocker
> > > > for me because we often rely on individuals and groups to test
> > > > Hadoop, but I
> > do
> > > > think we need to have this discussion before we put it in.
> > > >
> > > > --Bobby
> > > >
> > > > On 2/26/13 4:55 PM, "Suresh Srinivas" <[hidden email]>
> wrote:
> > > >
> > > >>I had posted heads up about merging branch-trunk-win to trunk on
> > > >>Feb
> > 8th.
> > > >>I
> > > >>am happy to announce that we are ready for the merge.
> > > >>
> > > >>Here is a brief recap on the highlights of the work done:
> > > >>- Command-line scripts for the Hadoop surface area
> > > >>- Mapping the HDFS permissions model to Windows
> > > >>- Abstracted and reconciled mismatches around differences in Path
> > > >>semantics in Java and Windows
> > > >>- Native Task Controller for Windows
> > > >>- Implementation of a Block Placement Policy to support cloud
> > > >>environments, more specifically Azure.
> > > >>- Implementation of Hadoop native libraries for Windows
> > > >>(compression codecs, native I/O)
> > > >>- Several reliability issues, including race-conditions,
> > > >>intermittent
> > > test
> > > >>failures, resource leaks.
> > > >>- Several new unit test cases written for the above changes
> > > >>
> > > >>Please find the details of the work in
> > > >>CHANGES.branch-trunk-win.txt - Common
> > > >>changes<http://bit.ly/Xe7Ynv>, HDFS changes<
> > http://bit.ly/13QOSo9
> > > >,
> > > >>and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is
> > > >>the
> > work
> > > >>ported from branch-1-win to a branch based on trunk.
> > > >>
> > > >>For details of the testing done, please see the thread -
> > > >>http://bit.ly/WpavJ4. Merge patch for this is available on
> > HADOOP-8562<
> > > >>https://issues.apache.org/jira/browse/HADOOP-8562>.
> > > >>
> > > >>This was a large undertaking that involved developing code,
> > > >>testing the entire Hadoop stack, including scale tests. This is
> > > >>made possible only with the contribution from many many folks in
> > > >>the community. Following
> > people
> > > >>contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil,
> > > >>Bikas
> > Saha,
> > > >>Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
> > > Sumadhur
> > > >>Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao,
> > Thejas
> > > >>Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan,
> > > >>Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo
> > > >>Nicholas Sze,
> > > Suresh
> > > >>Srinivas and Sanjay Radia. There are many others who contributed
> > > >>as
> > well
> > > >>providing feedback and comments on numerous jiras.
> > > >>
> > > >>The vote will run for seven days and will end on March 5, 6:00PM PST.
> > > >>
> > > >>Regards,
> > > >>Suresh
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> > > >><[hidden email]>wrote:
> > > >>
> > > >>> It is super exciting to look at the prospect of these changes
> > > >>>being merged  to trunk. Having Windows as one of the supported
> > > >>>Hadoop platforms is
> > a
> > > >>> fantastic opportunity both for the Hadoop project and Microsoft
> > > >>>customers.
> > > >>>
> > > >>> This work began around a year back when a few of us started with
> > > >>> a
> > > basic
> > > >>> port of Hadoop on Windows. Ever since, the Hadoop team in
> > > >>> Microsoft
> > > have
> > > >>> made significant progress in the following areas:
> > > >>> (PS: Some of these items are already included in Suresh's email,
> > > >>> but including again for completeness)
> > > >>>
> > > >>> - Command-line scripts for the Hadoop surface area
> > > >>> - Mapping the HDFS permissions model to Windows
> > > >>> - Abstracted and reconciled mismatches around differences in
> > > >>> Path semantics in Java and Windows
> > > >>> - Native Task Controller for Windows
> > > >>> - Implementation of a Block Placement Policy to support cloud
> > > >>> environments, more specifically Azure.
> > > >>> - Implementation of Hadoop native libraries for Windows
> > > >>> (compression codecs, native I/O)
> > > >>> - Several reliability issues, including race-conditions,
> > > >>> intermittent test failures, resource leaks.
> > > >>> - Several new unit test cases written for the above changes
> > > >>>
> > > >>> In the process, we have closely engaged with the Apache open
> > > >>> source community and have got great support and assistance from
> > > >>> the
> > community
> > > >>>in
> > > >>> terms of contributing fixes, code review comments and commits.
> > > >>>
> > > >>> In addition, the Hadoop team at Microsoft has also made good
> > > >>> progress
> > > in
> > > >>> other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase.
> > Many
> > > >>>of
> > > >>> these changes have already been committed to the respective
> > > >>>trunks
> > with
> > > >>> help from various committers and contributors. It is great to
> > > >>> see the commitment of the community to support multiple
> > > >>> platforms, and we
> > look
> > > >>> forward to the day when a developer/customer is able to
> > > >>>successfully deploy  a complete solution stack based on Apache
> > > >>>Hadoop releases.
> > > >>>
> > > >>> Next Steps:
> > > >>>
> > > >>> All of the above changes are part of the Windows Azure HDInsight
> > > >>>and  HDInsight Server products from Microsoft. We have
> > > >>>successfully on-boarded  several internal customers and have been
> > > >>>running production workloads
> > > on
> > > >>> Windows Azure HDInsight. Our vision is to create a big data
> > > >>>platform based  on Hadoop, and we are committed to helping make
> > > >>>Hadoop a world-class  solution that anyone can use to solve their
> > > >>>biggest data challenges.
> > > >>>
> > > >>> As an immediate next step, we would like to have a discussion
> > > >>> around
> > > how
> > > >>> we can ensure that the quality of the mainline Hadoop branches
> > > >>>on Windows  is maintained. To this end, we would like to get to
> > > >>>the state where
> > we
> > > >>>have
> > > >>> pre-checkin validation gates and nightly test runs enabled on
> > Windows.
> > > >>>If
> > > >>> you have any suggestions around this, please do send an email.
> > > >>>We
> > are
> > > >>> committed to helping sustain the long-term quality of Hadoop on
> > > >>>both Linux  and Windows.
> > > >>>
> > > >>> We sincerely thank the community for their contribution and
> > > >>> support
> > so
> > > >>> far. And hope to continue having a close engagement in the future.
> > > >>>
> > > >>> -Microsoft HDInsight Team
> > > >>>
> > > >>>
> > > >>> -----Original Message-----
> > > >>> From: Suresh Srinivas [mailto:[hidden email]]
> > > >>> Sent: Thursday, February 7, 2013 5:42 PM
> > > >>> To: [hidden email]; [hidden email];
> > > >>> [hidden email]; [hidden email]
> > > >>> Subject: Heads up - merge branch-trunk-win to trunk
> > > >>>
> > > >>> The support for Hadoop on Windows was proposed in HADOOP-8079<
> > > >>> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year
> > ago.
> > > >>>The
> > > >>> goal was to make Hadoop natively integrated, full-featured, and
> > > >>>performance  and scalability tuned on Windows Server or Windows
> > > >>>Azure.
> > > >>> We are happy to announce that a lot of progress has been made in
> > > >>>this  regard.
> > > >>>
> > > >>> Initial work started in a feature branch, branch-1-win, based on
> > > >>>branch-1.
> > > >>> The details related to the work done in the branch can be seen
> > > >>> in CHANGES.txt<
> > > >>> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup
> > > >>> >.
> > > >>> This work has been ported to a branch, branch-trunk-win, based
> > > >>> on
> > > trunk.
> > > >>> Merge patch for this is available on
> > > >>> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
> > > >>> .
> > > >>>
> > > >>> Highlights of the work done so far:
> > > >>> 1. Necessary changes in Hadoop to run natively on Windows. These
> > > changes
> > > >>> handle differences in platforms related to path names,
> > > >>>process/task  management etc.
> > > >>> 2. Addition of winutils tools for managing file permissions and
> > > >>>ownership,  user group mapping, hardlinks, symbolic links, chmod,
> > > >>>disk
> > utilization,
> > > >>>and
> > > >>> process/task management.
> > > >>> 3. Added cmd scripts equivalent to existing shell scripts
> > > >>>hadoop-daemon.sh, start and stop scripts.
> > > >>> 4. Addition of block placement policy implementation to support
> > > >>> cloud environment, more specifically Azure.
> > > >>>
> > > >>> We are very close to wrapping up the work in branch-trunk-win
> > > >>>and getting  ready for a merge. Currently the merge patch is
> > > >>>passing close to 100%
> > > of
> > > >>> unit tests on Linux. Soon I will call for a vote to merge this
> > > >>>branch into  trunk.
> > > >>>
> > > >>> Next steps:
> > > >>> 1. Call for vote to merge branch-trunk-win to trunk, when the
> > > >>> work completes and precommit build is clean.
> > > >>> 2. Start a discussion on adding Jenkins precommit builds on
> > > >>> windows
> > and
> > > >>> how to integrate that with the existing commit process.
> > > >>>
> > > >>> Let me know if you have any questions.
> > > >>>
> > > >>> Regards,
> > > >>> Suresh
> > > >>>
> > > >>>
> > > >>
> > > >>
> > > >>--
> > > >>http://hortonworks.com/download/
> > > >
> > >
> >
> >
> >
> > --
> > http://hortonworks.com/download/
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
>
>
>
>

Re: [Vote] Merge branch-trunk-win to trunk

Raja Aluri
In reply to this post by Suresh Srinivas-2
+1 non-binding
Nice to see that this work is going to trunk.

Raja  Aluri


On Tue, Feb 26, 2013 at 2:55 PM, Suresh Srinivas <[hidden email]>wrote:

> I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I
> am happy to announce that we are ready for the merge.
>
> Here is a brief recap on the highlights of the work done:
> - Command-line scripts for the Hadoop surface area
> - Mapping the HDFS permissions model to Windows
> - Abstracted and reconciled mismatches around differences in Path semantics
> in Java and Windows
> - Native Task Controller for Windows
> - Implementation of a Block Placement Policy to support cloud environments,
> more specifically Azure.
> - Implementation of Hadoop native libraries for Windows (compression
> codecs, native I/O)
> - Several reliability issues, including race-conditions, intermittent test
> failures, resource leaks.
> - Several new unit test cases written for the above changes
>
> Please find the details of the work in CHANGES.branch-trunk-win.txt -
> Common changes<http://bit.ly/Xe7Ynv>, HDFS changes<http://bit.ly/13QOSo9>,
> and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is the work
> ported from branch-1-win to a branch based on trunk.
>
> For details of the testing done, please see the thread -
> http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562<
> https://issues.apache.org/jira/browse/HADOOP-8562>.
>
> This was a large undertaking that involved developing code, testing the
> entire Hadoop stack, including scale tests. This is made possible only with
> the contribution from many many folks in the community. Following people
> contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha,
> Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur
> Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas
> Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya
> Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh
> Srinivas and Sanjay Radia. There are many others who contributed as well
> providing feedback and comments on numerous jiras.
>
> The vote will run for seven days and will end on March 5, 6:00PM PST.
>
> Regards,
> Suresh
>
>
>
>
> On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> <[hidden email]>wrote:
>
> > It is super exciting to look at the prospect of these changes being
> merged
> > to trunk. Having Windows as one of the supported Hadoop platforms is a
> > fantastic opportunity both for the Hadoop project and Microsoft
> customers.
> >
> > This work began around a year back when a few of us started with a basic
> > port of Hadoop on Windows. Ever since, the Hadoop team in Microsoft have
> > made significant progress in the following areas:
> > (PS: Some of these items are already included in Suresh's email, but
> > including again for completeness)
> >
> > - Command-line scripts for the Hadoop surface area
> > - Mapping the HDFS permissions model to Windows
> > - Abstracted and reconciled mismatches around differences in Path
> > semantics in Java and Windows
> > - Native Task Controller for Windows
> > - Implementation of a Block Placement Policy to support cloud
> > environments, more specifically Azure.
> > - Implementation of Hadoop native libraries for Windows (compression
> > codecs, native I/O)
> > - Several reliability issues, including race-conditions, intermittent
> > test failures, resource leaks.
> > - Several new unit test cases written for the above changes
> >
> > In the process, we have closely engaged with the Apache open source
> > community and have got great support and assistance from the community in
> > terms of contributing fixes, code review comments and commits.
> >
> > In addition, the Hadoop team at Microsoft has also made good progress in
> > other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many of
> > these changes have already been committed to the respective trunks with
> > help from various committers and contributors. It is great to see the
> > commitment of the community to support multiple platforms, and we look
> > forward to the day when a developer/customer is able to successfully
> deploy
> > a complete solution stack based on Apache Hadoop releases.
> >
> > Next Steps:
> >
> > All of the above changes are part of the Windows Azure HDInsight and
> > HDInsight Server products from Microsoft. We have successfully on-boarded
> > several internal customers and have been running production workloads on
> > Windows Azure HDInsight. Our vision is to create a big data platform
> based
> > on Hadoop, and we are committed to helping make Hadoop a world-class
> > solution that anyone can use to solve their biggest data challenges.
> >
> > As an immediate next step, we would like to have a discussion around how
> > we can ensure that the quality of the mainline Hadoop branches on Windows
> > is maintained. To this end, we would like to get to the state where we
> have
> > pre-checkin validation gates and nightly test runs enabled on Windows. If
> > you have any suggestions around this, please do send an email.  We are
> > committed to helping sustain the long-term quality of Hadoop on both
> Linux
> > and Windows.
> >
> > We sincerely thank the community for their contribution and support so
> > far. And hope to continue having a close engagement in the future.
> >
> > -Microsoft HDInsight Team
> >
> >
> > -----Original Message-----
> > From: Suresh Srinivas [mailto:[hidden email]]
> > Sent: Thursday, February 7, 2013 5:42 PM
> > To: [hidden email]; [hidden email];
> > [hidden email]; [hidden email]
> > Subject: Heads up - merge branch-trunk-win to trunk
> >
> > The support for Hadoop on Windows was proposed in HADOOP-8079<
> > https://issues.apache.org/jira/browse/HADOOP-8079> almost a year ago.
> The
> > goal was to make Hadoop natively integrated, full-featured, and
> performance
> > and scalability tuned on Windows Server or Windows Azure.
> > We are happy to announce that a lot of progress has been made in this
> > regard.
> >
> > Initial work started in a feature branch, branch-1-win, based on
> branch-1.
> > The details related to the work done in the branch can be seen in
> > CHANGES.txt<
> >
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup
> > >.
> > This work has been ported to a branch, branch-trunk-win, based on trunk.
> > Merge patch for this is available on
> > HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
> > .
> >
> > Highlights of the work done so far:
> > 1. Necessary changes in Hadoop to run natively on Windows. These changes
> > handle differences in platforms related to path names, process/task
> > management etc.
> > 2. Addition of winutils tools for managing file permissions and
> ownership,
> > user group mapping, hardlinks, symbolic links, chmod, disk utilization,
> and
> > process/task management.
> > 3. Added cmd scripts equivalent to existing shell scripts
> > hadoop-daemon.sh, start and stop scripts.
> > 4. Addition of block placement policy implementation to support cloud
> > environment, more specifically Azure.
> >
> > We are very close to wrapping up the work in branch-trunk-win and getting
> > ready for a merge. Currently the merge patch is passing close to 100% of
> > unit tests on Linux. Soon I will call for a vote to merge this branch
> into
> > trunk.
> >
> > Next steps:
> > 1. Call for vote to merge branch-trunk-win to trunk, when the work
> > completes and precommit build is clean.
> > 2. Start a discussion on adding Jenkins precommit builds on windows and
> > how to integrate that with the existing commit process.
> >
> > Let me know if you have any questions.
> >
> > Regards,
> > Suresh
> >
> >
>
>
> --
> http://hortonworks.com/download/
>

RE: [Vote] Merge branch-trunk-win to trunk

Kanna Karanam
+1 non-binding

I have been playing with it for several months in a multi-node Windows cluster environment and have found it very stable. I am sure it will help bring more developers like me (Java and Windows developers) to contribute and help the Hadoop customer and developer communities.

Thanks,
Kanna

-----Original Message-----
From: Raja Aluri [mailto:[hidden email]]
Sent: Thursday, February 28, 2013 11:17 AM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [Vote] Merge branch-trunk-win to trunk

+1 non-binding
Nice to see that this work is going to trunk.

Raja  Aluri


On Tue, Feb 26, 2013 at 2:55 PM, Suresh Srinivas <[hidden email]>wrote:

> I had posted heads up about merging branch-trunk-win to trunk on Feb
> 8th. I am happy to announce that we are ready for the merge.
>
> Here is a brief recap on the highlights of the work done:
> - Command-line scripts for the Hadoop surface area
> - Mapping the HDFS permissions model to Windows
> - Abstracted and reconciled mismatches around differences in Path
> semantics in Java and Windows
> - Native Task Controller for Windows
> - Implementation of a Block Placement Policy to support cloud
> environments, more specifically Azure.
> - Implementation of Hadoop native libraries for Windows (compression
> codecs, native I/O)
> - Several reliability issues, including race-conditions, intermittent
> test failures, resource leaks.
> - Several new unit test cases written for the above changes
>
> Please find the details of the work in CHANGES.branch-trunk-win.txt -
> Common changes<http://bit.ly/Xe7Ynv>, HDFS
> changes<http://bit.ly/13QOSo9>, and YARN and MapReduce changes
> <http://bit.ly/128zzMt>. This is the work ported from branch-1-win to a branch based on trunk.
>
> For details of the testing done, please see the thread -
> http://bit.ly/WpavJ4. Merge patch for this is available on
> HADOOP-8562< https://issues.apache.org/jira/browse/HADOOP-8562>.
>
> This was a large undertaking that involved developing code, testing
> the entire Hadoop stack, including scale tests. This is made possible
> only with the contribution from many many folks in the community.
> Following people contributed to this work: Ivan Mitic, Chuan Liu,
> Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris
> Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz,
> Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja
> Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp,
> Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia.
> There are many others who contributed as well providing feedback and comments on numerous jiras.
>
> The vote will run for seven days and will end on March 5, 6:00PM PST.
>
> Regards,
> Suresh
>
>
>
>
> On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> <[hidden email]>wrote:
>
> > It is super exciting to look at the prospect of these changes being
> merged
> > to trunk. Having Windows as one of the supported Hadoop platforms is
> > a fantastic opportunity both for the Hadoop project and Microsoft
> customers.
> >
> > This work began around a year back when a few of us started with a
> > basic port of Hadoop on Windows. Ever since, the Hadoop team in
> > Microsoft have made significant progress in the following areas:
> > (PS: Some of these items are already included in Suresh's email, but
> > including again for completeness)
> >
> > - Command-line scripts for the Hadoop surface area
> > - Mapping the HDFS permissions model to Windows
> > - Abstracted and reconciled mismatches around differences in Path
> > semantics in Java and Windows
> > - Native Task Controller for Windows
> > - Implementation of a Block Placement Policy to support cloud
> > environments, more specifically Azure.
> > - Implementation of Hadoop native libraries for Windows (compression
> > codecs, native I/O)
> > - Several reliability issues, including race-conditions, intermittent
> > test failures, resource leaks.
> > - Several new unit test cases written for the above changes
> >
> > In the process, we have closely engaged with the Apache open source
> > community and have got great support and assistance from the
> > community in terms of contributing fixes, code review comments and commits.
> >
> > In addition, the Hadoop team at Microsoft has also made good
> > progress in other projects including Hive, Pig, Sqoop, Oozie, HCat
> > and HBase. Many of these changes have already been committed to the
> > respective trunks with help from various committers and
> > contributors. It is great to see the commitment of the community to
> > support multiple platforms, and we look forward to the day when a
> > developer/customer is able to successfully
> deploy
> > a complete solution stack based on Apache Hadoop releases.
> >
> > Next Steps:
> >
> > All of the above changes are part of the Windows Azure HDInsight and
> > HDInsight Server products from Microsoft. We have successfully
> > on-boarded several internal customers and have been running
> > production workloads on Windows Azure HDInsight. Our vision is to
> > create a big data platform
> based
> > on Hadoop, and we are committed to helping make Hadoop a world-class
> > solution that anyone can use to solve their biggest data challenges.
> >
> > As an immediate next step, we would like to have a discussion around
> > how we can ensure that the quality of the mainline Hadoop branches
> > on Windows is maintained. To this end, we would like to get to the
> > state where we
> have
> > pre-checkin validation gates and nightly test runs enabled on
> > Windows. If you have any suggestions around this, please do send an
> > email.  We are committed to helping sustain the long-term quality of
> > Hadoop on both
> Linux
> > and Windows.
> >
> > We sincerely thank the community for their contribution and support
> > so far. And hope to continue having a close engagement in the future.
> >
> > -Microsoft HDInsight Team
> >
> >
> > -----Original Message-----
> > From: Suresh Srinivas [mailto:[hidden email]]
> > Sent: Thursday, February 7, 2013 5:42 PM
> > To: [hidden email]; [hidden email];
> > [hidden email]; [hidden email]
> > Subject: Heads up - merge branch-trunk-win to trunk
> >
> > The support for Hadoop on Windows was proposed in HADOOP-8079<
> > https://issues.apache.org/jira/browse/HADOOP-8079> almost a year ago.
> The
> > goal was to make Hadoop natively integrated, full-featured, and
> performance
> > and scalability tuned on Windows Server or Windows Azure.
> > We are happy to announce that a lot of progress has been made in
> > this regard.
> >
> > Initial work started in a feature branch, branch-1-win, based on
> branch-1.
> > The details related to the work done in the branch can be seen in
> > CHANGES.txt<
> > http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup
> > >.
> > This work has been ported to a branch, branch-trunk-win, based on trunk.
> > Merge patch for this is available on
> > HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
> > .
> >
> > Highlights of the work done so far:
> > 1. Necessary changes in Hadoop to run natively on Windows. These
> > changes handle differences in platforms related to path names,
> > process/task management etc.
> > 2. Addition of winutils tools for managing file permissions and
> ownership,
> > user group mapping, hardlinks, symbolic links, chmod, disk
> > utilization,
> and
> > process/task management.
> > 3. Added cmd scripts equivalent to existing shell scripts
> > hadoop-daemon.sh, start and stop scripts.
> > 4. Addition of block placement policy implementation to support cloud
> > environment, more specifically Azure.
> >
> > We are very close to wrapping up the work in branch-trunk-win and
> > getting ready for a merge. Currently the merge patch is passing
> > close to 100% of unit tests on Linux. Soon I will call for a vote to
> > merge this branch
> into
> > trunk.
> >
> > Next steps:
> > 1. Call for vote to merge branch-trunk-win to trunk, when the work
> > completes and precommit build is clean.
> > 2. Start a discussion on adding Jenkins precommit builds on windows
> > and how to integrate that with the existing commit process.
> >
> > Let me know if you have any questions.
> >
> > Regards,
> > Suresh
> >
> >
>
>
> --
> http://hortonworks.com/download/
>

Re: [Vote] Merge branch-trunk-win to trunk

Robert Evans
In reply to this post by Chris Nauroth
My initial question was mostly intended to understand the desired new
classification of Windows after the merge, and how we plan to maintain
Windows support.  I am happy to hear that hardware for Jenkins will be
provided.  I am also fine, at least initially, with us trying to treat
Windows as a first class supported platform.  But I realize that there are
a lot of people that do not have easy access to Windows for
development/debugging, myself included. I also don't want to slow down the
pace of development too much because of this.  It will cause some
organizations that do not use or support Windows to be more likely to run
software that has diverged from an official release.  It also has the
potential to make the patch submission process even more difficult, which
increases the likelihood of submitters abandoning patches.  However, the
great thing about being in a community is we can change if we need to.

I am +0 for the merge.  I am not a Windows expert so I don't feel
comfortable giving it a true +1.

--Bobby


On 2/28/13 10:45 AM, "Chris Nauroth" <[hidden email]> wrote:

>I'd like to share a few anecdotes about developing cross-platform,
>hopefully to address some of the concerns about adding overhead to the
>development process.  By reviewing past cases of cross-platform Linux vs.
>Windows bugs, we can get a sense for how the development process could
>look
>in the future.
>
>HADOOP-9131: TestLocalFileSystem#testListStatusWithColons cannot run on
>Windows.  As part of an earlier jira, HADOOP-8962, there was a new test
>committed on trunk covering the case of a local file system interaction on
>a file containing a ':'.  On Windows, ':' in a path has special meaning as
>part of the drive specifier (i.e. C:), so this test cannot pass when
>running on Windows.  In this kind of case, the cross-platform bug is
>obvious, and the fix is obvious (assumeTrue(!Shell.WINDOWS)).  Ideally,
>this would get fixed pre-commit after seeing a -1 from the Windows Jenkins
>slave.
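
A minimal sketch of the guard described for HADOOP-9131, written against the standard library only so it runs standalone. In the actual test the check quoted above is assumeTrue(!Shell.WINDOWS); the ColonPathGuard class and its method names here are hypothetical and only illustrate the decision being made.

```java
import java.util.Locale;

/** Hypothetical helper showing when a colon-in-filename test should be skipped. */
final class ColonPathGuard {
    /** Derives the platform from an os.name value (an assumption of this sketch). */
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** A test that creates a file whose name contains ':' only makes sense off Windows. */
    static boolean shouldRunColonTest(String osName, String fileName) {
        return !(isWindows(osName) && fileName.contains(":"));
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        if (shouldRunColonTest(os, "file:with:colons")) {
            System.out.println("running colon test on " + os);
        } else {
            System.out.println("skipping colon test on " + os);
        }
    }
}
```

Skipping rather than failing keeps the test meaningful on Linux while leaving the Windows run clean.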
>
>HDFS-4274: BlockPoolSliceScanner does not close verification log during
>shutdown.  This caused problems for MiniDFSCluster-based tests running on
>Windows.  Failure to close the verification log meant that we didn't
>release file locks, so the tests couldn't delete/recreate working
>directories during teardown/setup.  Arguably, this was always a bug, and
>running on Windows just exposed it because of its stricter rules about
>file
>locking.  This is a more complex fix, but it doesn't require
>platform-specific knowledge.  If some future patch accidentally regresses
>this, then we'll likely see +1 from Linux Jenkins and -1 from Windows
>Jenkins.  Ideally, it would get fixed pre-commit, because it doesn't
>require Windows-specific knowledge.  There is also the matter of impact.
> Re-breaking this would re-break many test suites on Windows.
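
The HDFS-4274 pattern can be sketched as follows, assuming a component that holds a log file open while it runs. VerificationLogSketch is a hypothetical stand-in, not the BlockPoolSliceScanner code; it only shows that closing the log during shutdown is what lets teardown delete the working directory on a platform with strict file locking.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a scanner that keeps a log file open while it runs. */
final class VerificationLogSketch implements Closeable {
    private final Writer log;

    VerificationLogSketch(Path logFile) throws IOException {
        this.log = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8);
    }

    void record(String line) throws IOException {
        log.write(line);
        log.write('\n');
    }

    /** Shutdown must close the log; an open handle can block deletion on Windows. */
    @Override
    public void close() throws IOException {
        log.close();
    }

    /** Simulates one test's setup and teardown; returns true if cleanup succeeded. */
    static boolean runAndCleanUp() {
        try {
            Path dir = Files.createTempDirectory("scanner-sketch");
            Path logFile = dir.resolve("verification.log");
            try (VerificationLogSketch scanner = new VerificationLogSketch(logFile)) {
                scanner.record("block-1 verified");
            } // handle released here, so teardown can delete the directory
            Files.delete(logFile);
            Files.delete(dir);
            return Files.notExists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup succeeded: " + runAndCleanUp());
    }
}
```

On Linux the deletion would usually succeed even with the handle still open, which is why this class of bug tends to surface only on the Windows run.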
>
>HADOOP-9232: JniBasedUnixGroupsMappingWithFallback fails on Windows with
>UnsatisfiedLinkError.  This was introduced by HADOOP-8712, which switched
>to JniBasedUnixGroupsMappingWithFallback as the default
>hadoop.security.group.mapping, but did not provide a Windows
>implementation
>of the JNI function.  In this case, there was a strong desire to get
>HADOOP-8712 into a release, fixing it on Windows required native Windows
>API knowledge, and Windows users had a simple workaround available by
>changing their configs back to ShellBasedUnixGroupsMapping.  I think this
>is the kind of situation where we could allow HADOOP-8712 to commit
>despite
>-1 from Windows Jenkins, with fairly quick follow-up from an engineer with
>the Windows expertise to fix it.
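
A generic sketch of the "native with fallback" idea behind HADOOP-9232, assuming the failure shows up as an UnsatisfiedLinkError. This is not Hadoop's implementation: GroupMappingFallbackSketch, the GroupMapping interface, and the library name are all made up for illustration.

```java
/** Hypothetical sketch of choosing a native implementation with a fallback. */
final class GroupMappingFallbackSketch {
    /** Minimal placeholder for a group-mapping strategy. */
    interface GroupMapping {
        String name();
    }

    /** Tries to load a native library; falls back when the platform lacks it. */
    static GroupMapping choose(String nativeLibraryName) {
        try {
            System.loadLibrary(nativeLibraryName);
            return () -> "jni";
        } catch (UnsatisfiedLinkError e) {
            // No native implementation on this platform: degrade instead of failing.
            return () -> "shell";
        }
    }

    public static void main(String[] args) {
        // The library name is made up, so this takes the fallback path.
        System.out.println(choose("no_such_native_lib_for_sketch").name());
    }
}
```

The design point is that a missing native piece on one platform degrades to a slower path instead of breaking that platform outright, which leaves room for a follow-up fix.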
>
>To summarize, I don't think it needs to differ greatly from our current
>development process.  We're all responsible for breadth of understanding
>and maintenance of the whole codebase, but we also rely on specific
>individuals with deep expertise in particular areas for certain issues.
> Sometimes we commit despite a -1 from Jenkins, based on the community's
>judgment.
>
>Virtualization greatly simplifies cross-platform development.  I use
>VirtualBox on a Mac host and run VMs for Windows and Ubuntu with a shared
>drive so that they can all see the same copy of the source code.  There
>are
>plenty of variations on this depending on your preference, such as
>offloading the VMs to a separate server or cloud service to free up local
>RAM.  I'm planning on submitting BUILDING.txt changes later today that
>fully describe how to build on Windows.  After some initial setup, it's
>nearly identical to the mvn commands that you already use today.
>
>Hope this helps,
>--Chris
>
>
>On Thu, Feb 28, 2013 at 3:25 AM, John Gordon
><[hidden email]>wrote:
>
>> +1 (non-binding)
>>
>> I want to share my vote of confidence in this community.  If motivated
>>to
>> do so, this community can keep this project cross-platform and continue
>>to
>> rapidly innovate without breaking a sweat.
>>
>> The day we started working on this, I saw the foundations of greatness
>>in
>> the quality and volume of dev tests, the code itself, and the Apache
>>values
>> themselves.
>>
>> 1.) Hadoop's unit tests and their frameworks are very well thought out
>>and
>> the consideration and energy that went into their design is worthy of
>> praise.  The MiniCluster abstractions utilize very few resources and put
>> all the processes into one JVM for easy debugging.  It is very easy to
>> select specific tests from the full suite to reproduce an issue
>>reported in
>> another environment - like the Jenkins build server or another
>> contributor's environment.
>> 2.) This community has done an excellent job of incorporating
>>well-placed
>> log messages to make it easy to post mortem troubleshoot most failures.
>>  The logs are very useful, and it is extremely rare that
>>troubleshooting a
>> failure requires debugging a live repro.
>> 3.) Hadoop is written primarily in Java, a cross-platform language that
>> provides its own platform in the form of the JVM to insulate most of the
>> code from the specifics of the OS layer.
>> 4.) CoPDoC - The right priorities, and well stated.
>>
>>
>> Thank you,
>>
>> John
>>
>> -----Original Message-----
>> From: Ivan Mitic [mailto:[hidden email]]
>> Sent: Wednesday, February 27, 2013 6:32 PM
>> To: [hidden email]; [hidden email]
>> Cc: [hidden email]; [hidden email]
>> Subject: RE: [Vote] Merge branch-trunk-win to trunk
>>
>> +1 (non-binding)
>>
>> I am really glad to see this happening! As people already mentioned,
>>this
>> has been a great engineering effort involving many people!
>>
>>
>> Folks raised some valid concerns below and I thought it would be good to
>> share my 2 cents. In my opinion, we don't have to solve all these
>>problems
>> right now. As we move forward with two platforms, we can start
>>addressing
>> one problem at a time and incrementally improve. In the first iteration,
>> maintaining Hadoop on Windows could be just everyone trying to do their
>> best effort (make sure Jenkins build succeeds at least). We already have
>> people who are building/running trunk on Windows daily, so they would
>>jump
>> in and fix problems as needed (we've been doing this in branch-trunk-win
>> for a while now). Although I see that the problems could arise with
>> platform specific features/optimizations, I don't think these are
>>frequent,
>> so in most cases everything will just work. Merging the two branches
>>sooner
>> rather than later does seem like the right thing to do if the ultimate
>> goal is to have Hadoop on both platforms. Now that the port has
>>completed,
>> we will have people in Microsoft (and elsewhere) wanting to contribute
>> features/improvements to the trunk branch. A separate branch would just
>> make things more difficult and confusing for everyone :) Hope this makes
>> sense.
>>
>> -----Original Message-----
>> From: Todd Lipcon [mailto:[hidden email]]
>> Sent: Wednesday, February 27, 2013 3:43 PM
>> To: [hidden email]
>> Cc: [hidden email]; [hidden email];
>> [hidden email]
>> Subject: Re: [Vote] Merge branch-trunk-win to trunk
>>
>> On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas <[hidden email]
>> >wrote:
>>
>> > With that we need to decide how our precommit process looks.
>> > My inclination is to wait for +1 from precommit builds on both the
>> > platforms to ensure no issues are introduced.
>> > Thoughts?
>> >
>> > 2. Feature development impact
>> > Some questions have been raised about whether new features would need
>> > to be supported on both the platforms. Yes. I do not see a reason why
>> > features cannot work on both the platforms, with the exception of
>> > platform specific optimizations. This is what Java gives us.
>> >
>> >
>> I'm concerned about the above. Personally, I don't have access to any
>> Windows boxes with development tools, and I know nothing about
>>developing
>> on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated,
>> for powerpoint :)
>>
>> If I submit a patch and it gets -1 "tests failed" on the Windows slave,
>> how am I supposed to proceed?
>>
>> I think a reasonable compromise would be that the tests should always
>> *build* on Windows before commit, and contributors should do their best
>>to
>> look at the test logs for any Windows-specific failures. But, beyond
>> looking at the logs, a "-1 Tests failed on windows" should not block a
>> commit.
>>
>> Those contributors who are interested in Windows being a first-class
>> platform should be responsible for watching the Windows builds and
>> debugging/fixing any regressions that might be Windows-specific.
>>
>> I also think the KDE model that Harsh pointed out is an interesting one
>>--
>> ie the idea that we would not merge windows support to trunk, but rather
>> treat it as a "parallel code line" which lives in the ASF and has its
>>own
>> builds and releases. The windows team would periodically merge
>>trunk->win
>> to pick up any new changes, and do a separate test/release process. I'm
>>not
>> convinced this is the best idea, but worth discussion of pros and cons.
>>
>> -Todd
>>
>>
>> >
>> > On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]>
>>wrote:
>> >
>> > > Bobby raises some good questions.  A related one, since most current
>> > > developers won't add Windows support for new features that are
>> > > platform specific is it assumed that Windows development will either
>> > > lag or will people actively work on keeping Windows up with the
>> > > latest?  And vice versa in case Windows support is implemented
>>first.
>> > >
>> > > Is there a jira for resolving the outstanding TODOs in the code base
>> > > (similar to HDFS-2148)?  Looks like this merge doesn't introduce
>> > > many which is great (just did a quick diff and grep).
>> > >
>> > > Thanks,
>> > > Eli
>> > >
>> > > On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]>
>> > wrote:
>> > > > After this is merged in is Windows still going to be a second
>> > > > class citizen but happens to work for more than just development
>> > > > or is it a fully supported platform where if something breaks it
>> > > > can block a
>> > > release?
>> > > >  How do we as a community intend to keep Windows support from
>> breaking?
>> > > > We don't have any Jenkins slaves to be able to run nightly tests
>> > > > to validate everything still compiles/runs.  This is not a blocker
>> > > > for me because we often rely on individuals and groups to test
>> > > > Hadoop, but I
>> > do
>> > > > think we need to have this discussion before we put it in.
>> > > >
>> > > > --Bobby
>> > > >
>> > > > On 2/26/13 4:55 PM, "Suresh Srinivas" <[hidden email]>
>> wrote:
>> > > >
>> > > >>I had posted heads up about merging branch-trunk-win to trunk on
>> > > >>Feb
>> > 8th.
>> > > >>I
>> > > >>am happy to announce that we are ready for the merge.
>> > > >>
>> > > >>Here is a brief recap on the highlights of the work done:
>> > > >>- Command-line scripts for the Hadoop surface area
>> > > >>- Mapping the HDFS permissions model to Windows
>> > > >>- Abstracted and reconciled mismatches around differences in Path
>> > > >>semantics in Java and Windows
>> > > >>- Native Task Controller for Windows
>> > > >>- Implementation of a Block Placement Policy to support cloud
>> > > >>environments, more specifically Azure.
>> > > >>- Implementation of Hadoop native libraries for Windows
>> > > >>(compression codecs, native I/O)
>> > > >>- Several reliability issues, including race-conditions,
>> > > >>intermittent
>> > > test
>> > > >>failures, resource leaks.
>> > > >>- Several new unit test cases written for the above changes
>> > > >>
>> > > >>Please find the details of the work in
>> > > >>CHANGES.branch-trunk-win.txt - Common
>> > > >>changes<http://bit.ly/Xe7Ynv>, HDFS changes<
>> > http://bit.ly/13QOSo9
>> > > >,
>> > > >>and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is
>> > > >>the
>> > work
>> > > >>ported from branch-1-win to a branch based on trunk.
>> > > >>
>> > > >>For details of the testing done, please see the thread -
>> > > >>http://bit.ly/WpavJ4. Merge patch for this is available on
>> > HADOOP-8562<
>> > > >>https://issues.apache.org/jira/browse/HADOOP-8562>.
>> > > >>
>> > > >>This was a large undertaking that involved developing code,
>> > > >>testing the entire Hadoop stack, including scale tests. This is
>> > > >>made possible only with the contribution from many many folks in
>> > > >>the community. Following
>> > people
>> > > >>contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil,
>> > > >>Bikas
>> > Saha,
>> > > >>Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
>> > > Sumadhur
>> > > >>Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao,
>> > Thejas
>> > > >>Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan,
>> > > >>Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo
>> > > >>Nicholas Sze,
>> > > Suresh
>> > > >>Srinivas and Sanjay Radia. There are many others who contributed
>> > > >>as
>> > well
>> > > >>providing feedback and comments on numerous jiras.
>> > > >>
>> > > >>The vote will run for seven days and will end on March 5, 6:00PM
>>PST.
>> > > >>
>> > > >>Regards,
>> > > >>Suresh
>> > > >>
>> > > >>
>> > > >>
>> > > >>
>> > > >>On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
>> > > >><[hidden email]>wrote:
>> > > >>
>> > > >>> It is super exciting to look at the prospect of these changes
>> > > >>>being merged  to trunk. Having Windows as one of the supported
>> > > >>>Hadoop platforms is
>> > a
>> > > >>> fantastic opportunity both for the Hadoop project and Microsoft
>> > > >>>customers.
>> > > >>>
>> > > >>> This work began around a year back when a few of us started with
>> > > >>> a
>> > > basic
>> > > >>> port of Hadoop on Windows. Ever since, the Hadoop team in
>> > > >>> Microsoft
>> > > have
>> > > >>> made significant progress in the following areas:
>> > > >>> (PS: Some of these items are already included in Suresh's email,
>> > > >>> but including again for completeness)
>> > > >>>
>> > > >>> - Command-line scripts for the Hadoop surface area
>> > > >>> - Mapping the HDFS permissions model to Windows
>> > > >>> - Abstracted and reconciled mismatches around differences in
>> > > >>> Path semantics in Java and Windows
>> > > >>> - Native Task Controller for Windows
>> > > >>> - Implementation of a Block Placement Policy to support cloud
>> > > >>> environments, more specifically Azure.
>> > > >>> - Implementation of Hadoop native libraries for Windows
>> > > >>> (compression codecs, native I/O)
>> > > >>> - Several reliability issues, including race-conditions,
>> > > >>> intermittent test failures, resource leaks.
>> > > >>> - Several new unit test cases written for the above changes
>> > > >>>
>> > > >>> In the process, we have closely engaged with the Apache open
>> > > >>> source community and have got great support and assistance from
>> > > >>> the
>> > community
>> > > >>>in
>> > > >>> terms of contributing fixes, code review comments and commits.
>> > > >>>
>> > > >>> In addition, the Hadoop team at Microsoft has also made good
>> > > >>> progress
>> > > in
>> > > >>> other projects including Hive, Pig, Sqoop, Oozie, HCat and
>>HBase.
>> > Many
>> > > >>>of
>> > > >>> these changes have already been committed to the respective
>> > > >>>trunks
>> > with
>> > > >>> help from various committers and contributors. It is great to
>> > > >>> see the commitment of the community to support multiple
>> > > >>> platforms, and we
>> > look
>> > > >>> forward to the day when a developer/customer is able to
>> > > >>>successfully deploy  a complete solution stack based on Apache
>> > > >>>Hadoop releases.
>> > > >>>
>> > > >>> Next Steps:
>> > > >>>
>> > > >>> All of the above changes are part of the Windows Azure HDInsight
>> > > >>>and  HDInsight Server products from Microsoft. We have
>> > > >>>successfully on-boarded  several internal customers and have been
>> > > >>>running production workloads
>> > > on
>> > > >>> Windows Azure HDInsight. Our vision is to create a big data
>> > > >>>platform based  on Hadoop, and we are committed to helping make
>> > > >>>Hadoop a world-class  solution that anyone can use to solve their
>> > > >>>biggest data challenges.
>> > > >>>
>> > > >>> As an immediate next step, we would like to have a discussion
>> > > >>> around
>> > > how
>> > > >>> we can ensure that the quality of the mainline Hadoop branches
>> > > >>>on Windows  is maintained. To this end, we would like to get to
>> > > >>>the state where
>> > we
>> > > >>>have
>> > > >>> pre-checkin validation gates and nightly test runs enabled on
>> > Windows.
>> > > >>>If
>> > > >>> you have any suggestions around this, please do send an email.
>> > > >>>We
>> > are
>> > > >>> committed to helping sustain the long-term quality of Hadoop on
>> > > >>>both Linux  and Windows.
>> > > >>>
>> > > >>> We sincerely thank the community for their contribution and
>> > > >>> support
>> > so
>> > > >>> far. And hope to continue having a close engagement in the
>>future.
>> > > >>>
>> > > >>> -Microsoft HDInsight Team
>> > > >>>
>> > > >>>
>> > > >>> -----Original Message-----
>> > > >>> From: Suresh Srinivas [mailto:[hidden email]]
>> > > >>> Sent: Thursday, February 7, 2013 5:42 PM
>> > > >>> To: [hidden email]; [hidden email];
>> > > >>> [hidden email]; [hidden email]
>> > > >>> Subject: Heads up - merge branch-trunk-win to trunk
>> > > >>>
>> > > >>> The support for Hadoop on Windows was proposed in HADOOP-8079<
>> > > >>> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year
>> > ago.
>> > > >>>The
>> > > >>> goal was to make Hadoop natively integrated, full-featured, and
>> > > >>>performance  and scalability tuned on Windows Server or Windows
>> > > >>>Azure.
>> > > >>> We are happy to announce that a lot of progress has been made in
>> > > >>>this  regard.
>> > > >>>
>> > > >>> Initial work started in a feature branch, branch-1-win, based on
>> > > >>>branch-1.
>> > > >>> The details related to the work done in the branch can be seen
>> > > >>> in CHANGES.txt
>> > > >>> <http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup>.
>> > > >>> This work has been ported to a branch, branch-trunk-win, based
>> > > >>> on
>> > > trunk.
>> > > >>> Merge patch for this is available on
>> > > >>> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>
>> > > >>> .
>> > > >>>
>> > > >>> Highlights of the work done so far:
>> > > >>> 1. Necessary changes in Hadoop to run natively on Windows. These
>> > > changes
>> > > >>> handle differences in platforms related to path names,
>> > > >>>process/task  management etc.
>> > > >>> 2. Addition of winutils tools for managing file permissions and
>> > > >>>ownership,  user group mapping, hardlinks, symbolic links, chmod,
>> > > >>>disk
>> > utilization,
>> > > >>>and
>> > > >>> process/task management.
>> > > >>> 3. Added cmd scripts equivalent to existing shell scripts
>> > > >>>hadoop-daemon.sh, start and stop scripts.
>> > > >>> 4. Addition of block placement policy implementation to support
>> > > >>> cloud environment, more specifically Azure.
>> > > >>>
>> > > >>> We are very close to wrapping up the work in branch-trunk-win
>> > > >>>and getting  ready for a merge. Currently the merge patch is
>> > > >>>passing close to 100%
>> > > of
>> > > >>> unit tests on Linux. Soon I will call for a vote to merge this
>> > > >>>branch into  trunk.
>> > > >>>
>> > > >>> Next steps:
>> > > >>> 1. Call for vote to merge branch-trunk-win to trunk, when the
>> > > >>> work completes and precommit build is clean.
>> > > >>> 2. Start a discussion on adding Jenkins precommit builds on
>> > > >>> windows
>> > and
>> > > >>> how to integrate that with the existing commit process.
>> > > >>>
>> > > >>> Let me know if you have any questions.
>> > > >>>
>> > > >>> Regards,
>> > > >>> Suresh
>> > > >>>
>> > > >>>
>> > > >>
>> > > >>
>> > > >>--
>> > > >>http://hortonworks.com/download/
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > http://hortonworks.com/download/
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>>
>>
>>
>>


Re: [Vote] Merge branch-trunk-win to trunk

Chris Nauroth
> Is there a jira for resolving the outstanding TODOs in the code base
> (similar to HDFS-2148)?  Looks like this merge doesn't introduce many
> which is great (just did a quick diff and grep).

I found 2 remaining TODOs introduced in the current merge patch.  One is in
ContainerLaunch.java.  The container launch script was trying to set a
CLASSPATH that exceeded the Windows maximum command line length.  The fix
was to wrap the long classpath into an intermediate jar containing only a
manifest file with a Class-Path entry.  (See YARN-316.)  Just to be
conservative, we wrapped this logic in an if (Shell.WINDOWS) guard and
marked a TODO to remove it later and use that approach on all platforms
after additional testing.  I've tested this code path successfully on Mac
too, but several people wanted additional testing and performance checks
before removing the if (Shell.WINDOWS) guard.  That work is tracked in an
existing jira: YARN-358.
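
For readers unfamiliar with the manifest-only jar technique described above, here is a minimal sketch using only the JDK. It is not the code from YARN-316; the class name ClasspathJarSketch and the method createManifestOnlyJar are made up for illustration. It assumes each classpath entry can be written as an absolute file: URI in the Class-Path attribute.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical illustration only; names are not taken from the Hadoop code base.
final class ClasspathJarSketch {

    /**
     * Writes a jar that contains nothing but a manifest whose Class-Path
     * attribute lists the given entries, so the command line only has to
     * carry the path of this one jar.
     */
    static File createManifestOnlyJar(List<File> entries, File jarFile) throws IOException {
        StringBuilder classPath = new StringBuilder();
        for (File entry : entries) {
            if (classPath.length() > 0) {
                classPath.append(' '); // Class-Path entries are space-separated
            }
            // toURI() percent-encodes spaces and appends '/' to existing directories,
            // which is what the Class-Path attribute expects for directory entries.
            classPath.append(entry.toURI().toString());
        }

        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, classPath.toString());

        // Opening and closing the stream writes META-INF/MANIFEST.MF and no other entry.
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jarFile), manifest)) {
            // intentionally empty
        }
        return jarFile;
    }

    public static void main(String[] args) throws IOException {
        List<File> entries = new ArrayList<>();
        for (String arg : args) {
            entries.add(new File(arg));
        }
        File jar = createManifestOnlyJar(entries, File.createTempFile("classpath", ".jar"));
        System.out.println("java -cp " + jar.getAbsolutePath() + " <main class>");
    }
}
```

The launch command then needs only the short path of the generated jar on its classpath, and the JVM resolves the real entries from the manifest. The manifest writer wraps long attribute values itself, so a long Class-Path does not need manual line splitting.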

The other TODO is for winutils to print more usage information and
examples.  At this point, I think winutils is printing sufficient
information, and we can just remove the TODO.  I just submitted a new jira
to start that conversation: HADOOP-9348.

Thank you,
--Chris


On Thu, Feb 28, 2013 at 11:29 AM, Robert Evans <[hidden email]> wrote:

> My initial question was mostly intended to understand the desired new
> classification of Windows after the merge, and how we plan to maintain
> Windows support.  I am happy to hear that hardware for Jenkins will be
> provided.  I am also fine, at least initially, with us trying to treat
> Windows as a first class supported platform.  But I realize that there are
> a lot of people that do not have easy access to Windows for
> development/debugging, myself included. I also don't want to slow down the
> pace of development too much because of this.  It will cause some
> organizations that do not use or support Windows to be more likely to run
> software that has diverged from an official release.  It also has the
> potential to make the patch submission process even more difficult, which
> increases the likelihood of submitters abandoning patches.  However, the
> great thing about being in a community is we can change if we need to.
>
> I am +0 for the merge.  I am not a Windows expert so I don't feel
> comfortable giving it a true +1.
>
> --Bobby
>
>
> On 2/28/13 10:45 AM, "Chris Nauroth" <[hidden email]> wrote:
>
> >I'd like to share a few anecdotes about developing cross-platform,
> >hopefully to address some of the concerns about adding overhead to the
> >development process.  By reviewing past cases of cross-platform Linux vs.
> >Windows bugs, we can get a sense for how the development process could
> >look
> >in the future.
> >
> >HADOOP-9131: TestLocalFileSystem#testListStatusWithColons cannot run on
> >Windows.  As part of an earlier jira, HADOOP-8962, there was a new test
> >committed on trunk covering the case of a local file system interaction on
> >a file containing a ':'.  On Windows, ':' in a path has special meaning as
> >part of the drive specifier (i.e. C:), so this test cannot pass when
> >running on Windows.  In this kind of case, the cross-platform bug is
> >obvious, and the fix is obvious (assumeTrue(!Shell.WINDOWS)).  Ideally,
> >this would get fixed pre-commit after seeing a -1 from the Windows Jenkins
> >slave.
> >
> >HDFS-4274: BlockPoolSliceScanner does not close verification log during
> >shutdown.  This caused problems for MiniDFSCluster-based tests running on
> >Windows.  Failure to close the verification log meant that we didn't
> >release file locks, so the tests couldn't delete/recreate working
> >directories during teardown/setup.  Arguably, this was always a bug, and
> >running on Windows just exposed it because of its stricter rules about
> >file
> >locking.  This is a more complex fix, but it doesn't require
> >platform-specific knowledge.  If some future patch accidentally regresses
> >this, then we'll likely see +1 from Linux Jenkins and -1 from Windows
> >Jenkins.  Ideally, it would get fixed pre-commit, because it doesn't
> >require Windows-specific knowledge.  There is also the matter of impact.
> > Re-breaking this would re-break many test suites on Windows.
> >
> >HADOOP-9232: JniBasedUnixGroupsMappingWithFallback fails on Windows with
> >UnsatisfiedLinkError.  This was introduced by HADOOP-8712, which switched
> >to JniBasedUnixGroupsMappingWithFallback as the default
> >hadoop.security.group.mapping, but did not provide a Windows
> >implementation
> >of the JNI function.  In this case, there was a strong desire to get
> >HADOOP-8712 into a release, fixing it on Windows required native Windows
> >API knowledge, and Windows users had a simple workaround available by
> >changing their configs back to ShellBasedUnixGroupsMapping.  I think this
> >is the kind of situation where we could allow HADOOP-8712 to commit
> >despite
> >-1 from Windows Jenkins, with fairly quick follow-up from an engineer with
> >the Windows expertise to fix it.
> >
> >To summarize, I don't think it needs to differ greatly from our current
> >development process.  We're all responsible for breadth of understanding
> >and maintenance of the whole codebase, but we also rely on specific
> >individuals with deep expertise in particular areas for certain issues.
> > Sometimes we commit despite a -1 from Jenkins, based on the community's
> >judgment.
> >
> >Virtualization greatly simplifies cross-platform development.  I use
> >VirtualBox on a Mac host and run VMs for Windows and Ubuntu with a shared
> >drive so that they can all see the same copy of the source code.  There
> >are
> >plenty of variations on this depending on your preference, such as
> >offloading the VMs to a separate server or cloud service to free up local
> >RAM.  I'm planning on submitting BUILDING.txt changes later today that
> >fully describe how to build on Windows.  After some initial setup, it's
> >nearly identical to the mvn commands that you already use today.
> >
> >Hope this helps,
> >--Chris
> >
> >
> >On Thu, Feb 28, 2013 at 3:25 AM, John Gordon
> ><[hidden email]>wrote:
> >
> >> +1 (non-binding)
> >>
> >> I want to share my vote of confidence in this community.  If motivated
> >>to
> >> do so, this community can keep this project cross-platform and continue
> >>to
> >> rapidly innovate without breaking a sweat.
> >>
> >> The day we started working on this, I saw the foundations of greatness
> >>in
> >> the quality and volume of dev tests, the code itself, and the Apache
> >>values
> >> themselves.
> >>
> >> 1.) Hadoop's unit tests and their frameworks are very well thought out
> >>and
> >> the consideration and energy that went into their design is worthy of
> >> praise.  The MiniCluster abstractions utilize very few resources and put
> >> all the processes into one JVM for easy debugging.  It is very easy to
> >> select specific tests from the full suite to reproduce an issue
> >>reported in
> >> another environment - like the Jenkins build server or another
> >> contributor's environment.
> >> 2.) This community has done an excellent job of incorporating
> >>well-placed
> >> log messages to make it easy to post mortem troubleshoot most failures.
> >>  The logs are very useful, and it is extremely rare that
> >>troubleshooting a
> >> failure requires debugging a live repro.
> >> 3.) Hadoop is written primarily in Java, a cross-platform language that
> >> provides its own platform in the form of the JVM to insulate most of the
> >> code from the specifics of the OS layer.
> >> 4.) CoPDoC - The right priorities, and well stated.
> >>
> >>
> >> Thank you,
> >>
> >> John
> >>
> >> -----Original Message-----
> >> From: Ivan Mitic [mailto:[hidden email]]
> >> Sent: Wednesday, February 27, 2013 6:32 PM
> >> To: [hidden email]; [hidden email]
> >> Cc: [hidden email]; [hidden email]
> >> Subject: RE: [Vote] Merge branch-trunk-win to trunk
> >>
> >> +1 (non-binding)
> >>
> >> I am really glad to see this happening! As people already mentioned,
> >>this
> >> has been a great engineering effort involving many people!
> >>
> >>
> >> Folks raised some valid concerns below and I thought it would be good to
> >> share my 2 cents. In my opinion, we don't have to solve all these
> >>problems
> >> right now. As we move forward with two platforms, we can start
> >>addressing
> >> one problem at a time and incrementally improve. In the first iteration,
> >> maintaining Hadoop on Windows could be just everyone trying to do their
> >> best effort (make sure Jenkins build succeeds at least). We already have
> >> people who are building/running trunk on Windows daily, so they would
> >>jump
> >> in and fix problems as needed (we've been doing this in branch-trunk-win
> >> for a while now). Although I see that the problems could arise with
> >> platform specific features/optimizations, I don't think these are
> >>frequent,
> >> so in most cases everything will just work. Merging the two branches
> >>sooner
> >> rather than later does seem like the right thing to do if the ultimate
> >> goal is to have Hadoop on both platforms. Now that the port has
> >>completed,
> >> we will have people in Microsoft (and elsewhere) wanting to contribute
> >> features/improvements to the trunk branch. A separate branch would just
> >> make things more difficult and confusing for everyone :) Hope this makes
> >> sense.
> >>
> >> -----Original Message-----
> >> From: Todd Lipcon [mailto:[hidden email]]
> >> Sent: Wednesday, February 27, 2013 3:43 PM
> >> To: [hidden email]
> >> Cc: [hidden email]; [hidden email];
> >> [hidden email]
> >> Subject: Re: [Vote] Merge branch-trunk-win to trunk
> >>
> >> On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas <
> [hidden email]
> >> >wrote:
> >>
> >> > With that we need to decide how our precommit process looks.
> >> > My inclination is to wait for +1 from precommit builds on both the
> >> > platforms to ensure no issues are introduced.
> >> > Thoughts?
> >> >
> >> > 2. Feature development impact
> >> > Some questions have been raised about would new features need to be
> >> > supported on both the platforms. Yes. I do not see a reason why
> >> > features cannot work on both the platforms, with the exception of
> >> > platform specific optimizations. This is what Java gives us.
> >> >
> >> >
> >> I'm concerned about the above. Personally, I don't have access to any
> >> Windows boxes with development tools, and I know nothing about
> >>developing
> >> on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated,
> >> for powerpoint :)
> >>
> >> If I submit a patch and it gets -1 "tests failed" on the Windows slave,
> >> how am I supposed to proceed?
> >>
> >> I think a reasonable compromise would be that the tests should always
> >> *build* on Windows before commit, and contributors should do their best
> >>to
> >> look at the test logs for any Windows-specific failures. But, beyond
> >> looking at the logs, a "-1 Tests failed on windows" should not block a
> >> commit.
> >>
> >> Those contributors who are interested in Windows being a first-class
> >> platform should be responsible for watching the Windows builds and
> >> debugging/fixing any regressions that might be Windows-specific.
> >>
> >> I also think the KDE model that Harsh pointed out is an interesting one
> >>--
> >> ie the idea that we would not merge windows support to trunk, but rather
> >> treat it as a "parallel code line" which lives in the ASF and has its
> >>own
> >> builds and releases. The windows team would periodically merge
> >>trunk->win
> >> to pick up any new changes, and do a separate test/release process. I'm
> >>not
> >> convinced this is the best idea, but worth discussion of pros and cons.
> >>
> >> -Todd
> >>
> >>
> >> >
> >> > On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]>
> >>wrote:
> >> >
> >> > > Bobby raises some good questions.  A related one, since most current
> >> > > developers won't add Windows support for new features that are
> >> > > platform specific is it assumed that Windows development will either
> >> > > lag or will people actively work on keeping Windows up with the
> >> > > latest?  And vice versa in case Windows support is implemented
> >>first.
> >> > >
> >> > > Is there a jira for resolving the outstanding TODOs in the code base
> >> > > (similar to HDFS-2148)?  Looks like this merge doesn't introduce
> >> > > many which is great (just did a quick diff and grep).
> >> > >
> >> > > Thanks,
> >> > > Eli
> >> > >
> >> > > On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]>
> >> > wrote:
> >> > > > After this is merged in is Windows still going to be a second
> >> > > > class citizen but happens to work for more than just development
> >> > > > or is it a fully supported platform where if something breaks it
> >> > > > can block a
> >> > > release?
> >> > > >  How do we as a community intend to keep Windows support from
> >> breaking?
> >> > > > We don't have any Jenkins slaves to be able to run nightly tests
> >> > > > to validate everything still compiles/runs.  This is not a blocker
> >> > > > for me because we often rely on individuals and groups to test
> >> > > > Hadoop, but I
> >> > do
> >> > > > think we need to have this discussion before we put it in.
> >> > > >
> >> > > > --Bobby
> >> > > >
> >> > > > On 2/26/13 4:55 PM, "Suresh Srinivas" <[hidden email]>
> >> wrote:
> >> > > >
> >> > > >>I had posted heads up about merging branch-trunk-win to trunk on
> >> > > >>Feb
> >> > 8th.
> >> > > >>I
> >> > > >>am happy to announce that we are ready for the merge.
> >> > > >>
> >> > > >>Here is a brief recap on the highlights of the work done:
> >> > > >>- Command-line scripts for the Hadoop surface area
> >> > > >>- Mapping the HDFS permissions model to Windows
> >> > > >>- Abstracted and reconciled mismatches around differences in Path
> >> > > >>semantics in Java and Windows
> >> > > >>- Native Task Controller for Windows
> >> > > >>- Implementation of a Block Placement Policy to support cloud
> >> > > >>environments, more specifically Azure.
> >> > > >>- Implementation of Hadoop native libraries for Windows
> >> > > >>(compression codecs, native I/O)
> >> > > >>- Several reliability issues, including race-conditions,
> >> > > >>intermittent
> >> > > test
> >> > > >>failures, resource leaks.
> >> > > >>- Several new unit test cases written for the above changes
> >> > > >>
> >> > > >>Please find the details of the work in
> >> > > >>CHANGES.branch-trunk-win.txt - Common
> >> > > >>changes<http://bit.ly/Xe7Ynv>, HDFS changes<
> >> > http://bit.ly/13QOSo9
> >> > > >,
> >> > > >>and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is
> >> > > >>the
> >> > work
> >> > > >>ported from branch-1-win to a branch based on trunk.
> >> > > >>
> >> > > >>For details of the testing done, please see the thread -
> >> > > >>http://bit.ly/WpavJ4. Merge patch for this is available on
> >> > HADOOP-8562<
> >> > > >>https://issues.apache.org/jira/browse/HADOOP-8562>.
> >> > > >>
> >> > > >>This was a large undertaking that involved developing code,
> >> > > >>testing the entire Hadoop stack, including scale tests. This is
> >> > > >>made possible only with the contribution from many many folks in
> >> > > >>the community. Following
> >> > people
> >> > > >>contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil,
> >> > > >>Bikas
> >> > Saha,
> >> > > >>Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
> >> > > Sumadhur
> >> > > >>Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao,
> >> > Thejas
> >> > > >>Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan,
> >> > > >>Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo
> >> > > >>Nicholas Sze,
> >> > > Suresh
> >> > > >>Srinivas and Sanjay Radia. There are many others who contributed
> >> > > >>as
> >> > well
> >> > > >>providing feedback and comments on numerous jiras.
> >> > > >>
> >> > > >>The vote will run for seven days and will end on March 5, 6:00PM
> >>PST.
> >> > > >>
> >> > > >>Regards,
> >> > > >>Suresh
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >>On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> >> > > >><[hidden email]>wrote:
> >> > > >>
> >> > > >>> It is super exciting to look at the prospect of these changes
> >> > > >>>being merged  to trunk. Having Windows as one of the supported
> >> > > >>>Hadoop platforms is
> >> > a
> >> > > >>> fantastic opportunity both for the Hadoop project and Microsoft
> >> > > >>>customers.
> >> > > >>>
> >> > > >>> This work began around a year back when a few of us started with
> >> > > >>> a
> >> > > basic
> >> > > >>> port of Hadoop on Windows. Ever since, the Hadoop team in
> >> > > >>> Microsoft
> >> > > have
> >> > > >>> made significant progress in the following areas:
> >> > > >>> (PS: Some of these items are already included in Suresh's email,
> >> > > >>> but including again for completeness)
> >> > > >>>
> >> > > >>> - Command-line scripts for the Hadoop surface area
> >> > > >>> - Mapping the HDFS permissions model to Windows
> >> > > >>> - Abstracted and reconciled mismatches around differences in
> >> > > >>> Path semantics in Java and Windows
> >> > > >>> - Native Task Controller for Windows
> >> > > >>> - Implementation of a Block Placement Policy to support cloud
> >> > > >>> environments, more specifically Azure.
> >> > > >>> - Implementation of Hadoop native libraries for Windows
> >> > > >>> (compression codecs, native I/O)
> >> > > >>> - Several reliability issues, including race-conditions,
> >> > > >>> intermittent test failures, resource leaks.
> >> > > >>> - Several new unit test cases written for the above changes
> >> > > >>>
> >> > > >>> In the process, we have closely engaged with the Apache open
> >> > > >>> source community and have got great support and assistance from
> >> > > >>> the
> >> > community
> >> > > >>>in
> >> > > >>> terms of contributing fixes, code review comments and commits.
> >> > > >>>
> >> > > >>> In addition, the Hadoop team at Microsoft has also made good
> >> > > >>> progress
> >> > > in
> >> > > >>> other projects including Hive, Pig, Sqoop, Oozie, HCat and
> >>HBase.
> >> > Many
> >> > > >>>of
> >> > > >>> these changes have already been committed to the respective
> >> > > >>>trunks
> >> > with
> >> > > >>> help from various committers and contributors. It is great to
> >> > > >>> see the commitment of the community to support multiple
> >> > > >>> platforms, and we
> >> > look
> >> > > >>> forward to the day when a developer/customer is able to
> >> > > >>>successfully deploy  a complete solution stack based on Apache
> >> > > >>>Hadoop releases.
> >> > > >>>
> >> > > >>> Next Steps:
> >> > > >>>
> >> > > >>> All of the above changes are part of the Windows Azure HDInsight
> >> > > >>>and  HDInsight Server products from Microsoft. We have
> >> > > >>>successfully on-boarded  several internal customers and have been
> >> > > >>>running production workloads
> >> > > on
> >> > > >>> Windows Azure HDInsight. Our vision is to create a big data
> >> > > >>>platform based  on Hadoop, and we are committed to helping make
> >> > > >>>Hadoop a world-class  solution that anyone can use to solve their
> >> > > >>>biggest data challenges.
> >> > > >>>
> >> > > >>> As an immediate next step, we would like to have a discussion
> >> > > >>> around how we can ensure that the quality of the mainline Hadoop
> >> > > >>> branches on Windows is maintained. To this end, we would like to
> >> > > >>> get to the state where we have pre-checkin validation gates and
> >> > > >>> nightly test runs enabled on Windows. If you have any
> >> > > >>> suggestions around this, please do send an email. We are
> >> > > >>> committed to helping sustain the long-term quality of Hadoop on
> >> > > >>> both Linux and Windows.
> >> > > >>>
> >> > > >>> We sincerely thank the community for their contribution and
> >> > > >>> support so far, and hope to continue having a close engagement
> >> > > >>> in the future.
> >> > > >>>
> >> > > >>> -Microsoft HDInsight Team
> >> > > >>>
> >> > > >>>
> >> > > >>> -----Original Message-----
> >> > > >>> From: Suresh Srinivas [mailto:[hidden email]]
> >> > > >>> Sent: Thursday, February 7, 2013 5:42 PM
> >> > > >>> To: [hidden email]; [hidden email];
> >> > > >>> [hidden email]; [hidden email]
> >> > > >>> Subject: Heads up - merge branch-trunk-win to trunk
> >> > > >>>
> >> > > >>> The support for Hadoop on Windows was proposed in HADOOP-8079<
> >> > > >>> https://issues.apache.org/jira/browse/HADOOP-8079> almost a year
> >> > > >>> ago. The goal was to make Hadoop natively integrated,
> >> > > >>> full-featured, and performance and scalability tuned on Windows
> >> > > >>> Server or Windows Azure. We are happy to announce that a lot of
> >> > > >>> progress has been made in this regard.
> >> > > >>>
> >> > > >>> Initial work started in a feature branch, branch-1-win, based on
> >> > > >>> branch-1. The details related to the work done in the branch can
> >> > > >>> be seen in CHANGES.txt<
> >> > > >>> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1-win/CHANGES.branch-1-win.txt?view=markup
> >> > > >>> >.
> >> > > >>> This work has been ported to a branch, branch-trunk-win, based
> >> > > >>> on trunk. Merge patch for this is available on
> >> > > >>> HADOOP-8562<https://issues.apache.org/jira/browse/HADOOP-8562>.
> >> > > >>>
> >> > > >>> Highlights of the work done so far:
> >> > > >>> 1. Necessary changes in Hadoop to run natively on Windows. These
> >> > > >>> changes handle differences in platforms related to path names,
> >> > > >>> process/task management etc.
> >> > > >>> 2. Addition of winutils tools for managing file permissions and
> >> > > >>> ownership, user group mapping, hardlinks, symbolic links, chmod,
> >> > > >>> disk utilization, and process/task management.
> >> > > >>> 3. Added cmd scripts equivalent to existing shell scripts
> >> > > >>> hadoop-daemon.sh, start and stop scripts.
> >> > > >>> 4. Addition of block placement policy implementation to support
> >> > > >>> cloud environment, more specifically Azure.
> >> > > >>>
> >> > > >>> We are very close to wrapping up the work in branch-trunk-win
> >> > > >>> and getting ready for a merge. Currently the merge patch is
> >> > > >>> passing close to 100% of unit tests on Linux. Soon I will call
> >> > > >>> for a vote to merge this branch into trunk.
> >> > > >>>
> >> > > >>> Next steps:
> >> > > >>> 1. Call for vote to merge branch-trunk-win to trunk, when the
> >> > > >>> work completes and precommit build is clean.
> >> > > >>> 2. Start a discussion on adding Jenkins precommit builds on
> >> > > >>> Windows and how to integrate that with the existing commit
> >> > > >>> process.
> >> > > >>>
> >> > > >>> Let me know if you have any questions.
> >> > > >>>
> >> > > >>> Regards,
> >> > > >>> Suresh
> >> > > >>>
> >> > > >>>
> >> > > >>
> >> > > >>
> >> > > >>--
> >> > > >>http://hortonworks.com/download/
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > http://hortonworks.com/download/
> >> >
> >>
> >>
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
> >>
> >>
> >>
> >>
> >>
>
>

RE: [Vote] Merge branch-trunk-win to trunk

Chuan Liu
+1 (non-binding)

As someone who also contributed to porting Hadoop to Windows, I think Java already provides a very good platform-independent foundation.
For features that are not available in Java, we will try to provide our own platform-independent APIs that abstract OS tasks away.
Most features should have no difficulty running on Windows and Linux by using Java and those platform-independent APIs.

Regarding the concerns raised about new features that may fail on Windows, I don't think we need to make passing on Windows a mandate at the moment. We can simply mark such a feature as unavailable on Windows and port it later if the feature is important.
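
One possible reading of "mark it unavailable" is a small fail-fast guard, sketched below. `PlatformGuardSketch`, `isWindows`, and `requireNonWindows` are hypothetical names made up for this illustration and are not Hadoop APIs. The OS check via the `os.name` system property is an assumption about how such a guard could be written, not a description of the branch's code.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hadoop: gates a feature by operating system.
class PlatformGuardSketch {

    /** Returns true when the given os.name value denotes a Windows system. */
    public static boolean isWindows(String osName) {
        return osName != null
            && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Fails fast with a clear message when an unported feature is used on Windows. */
    public static void requireNonWindows(String feature, String osName) {
        if (isWindows(osName)) {
            throw new UnsupportedOperationException(
                feature + " is not yet available on Windows");
        }
    }

    public static void main(String[] args) {
        // Check the platform this JVM is actually running on.
        String os = System.getProperty("os.name");
        try {
            requireNonWindows("example-feature", os);
            System.out.println("example-feature is available on " + os);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Passing the OS name as a parameter keeps the guard testable on any machine. A feature that is ported later would only need its `requireNonWindows` call removed.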

-Chuan

-----Original Message-----
From: Chris Nauroth [mailto:[hidden email]]
Sent: Thursday, February 28, 2013 11:51 AM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [Vote] Merge branch-trunk-win to trunk

> Is there a jira for resolving the outstanding TODOs in the code base
> (similar to HDFS-2148)?  Looks like this merge doesn't introduce many
> which is great (just did a quick diff and grep).

I found 2 remaining TODOs introduced in the current merge patch.  One is in ContainerLaunch.java.  The container launch script was trying to set a CLASSPATH that exceeded the Windows maximum command line length.  The fix was to wrap the long classpath into an intermediate jar containing only a manifest file with a Class-Path entry.  (See YARN-316.)  Just to be conservative, we wrapped this logic in an if (Shell.WINDOWS) guard and marked a TODO to remove it later and use that approach on all platforms after additional testing.  I've tested this code path successfully on Mac too, but several people wanted additional testing and performance checks before removing the if (Shell.WINDOWS) guard.  That work is tracked in an existing jira: YARN-358.
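
The manifest-only jar idea can be sketched with the standard `java.util.jar` classes, as below. This is not the code from YARN-316 or ContainerLaunch.java. `ClasspathJarSketch` and `createManifestOnlyJar` are hypothetical names, and the temp-file naming is an assumption made only for illustration. The launched JVM would then be given the short path of this one jar as its classpath instead of the full list.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical illustration, not Hadoop code.
class ClasspathJarSketch {

    /** Builds a jar holding only a manifest whose Class-Path lists the given entries. */
    public static File createManifestOnlyJar(List<File> entries, File targetDir)
            throws IOException {
        // Class-Path values are space-separated URLs, so convert each file to a URI.
        StringBuilder classPath = new StringBuilder();
        for (File entry : entries) {
            if (classPath.length() > 0) {
                classPath.append(' ');
            }
            classPath.append(entry.toURI().toString());
        }

        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the manifest is written out empty.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, classPath.toString());

        File jar = File.createTempFile("classpath-", ".jar", targetDir);
        try (JarOutputStream out =
                 new JarOutputStream(new FileOutputStream(jar), manifest)) {
            // Intentionally no entries: the manifest is the jar's only content.
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("cp-sketch").toFile();
        List<File> entries =
            Arrays.asList(new File(dir, "a.jar"), new File(dir, "b.jar"));
        File jar = createManifestOnlyJar(entries, dir);
        // Read the manifest back to show what the JVM would resolve.
        try (JarFile jarFile = new JarFile(jar)) {
            System.out.println(jarFile.getManifest().getMainAttributes()
                .getValue(Attributes.Name.CLASS_PATH));
        }
    }
}
```

The command line stays short because its length no longer grows with the number of classpath entries. The long list lives inside the jar's manifest.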

The other TODO is for winutils to print more usage information and examples.  At this point, I think winutils is printing sufficient information, and we can just remove the TODO.  I just submitted a new jira to start that conversation: HADOOP-9348.

Thank you,
--Chris


On Thu, Feb 28, 2013 at 11:29 AM, Robert Evans <[hidden email]> wrote:

> My initial question was mostly intended to understand the desired new
> classification of Windows after the merge, and how we plan to maintain
> Windows support.  I am happy to hear that hardware for Jenkins will be
> provided.  I am also fine, at least initially, with us trying to treat
> Windows as a first class supported platform.  But I realize that there
> are a lot of people that do not have easy access to Windows for
> development/debugging, myself included. I also don't want to slow down
> the pace of development too much because of this.  It will cause some
> organizations that do not use or support Windows to be more likely to
> run software that has diverged from an official release.  It also has
> the potential to make the patch submission process even more
> difficult, which increases the likelihood of submitters abandoning
> patches.  However, the great thing about being in a community is we
> can change if we need to.
>
> I am +0 for the merge.  I am not a Windows expert so I don't feel
> comfortable giving it a true +1.
>
> --Bobby
>
>
> On 2/28/13 10:45 AM, "Chris Nauroth" <[hidden email]> wrote:
>
> >I'd like to share a few anecdotes about developing cross-platform,
> >hopefully to address some of the concerns about adding overhead to
> >the development process.  By reviewing past cases of cross-platform Linux vs.
> >Windows bugs, we can get a sense for how the development process
> >could look in the future.
> >
> >HADOOP-9131: TestLocalFileSystem#testListStatusWithColons cannot run
> >on Windows.  As part of an earlier jira, HADOOP-8962, there was a new
> >test committed on trunk covering the case of a local file system
> >interaction on a file containing a ':'.  On Windows, ':' in a path
> >has special meaning as part of the drive specifier (i.e. C:), so this
> >test cannot pass when running on Windows.  In this kind of case, the
> >cross-platform bug is obvious, and the fix is obvious
> >(assumeTrue(!Shell.WINDOWS)).  Ideally, this would get fixed
> >pre-commit after seeing a -1 from the Windows Jenkins slave.
> >
> >HDFS-4274: BlockPoolSliceScanner does not close verification log
> >during shutdown.  This caused problems for MiniDFSCluster-based tests
> >running on Windows.  Failure to close the verification log meant that
> >we didn't release file locks, so the tests couldn't delete/recreate
> >working directories during teardown/setup.  Arguably, this was always
> >a bug, and running on Windows just exposed it because of its stricter
> >rules about file locking.  This is a more complex fix, but it doesn't
> >require platform-specific knowledge.  If some future patch
> >accidentally regresses this, then we'll likely see +1 from Linux
> >Jenkins and -1 from Windows Jenkins.  Ideally, it would get fixed
> >pre-commit, because it doesn't require Windows-specific knowledge.
> >There is also the matter of impact.  Re-breaking this would re-break
> >many test suites on Windows.
> >
> >HADOOP-9232: JniBasedUnixGroupsMappingWithFallback fails on Windows
> >with UnsatisfiedLinkError.  This was introduced by HADOOP-8712, which
> >switched to JniBasedUnixGroupsMappingWithFallback as the default
> >hadoop.security.group.mapping, but did not provide a Windows
> >implementation of the JNI function.  In this case, there was a strong
> >desire to get HADOOP-8712 into a release, fixing it on Windows
> >required native Windows API knowledge, and Windows users had a simple
> >workaround available by changing their configs back to
> >ShellBasedUnixGroupsMapping.  I think this is the kind of situation
> >where we could allow HADOOP-8712 to commit despite -1 from Windows
> >Jenkins, with fairly quick follow-up from an engineer with the
> >Windows expertise to fix it.
> >
> >To summarize, I don't think it needs to differ greatly from our
> >current development process.  We're all responsible for breadth of
> >understanding and maintenance of the whole codebase, but we also rely
> >on specific individuals with deep expertise in particular areas for
> >certain issues.  Sometimes we commit despite a -1 from Jenkins, based
> >on the community's judgment.
> >
> >Virtualization greatly simplifies cross-platform development.  I use
> >VirtualBox on a Mac host and run VMs for Windows and Ubuntu with a
> >shared drive so that they can all see the same copy of the source
> >code.  There are plenty of variations on this depending on your
> >preference, such as offloading the VMs to a separate server or cloud
> >service to free up local RAM.  I'm planning on submitting
> >BUILDING.txt changes later today that fully describe how to build on
> >Windows.  After some initial setup, it's nearly identical to the mvn
> >commands that you already use today.
> >
> >Hope this helps,
> >--Chris
> >
> >
> >On Thu, Feb 28, 2013 at 3:25 AM, John Gordon
> ><[hidden email]>wrote:
> >
> >> +1 (non-binding)
> >>
> >> I want to share my vote of confidence in this community.  If
> >>motivated to  do so, this community can keep this project
> >>cross-platform and continue to  rapidly innovate without breaking a
> >>sweat.
> >>
> >> The day we started working on this, I saw the foundations of
> >>greatness in  the quality and volume of dev tests, the code itself,
> >>and the Apache values  themselves.
> >>
> >> 1.) Hadoop's unit tests and their frameworks are very well thought
> >> out, and the consideration and energy that went into their design
> >> is worthy of praise.  The MiniCluster abstractions utilize very few
> >> resources and put all the processes into one JVM for easy
> >> debugging.  It is very easy to select specific tests from the full
> >> suite to reproduce an issue reported in another environment, like
> >> the Jenkins build server or another contributor's environment.
> >> 2.) This community has done an excellent job of incorporating
> >> well-placed log messages to make it easy to troubleshoot most
> >> failures post mortem.  The logs are very useful, and it is
> >> extremely rare that troubleshooting a failure requires debugging a
> >> live repro.
> >> 3.) Hadoop is written primarily in Java, a cross-platform language
> >> that provides its own platform in the form of the JVM to insulate
> >> most of the code from the specifics of the OS layer.
> >> 4.) CoPDoC - The right priorities, and well stated.
> >>
> >>
> >> Thank you,
> >>
> >> John
> >>
> >> -----Original Message-----
> >> From: Ivan Mitic [mailto:[hidden email]]
> >> Sent: Wednesday, February 27, 2013 6:32 PM
> >> To: [hidden email]; [hidden email]
> >> Cc: [hidden email]; [hidden email]
> >> Subject: RE: [Vote] Merge branch-trunk-win to trunk
> >>
> >> +1 (non-binding)
> >>
> >> I am really glad to see this happening! As people already
> >>mentioned, this  has been a great engineering effort involving many
> >>people!
> >>
> >>
> >> Folks raised some valid concerns below and I thought it would be
> >> good to share my 2 cents. In my opinion, we don't have to solve all
> >> these problems right now. As we move forward with two platforms, we
> >> can start addressing one problem at a time and incrementally
> >> improve. In the first iteration, maintaining Hadoop on Windows
> >> could be just everyone making their best effort (make sure the
> >> Jenkins build succeeds at least). We already have people who are
> >> building/running trunk on Windows daily, so they would jump in and
> >> fix problems as needed (we've been doing this in branch-trunk-win
> >> for a while now). Although I see that problems could arise with
> >> platform-specific features/optimizations, I don't think these are
> >> frequent, so in most cases everything will just work. Merging the
> >> two branches sooner rather than later does seem like the right
> >> thing to do if the ultimate goal is to have Hadoop on both
> >> platforms. Now that the port has completed, we will have people in
> >> Microsoft (and elsewhere) wanting to contribute
> >> features/improvements to the trunk branch. A separate branch would
> >> just make things more difficult and confusing for everyone :) Hope
> >> this makes sense.
> >>
> >> -----Original Message-----
> >> From: Todd Lipcon [mailto:[hidden email]]
> >> Sent: Wednesday, February 27, 2013 3:43 PM
> >> To: [hidden email]
> >> Cc: [hidden email]; [hidden email];
> >> [hidden email]
> >> Subject: Re: [Vote] Merge branch-trunk-win to trunk
> >>
> >> On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas <[hidden email]> wrote:
> >>
> >> > With that we need to decide how our precommit process looks.
> >> > My inclination is to wait for +1 from precommit builds on both
> >> > the platforms to ensure no issues are introduced.
> >> > Thoughts?
> >> >
> >> > 2. Feature development impact
> >> > Some questions have been raised about whether new features would
> >> > need to be supported on both platforms. Yes. I do not see a reason
> >> > why features cannot work on both platforms, with the exception of
> >> > platform-specific optimizations. This is what Java gives us.
> >> >
> >> >
> >> I'm concerned about the above. Personally, I don't have access to
> >>any  Windows boxes with development tools, and I know nothing about
> >>developing  on Windows. The only Windows I run is an 8GB VM with 1
> >>GB RAM allocated,  for powerpoint :)
> >>
> >> If I submit a patch and it gets -1 "tests failed" on the Windows
> >> slave, how am I supposed to proceed?
> >>
> >> I think a reasonable compromise would be that the tests should
> >> always *build* on Windows before commit, and contributors should do
> >> their best to look at the test logs for any Windows-specific
> >> failures. But, beyond looking at the logs, a "-1 Tests failed on
> >> windows" should not block a commit.
> >>
> >> Those contributors who are interested in Windows being a
> >> first-class platform should be responsible for watching the Windows
> >> builds and debugging/fixing any regressions that might be Windows-specific.
> >>
> >> I also think the KDE model that Harsh pointed out is an interesting
> >> one -- i.e. the idea that we would not merge Windows support to
> >> trunk, but rather treat it as a "parallel code line" which lives in
> >> the ASF and has its own builds and releases. The Windows team would
> >> periodically merge trunk->win to pick up any new changes, and do a
> >> separate test/release process. I'm not convinced this is the best
> >> idea, but worth discussion of pros and cons.
> >>
> >> -Todd
> >>
> >>
> >> >
> >> > On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins <[hidden email]> wrote:
> >> >
> >> > > Bobby raises some good questions.  A related one: since most
> >> > > current developers won't add Windows support for new features
> >> > > that are platform specific, is it assumed that Windows
> >> > > development will lag, or will people actively work on
> >> > > keeping Windows up with the latest?  And vice versa in case
> >> > > Windows support is implemented first.
> >> > >
> >> > > Is there a jira for resolving the outstanding TODOs in the code
> >> > > base (similar to HDFS-2148)?  Looks like this merge doesn't
> >> > > introduce many which is great (just did a quick diff and grep).
> >> > >
> >> > > Thanks,
> >> > > Eli
> >> > >
> >> > > On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans <[hidden email]> wrote:
> >> > > > After this is merged in, is Windows still going to be a second
> >> > > > class citizen that happens to work for more than just
> >> > > > development, or is it a fully supported platform where if
> >> > > > something breaks it can block a release?
> >> > > > How do we as a community intend to keep Windows support from
> >> > > > breaking? We don't have any Jenkins slaves to be able to run
> >> > > > nightly tests to validate everything still compiles/runs.  This
> >> > > > is not a blocker for me because we often rely on individuals and
> >> > > > groups to test Hadoop, but I do think we need to have this
> >> > > > discussion before we put it in.
> >> > > >
> >> > > > --Bobby
> >> > > >


Re: [Vote] Merge branch-trunk-win to trunk

sanjay Radia-2
In reply to this post by Suresh Srinivas-2
+1
Java has done the bulk of the work in making Hadoop multi-platform.
Windows specific code is a tiny percentage of the code.
Jenkins support for Windows is going to help us keep the platform portable going forward.
I expect that the vast majority of new commits will have no problems. I propose that we start by fixing problems that Jenkins raises but not block new commits for too long if the author does not have a Windows box or if a volunteer does not step up.

sanjay




Re: [Vote] Merge branch-trunk-win to trunk

Ramya Sunil
+1 for the merge.

As someone who has been testing the code for many months now, both on
single-node and multi-node clusters, I am very confident about the stability
and the quality of the code. I have run several regression tests to verify
distributed cache, streaming, compression, capacity scheduler, job history
and many more features in HDFS and MR.

- Ramya

On Thu, Feb 28, 2013 at 3:08 PM, sanjay Radia <[hidden email]>wrote:

> +1
> Java has done the bulk of the work in making Hadoop multi-platform.
> Windows specific code is a tiny percentage of the code.
> Jenkins support for Windows is going to help us keep the platform portable
> going forward.
> I expect that the vast majority of new commits will have no problems. I
> propose that we start by fixing problems that Jenkins raises but not block
> new commits for too long if the author does not have a Windows box or if a
> volunteer does not step up.
>
> sanjay
>
>
>
>

Re: [Vote] Merge branch-trunk-win to trunk

Konstantin Boudnik-2
In reply to this post by sanjay Radia-2
On Thu, Feb 28, 2013 at 03:08PM, sanjay Radia wrote:
> +1
> Java has done the bulk of the work in making Hadoop multi-platform.
> Windows specific code is a tiny percentage of the code.
> Jenkins support for Windows is going to help us keep the platform portable going forward.
> I expect that the vast majority of new commits will have no problems. I propose
> that we start by fixing problems that Jenkins raises but not block new
> commits for too long if the author does not have a Windows box or if a
> volunteer does not step up.

Considering the typical set of software most of the people here work with, it
would be completely inappropriate to block commits for failing Windows-specific
features. After all, Microsoft never did bother to check what
features or compatibility matters they have broken in Java and elsewhere, so why
should we?

I believe these kinds of rules have to be set and discussed before the merge is
done.

Cheers,
  Cos
