[VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack


[VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Matt Foley-2
For discussion, please see previous thread "[PROPOSAL] introduce Python as
build-time and run-time dependency for Hadoop and throughout Hadoop stack".

This vote consists of three separate items:

1. Contributors shall be allowed to use Python as a platform-independent
scripting language for build-time tasks, and add Python as a build-time
dependency.
Please vote +1, 0, -1.

2. Contributors shall be encouraged to use Maven tasks in combination with
either plug-ins or Groovy scripts to do cross-platform build-time tasks,
even under ant in Hadoop-1.
Please vote +1, 0, -1.

3. Contributors shall be allowed to use Python as a platform-independent
scripting language for run-time tasks, and add Python as a run-time
dependency.
Please vote +1, 0, -1.

Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
use Maven plug-ins or Groovy as the only means of cross-platform build-time
tasks, or to simply continue using platform-dependent scripts as is being
done today.

Vote closes at 12:30pm PST on Saturday 1 December.
---------
Personally, my vote is +1, +1, +1.
I think #2 is preferable to #1, but still has many unknowns in it, and
until those are worked out I don't want to delay moving to cross-platform
scripts for build-time tasks.

Best regards,
--Matt

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Chris Nauroth
+1, +1, +1 (non-binding)

On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:

> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.
>
> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.
>
> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.
>
> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt
>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Steve Loughran-3
In reply to this post by Matt Foley-2
On 24 November 2012 20:13, Matt Foley <[hidden email]> wrote:

> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.
>
>
+1



> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.
>
>
+1

My feelings on Maven are well known, but Groovy can mitigate things. And
I'm not going to advocate post-M2 build tools such as Gradle.

It's ironic that Maven's utter inflexibility forces people to use scripting
languages to get their work done, but Groovy is fairly nimble here, and
easy to learn for any Java programmer. "Groovy in Action" is the book to
own.



> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.
>


+1. I look forward to never having to debug shell script env variable
inheritance ever again.

This does not mean that I advocate writing big bits of the system in .py;
as someone who is debugging OpenStack request throttling this weekend, I
know that Python is not "the solution" to problems. For Hadoop it has a
role, but the role should be ('better than bash') and ('streaming
integration').
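
For readers unfamiliar with the "streaming integration" role mentioned above: Hadoop Streaming runs an arbitrary executable as a mapper or reducer, feeding it input lines on stdin and reading tab-separated key/value lines back from stdout. A minimal word-count mapper sketch follows; the function names are illustrative assumptions, not taken from this thread or from Hadoop itself.

```python
import sys


def map_lines(lines):
    """Yield one "word<TAB>1" record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            # Streaming treats the text before the first tab as the key.
            yield "%s\t1" % word


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read input lines from stdin and write mapper records to stdout."""
    for record in map_lines(stdin):
        stdout.write(record + "\n")
```

A script like this would end by calling main() and be passed to the streaming jar as the mapper; the same file runs unchanged wherever a Python interpreter is available, which is the portability argument being made here.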


> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt
>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Robert Evans
In reply to this post by Matt Foley-2
+1, +1, 0

On 11/24/12 2:13 PM, "Matt Foley" <[hidden email]> wrote:

>For discussion, please see previous thread "[PROPOSAL] introduce Python as
>build-time and run-time dependency for Hadoop and throughout Hadoop
>stack".
>
>This vote consists of three separate items:
>
>1. Contributors shall be allowed to use Python as a platform-independent
>scripting language for build-time tasks, and add Python as a build-time
>dependency.
>Please vote +1, 0, -1.
>
>2. Contributors shall be encouraged to use Maven tasks in combination with
>either plug-ins or Groovy scripts to do cross-platform build-time tasks,
>even under ant in Hadoop-1.
>Please vote +1, 0, -1.
>
>3. Contributors shall be allowed to use Python as a platform-independent
>scripting language for run-time tasks, and add Python as a run-time
>dependency.
>Please vote +1, 0, -1.
>
>Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors
>to
>use Maven plug-ins or Groovy as the only means of cross-platform
>build-time
>tasks, or to simply continue using platform-dependent scripts as is being
>done today.
>
>Vote closes at 12:30pm PST on Saturday 1 December.
>---------
>Personally, my vote is +1, +1, +1.
>I think #2 is preferable to #1, but still has many unknowns in it, and
>until those are worked out I don't want to delay moving to cross-platform
>scripts for build-time tasks.
>
>Best regards,
>--Matt


Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Adam Berry

0, +1, -1 (non-binding)

Also, it feels like maybe the discussion should have been kept open a little longer; the Thanksgiving holidays last week meant that people may have missed it.

Cheers,
Adam

On Nov 26, 2012, at 10:16 AM, Robert Evans wrote:

> +1, +1, 0
>
> On 11/24/12 2:13 PM, "Matt Foley" <[hidden email]> wrote:
>
>> For discussion, please see previous thread "[PROPOSAL] introduce Python as
>> build-time and run-time dependency for Hadoop and throughout Hadoop
>> stack".
>>
>> This vote consists of three separate items:
>>
>> 1. Contributors shall be allowed to use Python as a platform-independent
>> scripting language for build-time tasks, and add Python as a build-time
>> dependency.
>> Please vote +1, 0, -1.
>>
>> 2. Contributors shall be encouraged to use Maven tasks in combination with
>> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
>> even under ant in Hadoop-1.
>> Please vote +1, 0, -1.
>>
>> 3. Contributors shall be allowed to use Python as a platform-independent
>> scripting language for run-time tasks, and add Python as a run-time
>> dependency.
>> Please vote +1, 0, -1.
>>
>> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors
>> to
>> use Maven plug-ins or Groovy as the only means of cross-platform
>> build-time
>> tasks, or to simply continue using platform-dependent scripts as is being
>> done today.
>>
>> Vote closes at 12:30pm PST on Saturday 1 December.
>> ---------
>> Personally, my vote is +1, +1, +1.
>> I think #2 is preferable to #1, but still has many unknowns in it, and
>> until those are worked out I don't want to delay moving to cross-platform
>> scripts for build-time tasks.
>>
>> Best regards,
>> --Matt
>


Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Colin McCabe
In reply to this post by Matt Foley-2
Nonbinding, but:

+1, +1, 0.

Also, let's please clearly define the versions of Python we support if
we do choose to go this route.  Something like 2.4+ would be
reasonable.  The process launching APIs in particular changed a lot in
those early 2.x releases.
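
To illustrate the point about process launching: the subprocess module first appeared in Python 2.4, so a script that relies on it could check the interpreter version up front. This is a minimal sketch assuming a 2.4 floor as suggested above; MIN_VERSION, meets_minimum, and run are hypothetical names, not anything the project adopted.

```python
import subprocess
import sys

# Hypothetical floor; the thread only suggests "something like 2.4+".
MIN_VERSION = (2, 4)


def meets_minimum(current=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


def run(args):
    """Launch a child process without a shell and return its exit code."""
    # Passing an argument list avoids platform-specific shell quoting.
    return subprocess.call(args)


def main():
    """Refuse to run on an interpreter older than MIN_VERSION."""
    if not meets_minimum():
        sys.stderr.write("Python %d.%d or later is required\n" % MIN_VERSION)
        return 1
    return run([sys.executable, "-c", "print('ok')"])
```

A script would typically finish with sys.exit(main()). Comparing version tuples rather than version strings keeps the check correct for two-digit minor versions.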

best,
Colin


On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:

> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.
>
> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.
>
> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.
>
> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Luke Lu
-1, +1, -1.

If we want to introduce a "platform independent" scripting language, we
should not choose python, as it has a bad track record for compatibility
(between versions/platforms).

+1 to use groovy, as we can control the version of groovy jars included in
our distribution.

__Luke


On Mon, Nov 26, 2012 at 8:53 AM, Colin McCabe <[hidden email]> wrote:

> Nonbinding, but:
>
> +1, +1, 0.
>
> Also, let's please clearly define the versions of Python we support if
> we do choose to go this route.  Something like 2.4+ would be
> reasonable.  The process launching APIs in particular changed a lot in
> those early 2.x releases.
>
> best,
> Colin
>
>
> On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:
> > For discussion, please see previous thread "[PROPOSAL] introduce Python
> as
> > build-time and run-time dependency for Hadoop and throughout Hadoop
> stack".
> >
> > This vote consists of three separate items:
> >
> > 1. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for build-time tasks, and add Python as a build-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > 2. Contributors shall be encouraged to use Maven tasks in combination
> with
> > either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> > even under ant in Hadoop-1.
> > Please vote +1, 0, -1.
> >
> > 3. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for run-time tasks, and add Python as a run-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors
> to
> > use Maven plug-ins or Groovy as the only means of cross-platform
> build-time
> > tasks, or to simply continue using platform-dependent scripts as is being
> > done today.
> >
> > Vote closes at 12:30pm PST on Saturday 1 December.
> > ---------
> > Personally, my vote is +1, +1, +1.
> > I think #2 is preferable to #1, but still has many unknowns in it, and
> > until those are worked out I don't want to delay moving to cross-platform
> > scripts for build-time tasks.
> >
> > Best regards,
> > --Matt
>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Radim Kolar-2
In reply to this post by Matt Foley-2
-1, +1, -1

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Chris Nauroth
In reply to this post by Colin McCabe
Declaring 2.4 to be the minimum supported version sounds like a great idea.
 I've worked with CentOS distributions that have a dependency on Python
2.4, and it was always awkward to get a later version on those machines.

Thank you,
--Chris

On Mon, Nov 26, 2012 at 8:53 AM, Colin McCabe <[hidden email]> wrote:

> Nonbinding, but:
>
> +1, +1, 0.
>
> Also, let's please clearly define the versions of Python we support if
> we do choose to go this route.  Something like 2.4+ would be
> reasonable.  The process launching APIs in particular changed a lot in
> those early 2.x releases.
>
> best,
> Colin
>
>
> On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:
> > For discussion, please see previous thread "[PROPOSAL] introduce Python
> as
> > build-time and run-time dependency for Hadoop and throughout Hadoop
> stack".
> >
> > This vote consists of three separate items:
> >
> > 1. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for build-time tasks, and add Python as a build-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > 2. Contributors shall be encouraged to use Maven tasks in combination
> with
> > either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> > even under ant in Hadoop-1.
> > Please vote +1, 0, -1.
> >
> > 3. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for run-time tasks, and add Python as a run-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors
> to
> > use Maven plug-ins or Groovy as the only means of cross-platform
> build-time
> > tasks, or to simply continue using platform-dependent scripts as is being
> > done today.
> >
> > Vote closes at 12:30pm PST on Saturday 1 December.
> > ---------
> > Personally, my vote is +1, +1, +1.
> > I think #2 is preferable to #1, but still has many unknowns in it, and
> > until those are worked out I don't want to delay moving to cross-platform
> > scripts for build-time tasks.
> >
> > Best regards,
> > --Matt
>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Konstantin Boudnik-2
In reply to this post by Matt Foley-2
-1, +1, -1

Thanks

On Sat, Nov 24, 2012 at 12:13PM, Matt Foley wrote:

> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.
>
> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.
>
> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.
>
> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Suresh Srinivas-2
In reply to this post by Matt Foley-2
+1, +1, +1

Regards,
Suresh


On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:

> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.
>
> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.
>
> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.
>
> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt
>



--
http://hortonworks.com/download/

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Giridharan Kesavan
In reply to this post by Matt Foley-2
+1, +1, +1

-Giri


On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:

> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".
>
> This vote consists of three separate items:
>
> 1. Contributors shall be allowed to use Python as a platform-independent
> scripting language for build-time tasks, and add Python as a build-time
> dependency.
> Please vote +1, 0, -1.
>
> 2. Contributors shall be encouraged to use Maven tasks in combination with
> either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> even under ant in Hadoop-1.
> Please vote +1, 0, -1.
>
> 3. Contributors shall be allowed to use Python as a platform-independent
> scripting language for run-time tasks, and add Python as a run-time
> dependency.
> Please vote +1, 0, -1.
>
> Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to
> use Maven plug-ins or Groovy as the only means of cross-platform build-time
> tasks, or to simply continue using platform-dependent scripts as is being
> done today.
>
> Vote closes at 12:30pm PST on Saturday 1 December.
> ---------
> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.
>
> Best regards,
> --Matt
>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Alejandro Abdelnur
Matt,

The scope of this vote seems different from what was discussed in the
PROPOSAL thread.

In the PROPOSAL thread you indicated this was for Hadoop1 because it is ANT
based. And the main reason was to remove saveVersion.sh.

Your #3  was not discussed in the proposal, was it?

It seems this vote is dragging in much more than was originally discussed.
I think you should suspend the vote, recap the motivation and then restart
the vote. As things are laid out at the moment my vote is:

-1 (It still seems like overkill to introduce a new runtime requirement for
building to replace a script.)
+1 (I think this is the right way to simplify the build)
-1 (AFAIK there is no such requirement at the moment, and if one comes it
would be in the form of an AM, which I'd argue should live outside of
Hadoop)

Thx


On Mon, Nov 26, 2012 at 1:16 PM, Giridharan Kesavan <
[hidden email]> wrote:

> +1, +1, +1
>
> -Giri
>
>
> On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:
>
> > For discussion, please see previous thread "[PROPOSAL] introduce Python
> as
> > build-time and run-time dependency for Hadoop and throughout Hadoop
> stack".
> >
> > This vote consists of three separate items:
> >
> > 1. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for build-time tasks, and add Python as a build-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > 2. Contributors shall be encouraged to use Maven tasks in combination
> with
> > either plug-ins or Groovy scripts to do cross-platform build-time tasks,
> > even under ant in Hadoop-1.
> > Please vote +1, 0, -1.
> >
> > 3. Contributors shall be allowed to use Python as a platform-independent
> > scripting language for run-time tasks, and add Python as a run-time
> > dependency.
> > Please vote +1, 0, -1.
> >
> > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors
> to
> > use Maven plug-ins or Groovy as the only means of cross-platform
> build-time
> > tasks, or to simply continue using platform-dependent scripts as is being
> > done today.
> >
> > Vote closes at 12:30pm PST on Saturday 1 December.
> > ---------
> > Personally, my vote is +1, +1, +1.
> > I think #2 is preferable to #1, but still has many unknowns in it, and
> > until those are worked out I don't want to delay moving to cross-platform
> > scripts for build-time tasks.
> >
> > Best regards,
> > --Matt
> >
>



--
Alejandro

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Radim Kolar-2

> In the PROPOSAL thread you indicated this was for Hadoop1 because it is ANT
> based. And the main reason was to remove saveVersion.sh.
>
> Your #3  was not discussed in the proposal, was it?
It was part of the original proposal but was not discussed much because the
language war was the more attractive option. Do you want a vote like this?

1. Using an external language vs. a Maven plugin to build
2. Using an external language for startup scripts vs. a JVM scripting
language, such as Jython as used in WebSphere.
3. Choose Python as the external language

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Roman Shaposhnik-2
In reply to this post by Matt Foley-2
On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:
> For discussion, please see previous thread "[PROPOSAL] introduce Python as
> build-time and run-time dependency for Hadoop and throughout Hadoop stack".

Perhaps I'm missing something, but I can't possibly imagine how
a vote on a [hidden email] could possibly
affect downstream projects. I honestly don't think we should be
in a business of telling Pig, Hive, Oozie, etc. what to use or
not to use.

With that in mind the following vote applies ONLY to Hadoop
project itself:
   -1, +1, -1

> Personally, my vote is +1, +1, +1.
> I think #2 is preferable to #1, but still has many unknowns in it, and
> until those are worked out I don't want to delay moving to cross-platform
> scripts for build-time tasks.

And yet #2 is, in my opinion, a much better investment of our collective
time. We are already at the mercy of the JDK, but at least it is a far superior
platform from a support and backward compatibility perspective. Anything
that we can offload to it is absolutely worth doing.

Thanks,
Roman.

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Matt Foley
In reply to this post by Alejandro Abdelnur
Hi Alejandro,
Please see in-line below.

On Mon, Nov 26, 2012 at 1:52 PM, Alejandro Abdelnur <[hidden email]>
 wrote:

> Matt,
>
> The scope of this vote seems different from what was discussed in the
> PROPOSAL thread.
> In the PROPOSAL thread you indicated this was for Hadoop1 because it is ANT
> based. And the main reason was to remove saveVersion.sh.
> Your #3  was not discussed in the proposal, was it?
>

The item #3 was in my original statement of the problem, with which I
started the proposal thread.  In fact, the thread title was "[PROPOSAL]
introduce Python as build-time and run-time dependency for Hadoop and
throughout Hadoop stack".  It is true that only one or two people chose to
discuss #3 further in that thread.

The point is not just to replace a single script, but to provide a means to
do cross-platform scripts, which will over time replace many
non-platform-specific scripts written in platform-specific languages.


>
> It seems this vote is dragging in much more than was originally discussed.
> I think you should suspend the vote, recap the motivation and then restart
> the vote.
>

I respectfully disagree.  I believe a careful reading of the cited
discussion thread, plus my own statement of the vote, provides sufficient
background for a thoughtful decision on the subject.  Presumably so do the
ten other people who had already voted before you made that comment.

If several other people want more discussion first, please speak up.
Thanks,
--Matt

> As things are laid out at the moment my vote is:

>
> -1 (It still seems like overkill to introduce a new runtime requirement for
> building to replace a script.)
> +1 (I think this is the right way to simplify the build)
> -1 (AFAIK there is no such requirement at the moment, and if one comes it
> would be in the form of an AM, which I'd argue should live outside of
> Hadoop)
>
> Thx
>
>
> On Mon, Nov 26, 2012 at 1:16 PM, Giridharan Kesavan <
> [hidden email]> wrote:
>
> > +1, +1, +1
> >
> > -Giri
> >
> >
> > On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:
> >
> > > For discussion, please see previous thread "[PROPOSAL] introduce Python
> > as
> > > build-time and run-time dependency for Hadoop and throughout Hadoop
> > stack".
> > >
> > > This vote consists of three separate items:
> > >
> > > 1. Contributors shall be allowed to use Python as a
> platform-independent
> > > scripting language for build-time tasks, and add Python as a build-time
> > > dependency.
> > > Please vote +1, 0, -1.
> > >
> > > 2. Contributors shall be encouraged to use Maven tasks in combination
> > with
> > > either plug-ins or Groovy scripts to do cross-platform build-time
> tasks,
> > > even under ant in Hadoop-1.
> > > Please vote +1, 0, -1.
> > >
> > > 3. Contributors shall be allowed to use Python as a
> platform-independent
> > > scripting language for run-time tasks, and add Python as a run-time
> > > dependency.
> > > Please vote +1, 0, -1.
> > >
> > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES
> contributors
> > to
> > > use Maven plug-ins or Groovy as the only means of cross-platform
> > build-time
> > > tasks, or to simply continue using platform-dependent scripts as is
> being
> > > done today.
> > >
> > > Vote closes at 12:30pm PST on Saturday 1 December.
> > > ---------
> > > Personally, my vote is +1, +1, +1.
> > > I think #2 is preferable to #1, but still has many unknowns in it, and
> > > until those are worked out I don't want to delay moving to
> cross-platform
> > > scripts for build-time tasks.
> > >
> > > Best regards,
> > > --Matt
> > >
> >
>
>
>
> --
> Alejandro
>

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Alejandro Abdelnur
Matt, thanks for the clarification.

I may have missed the main point of the PROPOSAL thread then. I personally
want to continue the discussion before voting.

* Python as a runtime requirement. Are you planning to migrate all BASH
scripts provided by Hadoop (or dynamically created, i.e., launcher scripts)
to Python?
* What else in the current build, besides saveVersion.sh, do you see as a
candidate to be migrated to Python?
* How are you planning to define which Python modules can be used? Will
developers have to install them manually?

Cheers


On Thu, Nov 29, 2012 at 2:39 PM, Matt Foley <[hidden email]> wrote:

> Hi Alejandro,
> Please see in-line below.
>
> On Mon, Nov 26, 2012 at 1:52 PM, Alejandro Abdelnur <[hidden email]>
>  wrote:
>
> > Matt,
> >
> > The scope of this vote seems different from what was discussed in the
> > PROPOSAL thread.
> > In the PROPOSAL thread you indicated this was for Hadoop1 because it is
> ANT
> > based. And the main reason was to remove saveVersion.sh.
> > Your #3  was not discussed in the proposal, was it?
> >
>
> The item #3 was in my original statement of the problem, with which I
> started the proposal thread.  In fact, the thread title was "[PROPOSAL]
> introduce Python as build-time and run-time dependency for Hadoop and
> throughout Hadoop stack".  It is true that only one or two people chose to
> discuss #3 further in that thread.
>
> The point is not just to replace a single script, but to provide a means to
> do cross-platform scripts, which will over time replace many
> non-platform-specific scripts written in platform-specific languages.
>
>
> >
> > It seems this vote is dragging in much more than was originally
> > discussed.
> > I think you should suspend the vote, recap the motivation and then
> restart
> > the vote.
> >
>
> I respectfully disagree.  I believe a careful reading of the cited
> discussion thread, plus my own statement of the vote, provides sufficient
> background for a thoughtful decision on the subject.  Presumably so do the
> ten other people who had already voted before you made that comment.
>
> If several other people want more discussion first, please speak up.
> Thanks,
> --Matt
>
> As things are laid out at the moment my vote is:
> >
> > -1 (It still seems like overkill to introduce a new runtime requirement for
> > building to replace a script.)
> > +1 (I think this is the right way to simplify the build)
> > -1 (AFAIK there is no such requirement at the moment, and if one comes it
> > would be in the form of an AM, which I'd argue should live outside of
> > Hadoop)
> >
> > Thx
> >
> >
> > On Mon, Nov 26, 2012 at 1:16 PM, Giridharan Kesavan <
> > [hidden email]> wrote:
> >
> > > +1, +1, +1
> > >
> > > -Giri
> > >
> > >
> > > On Sat, Nov 24, 2012 at 12:13 PM, Matt Foley <[hidden email]> wrote:
> > >
> > > > For discussion, please see previous thread "[PROPOSAL] introduce
> Python
> > > as
> > > > build-time and run-time dependency for Hadoop and throughout Hadoop
> > > stack".
> > > >
> > > > This vote consists of three separate items:
> > > >
> > > > 1. Contributors shall be allowed to use Python as a
> > platform-independent
> > > > scripting language for build-time tasks, and add Python as a
> build-time
> > > > dependency.
> > > > Please vote +1, 0, -1.
> > > >
> > > > 2. Contributors shall be encouraged to use Maven tasks in combination
> > > with
> > > > either plug-ins or Groovy scripts to do cross-platform build-time
> > tasks,
> > > > even under ant in Hadoop-1.
> > > > Please vote +1, 0, -1.
> > > >
> > > > 3. Contributors shall be allowed to use Python as a
> > platform-independent
> > > > scripting language for run-time tasks, and add Python as a run-time
> > > > dependency.
> > > > Please vote +1, 0, -1.
> > > >
> > > > Note that voting -1 on #1 and +1 on #2 essentially REQUIRES
> > contributors
> > > to
> > > > use Maven plug-ins or Groovy as the only means of cross-platform
> > > build-time
> > > > tasks, or to simply continue using platform-dependent scripts as is
> > being
> > > > done today.
> > > >
> > > > Vote closes at 12:30pm PST on Saturday 1 December.
> > > > ---------
> > > > Personally, my vote is +1, +1, +1.
> > > > I think #2 is preferable to #1, but still has many unknowns in it,
> and
> > > > until those are worked out I don't want to delay moving to
> > cross-platform
> > > > scripts for build-time tasks.
> > > >
> > > > Best regards,
> > > > --Matt
> > > >
> > >
> >
> >
> >
> > --
> > Alejandro
> >
>



--
Alejandro

RE: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Ivan Mitic
In reply to this post by Matt Foley-2
+1, +1, +1 (some comments inline)

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Matt Foley
Sent: Saturday, November 24, 2012 12:13 PM
To: [hidden email]
Subject: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

For discussion, please see previous thread "[PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack".

This vote consists of three separate items:

1. Contributors shall be allowed to use Python as a platform-independent scripting language for build-time tasks, and add Python as a build-time dependency.
Please vote +1, 0, -1.

2. Contributors shall be encouraged to use Maven tasks in combination with either plug-ins or Groovy scripts to do cross-platform build-time tasks, even under ant in Hadoop-1.
Please vote +1, 0, -1.

>>> I believe 1&2 in combination make total sense. I ported a few scripts to Python, and thus far it has proven to be up to the task and to satisfy the cross-platform requirements. In my opinion, it is also important to agree on the version, as I've run into some breaking changes in version 3+.
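
One way to make an agreed version explicit is a guard that each script can call before doing any work. The following is only a minimal sketch: the supported range (2.6 up to, but not including, 3.0) and the names MIN_VERSION, MAX_MAJOR and is_supported are assumptions for illustration, since this thread does not settle on a version.

```python
import sys

# Hypothetical supported range; the thread does not settle on a version.
MIN_VERSION = (2, 6)
MAX_MAJOR = 2


def is_supported(version=None):
    """Return True if the given (or running) interpreter is in the range."""
    if version is None:
        version = sys.version_info
    major, minor = version[0], version[1]
    return (major, minor) >= MIN_VERSION and major <= MAX_MAJOR
```

A script could call is_supported() first and exit with a clear message when it returns False, rather than failing later on a syntax or library difference.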


3. Contributors shall be allowed to use Python as a platform-independent scripting language for run-time tasks, and add Python as a run-time dependency.

>>> This is a great aspirational goal! Maintaining two sets of scripts would be a real challenge.


Please vote +1, 0, -1.

Note that voting -1 on #1 and +1 on #2 essentially REQUIRES contributors to use Maven plug-ins or Groovy as the only means of cross-platform build-time tasks, or to simply continue using platform-dependent scripts as is being done today.

Vote closes at 12:30pm PST on Saturday 1 December.
---------
Personally, my vote is +1, +1, +1.
I think #2 is preferable to #1, but still has many unknowns in it, and until those are worked out I don't want to delay moving to cross-platform scripts for build-time tasks.

Best regards,
--Matt


Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Radim Kolar-2
In reply to this post by Alejandro Abdelnur

> * What else in the current build, besides saveVersion.sh, do you see as a
> candidate to be migrated to Python?

Inline Ant scripts.

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

Alejandro Abdelnur
In reply to this post by Alejandro Abdelnur
Matt,

Let me repost my previous questions and a few more. I'd appreciate your
answers, as they will help me understand the full impact this would have on
Hadoop and related projects.

* Python as runtime requirement. Are you planning to migrate all BASH
scripts provided by Hadoop (or dynamically created, i.e. launcher scripts)
to Python?
* What else in the current build, besides saveVersion.sh, do you see as a
candidate to be migrated to Python?
* How are you planning to define what Python modules can be used? Will
developers have to install them manually?
* What kind of tasks do you envision Python scripts will enable that are not
possible today?
* Will the requirement of Python be pushed to clients using the hadoop
script? If so, this would affect all downstream projects that use the hadoop
script in one way or another, right?

Is the main motivation of the proposal to make things easier for Windows, so
there is no need for cygwin? If that is the case, have you considered writing
BAT scripts directly? If you take Tomcat for example, they have BAT scripts
and SH scripts and things work quite nicely.

Personally, I wouldn't be thrilled to see the logic in the scripts get
more complex; quite the opposite. IMO, scripts should be trimmed
to set env vars (with no voodoo logic), build the classpath (with no voodoo
logic, just from a set of dirs) and call Java.
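
To illustrate how small such a trimmed launcher could be, here is a minimal sketch. It is written in Python purely for illustration and is not an existing Hadoop script; the function names, the "*.jar" pattern and the JAVA_HOME lookup are all assumptions.

```python
import glob
import os


def build_classpath(lib_dirs):
    """Join every .jar found directly in lib_dirs, sorted within each dir."""
    jars = []
    for lib_dir in lib_dirs:
        jars.extend(sorted(glob.glob(os.path.join(lib_dir, "*.jar"))))
    return os.pathsep.join(jars)


def build_command(main_class, lib_dirs, args, env=None):
    """Return the argument list that would start the JVM; nothing is run."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    # Fall back to whatever "java" is on the PATH when JAVA_HOME is unset.
    java = os.path.join(java_home, "bin", "java") if java_home else "java"
    return [java, "-cp", build_classpath(lib_dirs), main_class] + list(args)
```

The returned list can be handed to subprocess.call to start the JVM. Keeping command construction separate from execution means the classpath logic can be checked without Java installed.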

Finally, this is a code change, so I'm not sure why we are doing a vote.

Thx.

On Thu, Nov 29, 2012 at 3:26 PM, Alejandro Abdelnur <[hidden email]> wrote:

> Matt, thanks for the clarification.
>
> I may have missed the main point of the PROPOSAL thread then. I personally
> want to continue the discussion before voting.
>
> * Python as runtime requirement. Are you planning to migrate all BASH
> scripts provided by Hadoop (or dynamically created, i.e. launcher scripts)
> to Python?
> * What else in the current build, besides saveVersion.sh, do you see as a
> candidate to be migrated to Python?
> * How are you planning to define what Python modules can be used? Will
> developers have to install them manually?
>
> Cheers
>
>
>
>
>
> --
> Alejandro
>



--
Alejandro