[jira] Created: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib


[jira] Created: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)
Move Hadoop Abacus to hadoop.mapred.lib
---------------------------------------

                 Key: HADOOP-1290
                 URL: https://issues.apache.org/jira/browse/HADOOP-1290
             Project: Hadoop
          Issue Type: Improvement
            Reporter: Runping Qi



Owen and I discussed this issue and we both felt that it is appropriate to move Hadoop Abacus to the hadoop main framework.
Any comments/thoughts/concerns/objections?



--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491365 ]

Doug Cutting commented on HADOOP-1290:
--------------------------------------

Why?  I'd like to hear more of this discussion.

[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491369 ]

Runping Qi commented on HADOOP-1290:
------------------------------------


Mainly, I feel the Abacus package has proved useful and fits nicely into mapred.lib. Once it is in mapred.lib, it will be easier for other contrib modules, such as streaming, to use it.


[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491377 ]

Doug Cutting commented on HADOOP-1290:
--------------------------------------

Looking at:

http://lucene.apache.org/hadoop/api/org/apache/hadoop/abacus/package-summary.html

I agree.  These look to be of general utility.  +1

Should we put some of these in lib.aggregate, or all just in lib?

[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491395 ]

Runping Qi commented on HADOOP-1290:
------------------------------------


I'd put them all in lib.abacus (currently they are all in contrib.abacus).


[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491399 ]

Doug Cutting commented on HADOOP-1290:
--------------------------------------

bq. I'd put them all in lib.abacus [ ... ]

I'd encourage a more descriptive name, like 'aggregate'.  The convention is that only projects use meaningless names; all names within a project should attempt to be descriptive.


[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491435 ]

Runping Qi commented on HADOOP-1290:
------------------------------------


Then, lib.aggregate is fine with me.


[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491522 ]

Nigel Daley commented on HADOOP-1290:
-------------------------------------

If abacus is going into the main framework, I think we should require some unit tests for these classes.

[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491525 ]

Runping Qi commented on HADOOP-1290:
------------------------------------

Sure.


[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Attachment:     (was: patch-1284.txt)

[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Attachment: patch-1284.txt


This patch implements the proposed protocol.

With this patch, the streaming user can specify a field separator for the mapper's output and/or a field separator
for the reducer's output. The default is the tab character.

The user can also specify how many fields in the output constitute the key. The default is 1.
The rest of the line is the value.

A partitioner class, KeyFieldBasedPartitioner in mapred.lib, is also implemented.
The user can specify how many fields of the map output key
are used for partitioning.

A utility class, FieldSelectionMapReduce in mapred.lib, is also added. This class allows the
user to create map/reduce jobs that manipulate text data like the Unix cut utility.
The user can specify the field separator (the delimiter for cut), which fields to select, and
which fields to partition/sort by.

Two unit tests are introduced.
All the unit tests passed.
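
For readers following along, here is a rough Java sketch of the three ideas described above: splitting a line into key and value by separator and key-field count, partitioning on a prefix of the key, and cut-like field selection. It is not the code from the patch. The class FieldSketch and its method names are hypothetical, and the handling of lines with too few separators (whole line becomes the key) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: FieldSketch and its methods are hypothetical names,
// not the classes or signatures from the patch.
class FieldSketch {

  // Splits a line into {key, value}; the first numKeyFields fields form the key.
  // Assumption: a line with too few separators becomes all key, empty value.
  static String[] splitKeyValue(String line, String separator, int numKeyFields) {
    if (separator.isEmpty() || numKeyFields < 1) {
      throw new IllegalArgumentException(
          "need a non-empty separator and at least one key field");
    }
    int keyEnd = -1; // index of the separator that terminates the key
    int from = 0;    // where to resume searching
    for (int i = 0; i < numKeyFields; i++) {
      keyEnd = line.indexOf(separator, from);
      if (keyEnd < 0) {
        return new String[] { line, "" };
      }
      from = keyEnd + separator.length();
    }
    return new String[] { line.substring(0, keyEnd), line.substring(from) };
  }

  // Picks a partition from the first numPartitionFields fields of the key,
  // so keys that share that prefix land in the same partition.
  static int partition(String key, String separator,
                       int numPartitionFields, int numPartitions) {
    String prefix = splitKeyValue(key, separator, numPartitionFields)[0];
    // Mask the sign bit so the result is never negative.
    return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }

  // Cut-like selection: keeps the listed zero-based fields, in the order given.
  // Out-of-range indices are skipped (an assumption for this sketch).
  static String selectFields(String line, String separator, int[] fields) {
    String[] parts = line.split(Pattern.quote(separator), -1);
    List<String> kept = new ArrayList<>();
    for (int f : fields) {
      if (f >= 0 && f < parts.length) {
        kept.add(parts[f]);
      }
    }
    return String.join(separator, kept);
  }

  public static void main(String[] args) {
    String line = "us\tca\t42\tx";
    String[] kv = splitKeyValue(line, "\t", 2);
    System.out.println("key=[" + kv[0] + "] value=[" + kv[1] + "]");
    // Both keys share the first field, so they map to the same partition.
    System.out.println(
        partition("us\tca", "\t", 1, 4) == partition("us\tny", "\t", 1, 4));
    System.out.println(selectFields(line, "\t", new int[] { 2, 0 }));
  }
}
```

Partitioning on fewer fields than the full key is what lets related records reach the same reducer while still being sorted on the remaining key fields.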


[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491719 ]

Runping Qi commented on HADOOP-1290:
------------------------------------


Oops, wrong JIRA.


[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Attachment: patch_1290.txt


This patch adds abacus code to mapred.lib.aggregate package.

It includes one unit test for the new code.

After a release with this patch, users should be guided to use this package instead of
contrib/abacus. Sometime down the road, contrib/abacus should be removed in a future
release.

[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Status: Patch Available  (was: Open)

[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492120 ]

Hadoop QA commented on HADOOP-1290:
-----------------------------------

-1, new javadoc warnings

The javadoc tool appears to have generated warning messages when testing the latest attachment http://issues.apache.org/jira/secure/attachment/12356366/patch_1290.txt against trunk revision r532871.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/85/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/85/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-1290:
---------------------------------

    Status: Open  (was: Patch Available)

bq. After a release with this patch, the user should be guided to use this package instead of using contrib/abacus

Shouldn't we then deprecate all the classes in contrib/abacus?

[jira] Commented: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492125 ]

Runping Qi commented on HADOOP-1290:
------------------------------------


Sure. I was just too lazy to deprecate those classes one by one :)



[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Attachment:     (was: patch_1290.txt)

[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Status: Patch Available  (was: Open)

[jira] Updated: (HADOOP-1290) Move Hadoop Abacus to hadoop.mapred.lib

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1290:
-------------------------------

    Attachment: patch_1290.txt


Deprecated the classes in contrib/abacus.

Fixed a few javadoc warnings.
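
For anyone unfamiliar with the pattern, deprecating an old class while keeping it usable might look roughly like the Java sketch below. The class names are hypothetical stand-ins, not the actual abacus or aggregate classes, and the patch may well do it differently. The old class simply extends its replacement and carries both the @Deprecated annotation and a @deprecated javadoc tag pointing at the new home.

```java
// Hypothetical names throughout; a sketch of the deprecation pattern only.
class DeprecationDemo {
  @SuppressWarnings("deprecation")
  public static void main(String[] args) {
    // Existing callers of the old class keep working, but get a compiler warning.
    LongSumSketch s = new OldLongSumSketch();
    s.add(2);
    s.add(3);
    System.out.println(s.getSum());
    // @Deprecated is retained at runtime, so tools can detect it reflectively.
    System.out.println(
        OldLongSumSketch.class.isAnnotationPresent(Deprecated.class));
  }
}

// Stand-in for a class that now lives in the new package.
class LongSumSketch {
  private long sum;

  void add(long v) {
    sum += v;
  }

  long getSum() {
    return sum;
  }
}

/**
 * Stand-in for the old contrib class, kept so existing jobs still compile.
 *
 * @deprecated use {@link LongSumSketch} instead.
 */
@Deprecated
class OldLongSumSketch extends LongSumSketch {
}
```

Keeping the old class as a thin subclass means there is only one implementation to maintain until the contrib copy is finally removed.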

