[jira] [Resolved] (HADOOP-15208) DistCp to offer -xtrack <path> option to save src/dest filesets as alternative to delete()

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved HADOOP-15208.
----------------------------------------------
       Resolution: Duplicate
    Fix Version/s:     (was: 3.1.0)

> DistCp to offer -xtrack <path> option to save src/dest filesets as alternative to delete()
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15208
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15208
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: tools/distcp
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, HADOOP-15208-002.patch, HADOOP-15208-003.patch
>
>
> There are opportunities to improve distcp delete performance and scalability with object stores, but you need to test with production datasets to determine whether the optimizations work, don't run out of memory, etc.
> By adding an option to save the sequence files of the source and destination listings, people (myself included) can experiment with different strategies before committing to one which doesn't scale.
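
As an illustration of the kind of experiment saved listings would allow, here is a minimal sketch of one possible strategy: a streaming merge over two sorted listings that reports entries present only in the destination, i.e. the candidates for deletion. This is not the code from the attached patches. The class and method names are hypothetical, and plain sorted strings stand in for the saved sequence-file listings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of DistCp: compares two sorted path listings.
final class ListingDiff {

    /**
     * Returns the entries that appear in the sorted destination listing
     * but not in the sorted source listing. Both inputs are read once,
     * in order, so only the result list is held in memory.
     */
    static List<String> destOnly(Iterator<String> source, Iterator<String> dest) {
        List<String> extra = new ArrayList<>();
        String s = source.hasNext() ? source.next() : null;
        while (dest.hasNext()) {
            String d = dest.next();
            // Advance the source cursor until it reaches or passes d.
            while (s != null && s.compareTo(d) < 0) {
                s = source.hasNext() ? source.next() : null;
            }
            // No matching source entry: d exists only at the destination.
            if (s == null || !s.equals(d)) {
                extra.add(d);
            }
        }
        return extra;
    }

    public static void main(String[] args) {
        // Made-up sample listings, already sorted.
        List<String> src = Arrays.asList("/data/a", "/data/b", "/data/d");
        List<String> dst = Arrays.asList("/data/a", "/data/c", "/data/d", "/data/e");
        System.out.println(destOnly(src.iterator(), dst.iterator()));
        // prints [/data/c, /data/e]
    }
}
```

The merge assumes both listings are sorted in the same order. Whether that holds for real saved listings, and whether the result set itself fits in memory on a production dataset, is exactly what would need testing.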



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
