[jira] Created: (HADOOP-1532) Distcp should support verification modes

[jira] Created: (HADOOP-1532) Distcp should support verification modes

ASF GitHub Bot (Jira)
Distcp should support verification modes
----------------------------------------

                 Key: HADOOP-1532
                 URL: https://issues.apache.org/jira/browse/HADOOP-1532
             Project: Hadoop
          Issue Type: New Feature
          Components: util
            Reporter: Senthil Subramanian
             Fix For: 0.14.0


distcp does not currently support any verification after copying files. It should support:
1. verify quick (vq) mode, which compares the source and destination CRCs
2. verify long (vl) mode, which, in addition to the verify quick check, reads the entire destination file to catch DFS block-level errors
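
The difference between the two proposed modes can be sketched as follows. This is only an illustration under stated assumptions, not distcp code: the names file_crc, verify_quick and verify_long are hypothetical, local files stand in for DFS files, and zlib.crc32 stands in for whatever CRCs the file system records.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # read size in bytes; an arbitrary choice for this sketch


def file_crc(path):
    """Read the whole file in chunks and return its CRC-32."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def verify_quick(src_crc, dst_crc):
    """Quick mode: compare the recorded source and destination CRCs only."""
    return src_crc == dst_crc


def verify_long(src_crc, dst_crc, dst_path):
    """Long mode: the quick check, plus a full read of the destination.

    Re-reading the destination catches data that no longer matches its
    recorded CRC, which the quick comparison alone cannot see.
    """
    return verify_quick(src_crc, dst_crc) and file_crc(dst_path) == dst_crc
```

In this sketch the quick check costs no data reads at all, while the long check costs one full pass over the destination, which is the trade-off the two modes are meant to expose.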

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1532) Distcp should support verification modes

ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508339 ]

Doug Cutting commented on HADOOP-1532:
--------------------------------------

Will "verify long" still be needed once HDFS verifies checksums on write?

Currently, checksums are generated when writing files and verified when reading. When data is corrupted in memory before it is written, we can end up in a case where all replicas are corrupt and the data is unusable. But with HADOOP-1134 (or shortly thereafter), checksums can be validated on datanodes as data is written. Failing tasks can be retried until a write succeeds without corruption. Then data will only be unreadable if all block replicas are corrupted on disk, which is unlikely.
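
The write-time validation described above could look roughly like the sketch below. It is an assumption-laden illustration, not the HADOOP-1134 implementation: store_block, write_with_retry, ChecksumMismatch and the transport parameter are all hypothetical names, a dict stands in for datanode storage, and zlib.crc32 stands in for the real checksum.

```python
import zlib


class ChecksumMismatch(Exception):
    """Raised when received data does not match the checksum sent with it."""


def store_block(storage, block_id, payload, claimed_crc):
    """Hypothetical receiver: recompute the CRC before accepting the block."""
    if zlib.crc32(payload) & 0xFFFFFFFF != claimed_crc:
        raise ChecksumMismatch(block_id)
    storage[block_id] = payload


def write_with_retry(storage, block_id, data, transport=lambda b: b, max_attempts=3):
    """Hypothetical writer: checksum the data once, then retry rejected writes.

    `transport` models whatever happens to the bytes between checksum
    generation and storage; the default passes them through unchanged.
    Returns the number of attempts that were needed.
    """
    crc = zlib.crc32(data) & 0xFFFFFFFF
    for attempt in range(1, max_attempts + 1):
        try:
            store_block(storage, block_id, transport(data), crc)
            return attempt
        except ChecksumMismatch:
            continue  # corrupted on the way in; try again
    raise ChecksumMismatch(f"{block_id}: gave up after {max_attempts} attempts")
```

Because the receiver rejects a corrupted write instead of storing it, a bad copy never becomes a replica, so the remaining risk in this model is corruption that happens on disk after a successful write.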


[jira] Commented: (HADOOP-1532) Distcp should support verification modes

ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508605 ]

Senthil Subramanian commented on HADOOP-1532:
---------------------------------------------

I looked at HADOOP-1134 and also discussed with Milind the DFS improvements that are going to be in 0.14. Neither "verify long" nor "verify quick" will be needed once we are on 0.14.

[jira] Resolved: (HADOOP-1532) Distcp should support verification modes

ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Senthil Subramanian resolved HADOOP-1532.
-----------------------------------------

    Resolution: Invalid
