[jira] Created: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

[jira] Created: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org
dfs "du" shows that the size of a subdirectory is 0
---------------------------------------------------

                 Key: HADOOP-909
                 URL: https://issues.apache.org/jira/browse/HADOOP-909
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.10.1
            Reporter: Hairong Kuang


dfs "du" is implemented by sending a listPaths request to the namenode to get the size of each file/subdir under the directory. On the namenode side, the size of a subdir used to be calculated by recursively traversing the whole subtree rooted at that subdir. Starting with release 0.10.0, however, the size of a subdir is no longer calculated, so dfs "du" shows its size as 0.

The problem is that both "du" and "list" send the same "listPaths" request to the namenode. The previous implementation made list very expensive, but the current implementation breaks du.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] Assigned: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang reassigned HADOOP-909:
------------------------------------

    Assignee: Hairong Kuang


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466610 ]

Hairong Kuang commented on HADOOP-909:
--------------------------------------

After some further investigation, it turns out that the new FsShell does not handle the "du" command correctly. The size of a directory tree should be fetched using getContentLength, not getLength.
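To illustrate the difference, here is a minimal sketch using a toy in-memory tree. The class `Entry` and its methods `length()` and `contentLength()` are hypothetical stand-ins for the getLength/getContentLength distinction described above; they are not Hadoop's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a file system entry; not Hadoop's actual API.
class Entry {
    final String name;
    final boolean isDir;
    final long fileBytes;                       // meaningful only for files
    final List<Entry> children = new ArrayList<>();

    private Entry(String name, boolean isDir, long fileBytes) {
        this.name = name;
        this.isDir = isDir;
        this.fileBytes = fileBytes;
    }

    static Entry file(String name, long bytes) {
        return new Entry(name, false, bytes);
    }

    static Entry dir(String name, Entry... kids) {
        Entry d = new Entry(name, true, 0L);
        for (Entry k : kids) {
            d.children.add(k);
        }
        return d;
    }

    // Analogue of getLength: a directory has no length of its own.
    long length() {
        return isDir ? 0L : fileBytes;
    }

    // Analogue of getContentLength: total bytes in the whole subtree.
    long contentLength() {
        if (!isDir) {
            return fileBytes;
        }
        long total = 0L;
        for (Entry c : children) {
            total += c.contentLength();
        }
        return total;
    }
}

class DuSymptomSketch {
    public static void main(String[] args) {
        Entry logs = Entry.dir("logs", Entry.file("a.log", 100), Entry.file("b.log", 250));
        Entry root = Entry.dir("root", logs, Entry.file("readme", 10));
        // A "du" that prints length() reports 0 for the subdirectory.
        for (Entry e : root.children) {
            System.out.println(e.name + "\tlength=" + e.length()
                    + "\tcontentLength=" + e.contentLength());
        }
    }
}
```

In this sketch the `logs` directory has a length of 0 but a content length of 350, which mirrors the symptom reported in this issue.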



[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-909:
---------------------------------

    Attachment: du.patch


[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-909:
---------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466646 ]

Hadoop QA commented on HADOOP-909:
----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12349417/du.patch applied and successfully tested against trunk revision r498829.


[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-909:
--------------------------------

    Status: Open  (was: Patch Available)

I think the default implementation in FileSystem#getContentLength should, when passed a directory, list it and recursively sum the contentLength of its contents.  That way it will work correctly for LocalFileSystem, S3FileSystem, etc.  HDFS has an optimized implementation that sums server-side.
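A rough sketch of that shape, under stated assumptions: `SketchFileSystem`, `InMemoryFileSystem`, and `PrefixSummingFileSystem` are hypothetical names invented for illustration, with string paths and simplified signatures. This is not the code in du.patch or Hadoop's actual FileSystem API. The base class supplies the recursive default, and one subclass overrides it with a single-pass sum that stands in for a server-side computation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical base class; names and signatures are illustrative only.
abstract class SketchFileSystem {
    abstract boolean isDirectory(String path);
    abstract List<String> list(String dir);   // full paths of direct children
    abstract long getLength(String file);     // bytes in a single file

    // Default: list the directory and recursively sum its contents.
    long getContentLength(String path) {
        if (!isDirectory(path)) {
            return getLength(path);
        }
        long total = 0L;
        for (String child : list(path)) {
            total += getContentLength(child);
        }
        return total;
    }
}

// In-memory subclass that simply inherits the recursive default.
class InMemoryFileSystem extends SketchFileSystem {
    protected final Map<String, Long> files = new HashMap<>();
    protected final Map<String, List<String>> dirs = new HashMap<>();

    void addFile(String path, long bytes) { files.put(path, bytes); }
    void addDir(String path, List<String> children) { dirs.put(path, children); }

    @Override boolean isDirectory(String path) { return dirs.containsKey(path); }
    @Override List<String> list(String dir) { return dirs.get(dir); }
    @Override long getLength(String file) { return files.get(file); }
}

// Subclass standing in for a store that can total a subtree in one pass.
class PrefixSummingFileSystem extends InMemoryFileSystem {
    @Override
    long getContentLength(String path) {
        if (!isDirectory(path)) {
            return getLength(path);
        }
        String prefix = path.endsWith("/") ? path : path + "/";
        long total = 0L;
        for (Map.Entry<String, Long> e : files.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                total += e.getValue();
            }
        }
        return total;
    }
}

class ContentLengthSketch {
    static void populate(InMemoryFileSystem fs) {
        fs.addFile("/data/a", 100L);
        fs.addFile("/data/sub/b", 250L);
        fs.addDir("/data/sub", Arrays.asList("/data/sub/b"));
        fs.addDir("/data", Arrays.asList("/data/a", "/data/sub"));
    }

    public static void main(String[] args) {
        InMemoryFileSystem generic = new InMemoryFileSystem();
        PrefixSummingFileSystem optimized = new PrefixSummingFileSystem();
        populate(generic);
        populate(optimized);
        // Both paths should agree on the total for the same tree.
        System.out.println(generic.getContentLength("/data"));
        System.out.println(optimized.getContentLength("/data"));
    }
}
```

Putting the recursion in the base class means any subclass that can list and stat gets a correct result without extra work, while a subclass with a cheaper route can still override it.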


[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-909:
---------------------------------

    Attachment: du.patch

The new patch reflects Doug's suggestion. Thank you, Doug.


[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-909:
---------------------------------

    Attachment:     (was: du.patch)


[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-909:
---------------------------------

    Fix Version/s: 0.11.0
           Status: Patch Available  (was: Open)


[jira] Updated: (HADOOP-909) dfs "du" shows that the size of a subdirectory is 0

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-909:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Hairong!
