[jira] Created: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker


[jira] Created: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
Investigate and fix the extremely large memory-footprint of JobTracker
----------------------------------------------------------------------

                 Key: HADOOP-815
                 URL: http://issues.apache.org/jira/browse/HADOOP-815
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.9.1
            Reporter: Arun C Murthy
         Assigned To: Arun C Murthy
             Fix For: 0.10.0


The JobTracker's memory footprint seems excessively large, especially when many jobs are submitted.

Here is the 'top' output of a JobTracker which has scheduled ~1k jobs thus far:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                    
31877 arunc     19   0 2362m 261m  13m S 14.0 12.9  24:48.08 java  

Clearly VIRTual memory of 2362 MB vs. 261 MB of RESident memory is symptomatic of this issue...

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/HADOOP-815?page=comments#action_12458069 ]
           
Arun C Murthy commented on HADOOP-815:
--------------------------------------

Change 1:

I plan to remove:
org.apache.hadoop.mapred.TaskInProgress.totalTaskIds (String[]) and
org.apache.hadoop.mapred.TaskInProgress.usableTaskIds (TreeSet)

and replace them with:
ArrayList<String> usableTaskIds

totalTaskIds isn't used anywhere except in org.apache.hadoop.mapred.TaskInProgress.init(), and we don't need usableTaskIds to be a TreeSet; an ArrayList should suffice...

(I'll keep updating this issue with proposed changes as I glean more info from memory profiles of the JobTracker.)
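
A minimal sketch of the idea behind Change 1, assuming task ids are plain strings handed out in creation order. The class name `TaskIdPool`, its methods and the id format are hypothetical and are not taken from the Hadoop source:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the id bookkeeping inside TaskInProgress.
class TaskIdPool {
  // A single list replaces both the String[] holding every id and the
  // TreeSet of unused ids; sorted order is not needed to hand out ids.
  private final List<String> usableTaskIds = new ArrayList<>();

  TaskIdPool(String prefix, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      usableTaskIds.add(prefix + "_" + i);
    }
  }

  // Returns the next unused id, or null once every id has been handed out.
  String take() {
    if (usableTaskIds.isEmpty()) {
      return null;
    }
    return usableTaskIds.remove(0);
  }

  int remaining() {
    return usableTaskIds.size();
  }

  public static void main(String[] args) {
    TaskIdPool pool = new TaskIdPool("task_0001_m_000000", 3);
    System.out.println(pool.take());
    System.out.println(pool.remaining());
  }
}
```

The list is tiny (one entry per permitted attempt), so removing from the front is cheap, while dropping the second collection saves per-task overhead.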


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/HADOOP-815?page=all ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Attachment: 75k_jobs.nps
                150k_1199_774.nps

Attached are two NetBeans memory profiles of the JobTracker (JT), taken after executing about 250 and 500 jobs respectively. Each job had 300 maps and 2 reduces, i.e. ~75k and ~150k tasks in total. At those points the JT was consuming ~500 MB and ~1 GB of memory respectively.


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/HADOOP-815?page=all ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Attachment: HADOOP-815_20061220_1.patch

Here is a very early patch while I keep testing; appreciate any feedback...


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/HADOOP-815?page=all ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Attachment: HADOOP-815_20061221_2.patch

Here is an updated patch (reflecting changes to trunk and some preliminary review by Devaraj).

After some thought I have done away with the call to jobtracker.removeTaskEntry from JobInProgress.failedTask and instead let the JobInProgress.garbageCollect -> jobtracker.finalizeJob handle it at the end of the job (success/failed/killed). Thoughts?
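
The approach described above, dropping per-task cleanup on failure and releasing everything in one pass when the job ends, might look roughly like this sketch. It assumes the tracker keeps a task-to-tracker map plus a per-job index of task ids; `TaskRegistry` and all of its members are hypothetical names, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified view of the JobTracker's task bookkeeping.
class TaskRegistry {
  private final Map<String, String> taskToTracker = new HashMap<>();
  private final Map<String, List<String>> jobToTasks = new HashMap<>();

  synchronized void addTaskEntry(String jobId, String taskId, String tracker) {
    taskToTracker.put(taskId, tracker);
    List<String> tasks = jobToTasks.get(jobId);
    if (tasks == null) {
      tasks = new ArrayList<>();
      jobToTasks.put(jobId, tasks);
    }
    tasks.add(taskId);
  }

  // Called once when the job ends (succeeded, failed or killed).
  // Removes every entry belonging to the job and returns how many were freed.
  synchronized int finalizeJob(String jobId) {
    List<String> tasks = jobToTasks.remove(jobId);
    if (tasks == null) {
      return 0;
    }
    for (String taskId : tasks) {
      taskToTracker.remove(taskId);
    }
    return tasks.size();
  }

  synchronized int liveEntries() {
    return taskToTracker.size();
  }
}
```

With a single cleanup point, an entry cannot leak just because one of several failure paths forgot to remove it.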


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/HADOOP-815?page=all ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Attachment: HADOOP-815_20061222_3.patch

Another take while I continue further testing... appreciate any feedback.


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/HADOOP-815?page=all ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Attachment: HADOOP-815_20061230_4.patch
                jt_memory_profiles.tgz

Ok, here is a reasonably well-tested patch...

List of changes:
  a) Fixed HADOOP-740, i.e. ensure task entries are cleaned up on completion.
  b) Fixed HADOOP-787, i.e. ensure we keep only 100 jobs per user; the rest are available via jobhistory anyway.
  c) Fixed both JobTracker & TaskTracker to ensure that status-updates/heartbeatResponses lost due to lost RPCs are resent by the TaskTracker and the JobTracker respectively; the JobTracker can now also detect duplicate 'TaskTrackerStatus' updates and ignore them, which were otherwise fatal.
  d) Some miscellaneous fixes, like using an ArrayList instead of a TreeSet and an array for 'usableTaskIds' in TaskInProgress.java.
 
  Results:
  Currently, after running smallJobsBenchmark with 750 jobs, each with 300 maps & 2 reduces (i.e. a total of ~225,000 tasks), the memory footprint of the JobTracker is ~1.5 GB after 'RETIRE_JOB_INTERVAL'. (I suspect this also leads to the degradation of the JT's performance seen in HADOOP-843, since each of the JT's data structures is extremely bloated, leading to sluggishness.) With this patch the memory footprint is down to ~150 MB after 'RETIRE_JOB_INTERVAL'; yes, that's 150 MB! :) (It seems to solve HADOOP-843 too.)

  Appreciate any feedback...
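
Change (b) can be illustrated with a small sketch. It assumes completed jobs are tracked per user in completion order and that the oldest entry is dropped once a cap is exceeded; `CompletedJobCache`, its constructor parameter and its methods are hypothetical and do not reflect the patch's actual code:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical bounded cache of completed job ids, keyed by user.
class CompletedJobCache {
  private final int maxJobsPerUser;
  private final Map<String, Deque<String>> jobsByUser = new HashMap<>();

  CompletedJobCache(int maxJobsPerUser) {
    this.maxJobsPerUser = maxJobsPerUser;
  }

  // Records a completed job. Returns the id evicted to stay within the
  // cap, or null if nothing had to be evicted.
  synchronized String add(String user, String jobId) {
    Deque<String> jobs = jobsByUser.get(user);
    if (jobs == null) {
      jobs = new ArrayDeque<>();
      jobsByUser.put(user, jobs);
    }
    jobs.addLast(jobId);
    return jobs.size() > maxJobsPerUser ? jobs.removeFirst() : null;
  }

  synchronized int retained(String user) {
    Deque<String> jobs = jobsByUser.get(user);
    return jobs == null ? 0 : jobs.size();
  }
}
```

Under this scheme memory held for finished jobs is bounded by the number of users times the cap, rather than growing with every job submitted.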



[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/HADOOP-815?page=all ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/HADOOP-815?page=comments#action_12461560 ]
           
Hadoop QA commented on HADOOP-815:
----------------------------------

-1, because the patch command could not apply the latest attachment (http://issues.apache.org/jira/secure/attachment/12348120/jt_memory_profiles.tgz) as a patch to trunk revision r489707. Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.


[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ http://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12461750 ]

Devaraj Das commented on HADOOP-815:
------------------------------------

Looks good (except some whitespace changes which should be removed).


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-815:
---------------------------------

    Status: Open  (was: Patch Available)

1. The TaskInProgress.usableTaskIds should be removed and a new task id generated when needed. That list has been bothering me for a while. *smile*

2. You replace a new style for loop with an old style for loop on TaskTracker.java line 597, which should stay a new style loop.

3. The trackerToMarkedTaskMap isn't synchronized by a lock, so has race conditions.

4. There are spacing diffs.
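
Point 1 could look roughly like the following sketch, which assumes a per-task attempt counter is enough to derive a fresh id whenever one is needed. `OnDemandTaskIds` and the id format are hypothetical, not Hadoop's actual scheme:

```java
// Hypothetical replacement for a pre-built list of usable task ids.
class OnDemandTaskIds {
  private final String baseId;
  private int nextAttempt = 0;

  OnDemandTaskIds(String baseId) {
    this.baseId = baseId;
  }

  // Builds a new id only when an attempt is actually launched, so no
  // collection of unused ids has to be kept in memory.
  synchronized String newTaskId() {
    return baseId + "_" + nextAttempt++;
  }

  public static void main(String[] args) {
    OnDemandTaskIds ids = new OnDemandTaskIds("task_0001_m_000003");
    System.out.println(ids.newTaskId());
    System.out.println(ids.newTaskId());
  }
}
```

The method is synchronized here so that two callers can never receive the same id, which also touches on the locking concern in point 3.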


[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462326 ]

Doug Cutting commented on HADOOP-815:
-------------------------------------

> There are spacing diffs.

On formatting, indentation is also four spaces per level rather than the preferred two.

http://wiki.apache.org/lucene-hadoop/HowToContribute



[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462418 ]

Arun C Murthy commented on HADOOP-815:
--------------------------------------

> 1. The TaskInProgress.usableTaskIds should be removed and a new task id generated when needed. That list has been bothering me for a while. *smile*
  Ok, coming up. :)


> 2. You replace a new style for loop with an old style for loop on TaskTracker.java line 597, which should stay a new style loop.
  It is a subtle change due to the fact that org.apache.hadoop.mapred.TaskTrackerStatus.getTaskReports returns an 'Iterator'. Should I fix the (public) API to return 'List<TaskStatus>' instead?


> 3. The trackerToMarkedTaskMap isn't synchronized by a lock, so has race conditions.
  The functions manipulating 'trackerToMarkedTaskMap' assume that the JobTracker itself is locked on entry (much like the existing 'removeTaskEntry'). Would it help if I put in a comment there? Or should I explicitly mark the functions as 'synchronized'?

> On formatting, indentation is also four-spaces per level rather than the preferred two.
  Ok, I'll fix it. I spent time hitting the 'tab' key to ensure compliance with the existing indentation in some functions. *rueful smile*

  Related thought: should we open a new issue to track and ensure 'all' code is indented with 2 spaces instead of 4 (as in some places)? :)
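
On point 3, one option besides a comment or a 'synchronized' keyword is to check the "lock already held" assumption at runtime. The sketch below uses the standard `Thread.holdsLock`; the `Tracker` class, its methods and the simplified map contents are hypothetical and only mirror the name 'trackerToMarkedTaskMap' from the discussion:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, stripped-down stand-in for the JobTracker.
class Tracker {
  private final Map<String, Set<String>> trackerToMarkedTaskMap = new HashMap<>();

  // Helper that relies on the caller already holding this object's lock.
  // The check turns a silent race into an immediate, visible failure.
  void markTask(String trackerName, String taskId) {
    if (!Thread.holdsLock(this)) {
      throw new IllegalStateException("Tracker lock must be held by the caller");
    }
    Set<String> tasks = trackerToMarkedTaskMap.get(trackerName);
    if (tasks == null) {
      tasks = new HashSet<>();
      trackerToMarkedTaskMap.put(trackerName, tasks);
    }
    tasks.add(taskId);
  }

  // Public entry point: acquires the lock, then calls the helper.
  synchronized void heartbeat(String trackerName, String taskId) {
    markTask(trackerName, taskId);
  }

  synchronized int markedCount(String trackerName) {
    Set<String> tasks = trackerToMarkedTaskMap.get(trackerName);
    return tasks == null ? 0 : tasks.size();
  }
}
```

Marking the helper 'synchronized' as well would also be safe, since Java monitors are reentrant; the explicit check merely documents and enforces the existing calling convention.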


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


    Attachment: HADOOP-815_20070105_5.patch


[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462437 ]

Arun C Murthy commented on HADOOP-815:
--------------------------------------

Ok, that was me with a new patch incorporating:
> 1. The TaskInProgress.usableTaskIds should be removed and a new task id generated when needed. That list has been bothering me for a while. *smile*
> On formatting, indentation is also four-spaces per level rather than the preferred two.
 (I've indented all 'new' code with 2 spaces, while any code added to functions which already used 4 is kept as before... does that sound reasonable?)

I'll serve up new patches post-discussion on points 2 & 3.


Re: [jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

Doug Cutting
In reply to this post by JIRA jira@apache.org
Arun C Murthy (JIRA) wrote:
>   Related thought: should we open a new issue to track and ensure 'all' code is indented with 2 spaces instead of 4 (as in some places)? :)

Sure, we could re-indent all of the code.  We'd need to use the -l
option to 'patch' for all outstanding patches.  And we'd have to be
careful, as there are probably a few places where automatic indenters do
the wrong thing.  So the output should be read carefully.

In the meantime, I think it's worthwhile to make sure that new code is
well formatted and indented.

Doug

[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462573 ]

Owen O'Malley commented on HADOOP-815:
--------------------------------------

Please mark the TaskTrackerStatus.taskReports() method as deprecated and add a new:

public List<TaskStatus> getTaskReports() { ... }

I'm pretty sure there are some contexts where the JobTracker is not locked. I'll take a look.
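
The deprecation requested above could look roughly like the following. This is only a minimal sketch, not the actual patch: the TaskStatus and TaskTrackerStatus classes here are simplified stand-ins, the Iterator return type of the old taskReports() method is an assumption, and returning an unmodifiable view is a choice made for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for org.apache.hadoop.mapred.TaskStatus.
class TaskStatus {
  private final String taskId;

  TaskStatus(String taskId) { this.taskId = taskId; }

  String getTaskId() { return taskId; }
}

// Simplified stand-in for TaskTrackerStatus; only the accessor pair is shown.
class TaskTrackerStatus {
  private final List<TaskStatus> taskReports = new ArrayList<TaskStatus>();

  // Helper for the illustration only; not part of the proposal.
  void addTaskReport(TaskStatus status) { taskReports.add(status); }

  /**
   * Old accessor, kept so that existing callers still compile.
   * The Iterator return type is an assumption about the old signature.
   * @deprecated use {@link #getTaskReports()} instead.
   */
  @Deprecated
  public Iterator<TaskStatus> taskReports() {
    return getTaskReports().iterator();
  }

  /** New accessor with the signature proposed in the comment above. */
  public List<TaskStatus> getTaskReports() {
    // Read-only view, so callers cannot modify the internal list (a design choice).
    return Collections.unmodifiableList(taskReports);
  }
}
```

Having the deprecated method delegate to the new one keeps a single source of truth while old call sites are migrated.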


[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462580 ]

Arun C Murthy commented on HADOOP-815:
--------------------------------------

> Please change the TaskTrackerStatus.taskReports() method to be deprecated and make a new:
> public List<TaskStatus> getTaskReports() { ... }
Ok, will do. Another patch coming up...

> I'm pretty sure there are some contexts where the JobTracker is not locked. I'll take a look.
I've walked Devaraj through the whole synchronization logic here. Basically, the functions assume that the JobTracker is locked on entry (as noted in the javadoc), and that holds because most of the calls originate from:

public synchronized JobTracker.heartbeat() -> JobTracker.processHeartbeat() -> JobTracker.updateTaskStatuses()

I'd definitely appreciate another, closer look at this... Thanks!
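
One way to make the "locked on entry" assumption checkable is to verify it at runtime with Thread.holdsLock. The following is a minimal sketch under that idea, not code from the patch: JobTrackerSketch, checkLocked() and unsafeUpdate() are hypothetical names, and the method bodies are placeholders.

```java
// Hypothetical, heavily simplified stand-in for the JobTracker call chain.
class JobTrackerSketch {
  private int statusUpdates = 0;

  // Entry point: synchronized, so the lock is held for the whole chain below.
  public synchronized int heartbeat() {
    processHeartbeat();
    return statusUpdates;
  }

  // Assumes the JobTracker is locked on entry.
  private void processHeartbeat() {
    checkLocked();
    updateTaskStatuses();
  }

  // Assumes the JobTracker is locked on entry.
  private void updateTaskStatuses() {
    checkLocked();
    statusUpdates++;
  }

  // Fails fast if this code path is reached without holding the lock.
  private void checkLocked() {
    if (!Thread.holdsLock(this)) {
      throw new IllegalStateException("JobTracker lock not held");
    }
  }

  // Deliberately unsynchronized path, included only to show the check firing.
  public void unsafeUpdate() {
    updateTaskStatuses();
  }
}
```

Calling heartbeat() passes the check because the monitor is already held, while unsafeUpdate() throws IllegalStateException, which is the kind of unlocked context a closer review would be looking for.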


[jira] Updated: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-815:
---------------------------------

    Attachment: HADOOP-815_20070106_6.patch


[jira] Commented: (HADOOP-815) Investigate and fix the extremely large memory-footprint of JobTracker

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462587 ]

Arun C Murthy commented on HADOOP-815:
--------------------------------------

Here is a new patch incorporating Owen's comments; I have added:
public List<TaskStatus> getTaskReports() { ... }


