parse OutOfMemoryError?

Hi Everyone,

I am using MapReduce and DFS for a crawl + index operation. Parsing
relatively small segments (about 50,000 - 60,000 URLs) works fine. But
when I try to parse a larger segment (600,000 - 700,000 URLs), the job is
stopped by an OutOfMemoryError on the tasktrackers during the map phase:

"java.lang.OutOfMemoryError: Java heap space"

Is this expected as segments grow larger, or is it a bug waiting to be
examined? I have been trying to solve the problem, without success. Could
somebody help me?