How do I run a job on every node without an input file?

김형준
Generally, a map task runs on a specific node determined by FileSplit.getSplit().
In my case, I have no input file, and I want to run a task on every node
in the Hadoop cluster.
It's like LSF or another job queue.

How do I run a job on every node without an input file?

Re: How do I run a job on every node without an input file?

Owen O'Malley-5

On Apr 2, 2007, at 1:48 AM, 김형준 wrote:

> Generally, a map task runs on a specific node determined by FileSplit.getSplit().
> In my case, I have no input file, and I want to run a task on every node
> in the Hadoop cluster.
> It's like LSF or another job queue.
>
> How do I run a job on every node without an input file?

There are a couple of approaches that would work. One example is
RandomWriter, which takes no input and just writes a set of random
data files. It defines an InputFormat that generates the requested
number of splits (and therefore maps) and creates a string for each
one; each map is then given a single record containing its string.
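
As a rough illustration of that pattern, here is a stand-alone sketch in plain Java. It does not use the Hadoop classes: GeneratedSplits, StringSplit, getSplits, recordReader and the "dummy-split-" strings are all made-up names that only mirror the idea of an input format producing N file-less splits, each read back as exactly one record.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Stand-alone model of an input format that reads no files.
// None of these types are Hadoop classes; they are hypothetical stand-ins.
class GeneratedSplits {

    /** A split that carries only a generated string, not a file range. */
    static final class StringSplit {
        final String payload;

        StringSplit(String payload) {
            this.payload = payload;
        }
    }

    /** Create the requested number of splits; each would become one map. */
    static List<StringSplit> getSplits(int numSplits) {
        List<StringSplit> splits = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            splits.add(new StringSplit("dummy-split-" + i));
        }
        return splits;
    }

    /** Reader that yields exactly one record: the split's own string. */
    static Iterator<String> recordReader(final StringSplit split) {
        return new Iterator<String>() {
            private boolean consumed = false;

            @Override
            public boolean hasNext() {
                return !consumed;
            }

            @Override
            public String next() {
                if (consumed) {
                    throw new NoSuchElementException();
                }
                consumed = true;
                return split.payload;
            }
        };
    }
}
```

In a real job the map function would receive that single record and do its work (writing random data, in RandomWriter's case) instead of reading input.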

-- Owen

Re: How do I run a job on every node without an input file?

Doug Cutting
김형준 wrote:
> How do I run a job on every node without an input file?

Define an InputFormat that returns InputSplits. The number of input
splits returned is usually determined by job.getNumMapTasks(), but you
could instead use JobClient.getClusterStatus().getTaskTrackers() to run
one task per node.

InputSplits need not name files or other external data. They simply
contain the required parameters for each task, if any.
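
A minimal sketch of that idea in plain Java, without the Hadoop classes: PerNodeSplits, ParamSplit and the "command" parameter are hypothetical names, and taskTrackerCount is assumed to be the number obtained from JobClient.getClusterStatus().getTaskTrackers(). Each split holds only parameters and names no file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Stand-alone model, not Hadoop code: sizes the job to the cluster by
// creating one parameter-only split per task tracker.
class PerNodeSplits {

    /** A split that names no file; it only holds the task's parameters. */
    static final class ParamSplit {
        final int taskIndex;
        final Map<String, String> params;

        ParamSplit(int taskIndex, Map<String, String> params) {
            this.taskIndex = taskIndex;
            // Defensive copy so later changes by the caller do not leak in.
            this.params = Collections.unmodifiableMap(new TreeMap<>(params));
        }
    }

    /**
     * Build one split per task tracker.
     * taskTrackerCount is assumed to come from the cluster status.
     */
    static List<ParamSplit> getSplits(int taskTrackerCount,
                                      Map<String, String> params) {
        if (taskTrackerCount < 0) {
            throw new IllegalArgumentException("taskTrackerCount must be >= 0");
        }
        List<ParamSplit> splits = new ArrayList<>(taskTrackerCount);
        for (int i = 0; i < taskTrackerCount; i++) {
            splits.add(new ParamSplit(i, params));
        }
        return splits;
    }
}
```

For example, PerNodeSplits.getSplits(4, params) yields four splits with task indexes 0 to 3. This only matches the number of tasks to the number of nodes; where each task actually runs is still up to the scheduler.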

Doug