[URGENT] Help with using extra information while processing input data using map technique

Pallavi Palleti
Hi All,
 I am new to Hadoop and learning how to use it. I have a problem that can be solved with the map-reduce technique. However, in my map step I need to consult some extra information that depends on the input key/value pair. Can someone please suggest a good way of accessing this data? I am thinking of storing it in HDFS and having the map code load it whenever it processes a particular key/value pair. If there are better approaches, please let me know.


Thanks in advance
Re: [URGENT] Help with using extra information while processing input data using map technique

Ted Dunning-3

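There is a special class for storing key-to-value lookup tables in HDFS.

Look at MapFile (org.apache.hadoop.io.MapFile), a sorted, indexed file of
key/value pairs that supports lookup by key.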

Also, any mapper can contact an outside resource such as a database.  This
is, however, very bad practice, since the load on the outside resource can
skyrocket as your cluster grows.  If the cost of the request is small
relative to the work of the map, then this might be kind-of sort-of OK, but
maps are often very cheap to do, which means that any dependence on a single
external resource can be really bad for throughput.
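
To make the scaling argument concrete, here is a rough sketch in plain Java (no Hadoop API involved) that simply counts requests hitting a shared resource. ExternalResource, SideDataDemo and both methods are hypothetical names invented for this illustration, and the per-task cache is only one assumed way of cutting down the traffic, not something prescribed above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a shared outside resource (e.g. a database).
// It does no real work; it only counts the requests it receives.
class ExternalResource {
    private int requests = 0;

    String lookup(String key) {
        requests++;
        return "extra-for-" + key;
    }

    int requestCount() {
        return requests;
    }
}

class SideDataDemo {
    // Every map task asks the shared resource once per input record,
    // so total requests = tasks * records per task.
    static int perRecordLookups(int mapTasks, List<String> keysPerTask) {
        ExternalResource resource = new ExternalResource();
        for (int task = 0; task < mapTasks; task++) {
            for (String key : keysPerTask) {
                resource.lookup(key);
            }
        }
        return resource.requestCount();
    }

    // Assumed mitigation: each task keeps a local cache, so it asks
    // the shared resource at most once per distinct key it sees.
    static int cachedLookups(int mapTasks, List<String> keysPerTask) {
        ExternalResource resource = new ExternalResource();
        for (int task = 0; task < mapTasks; task++) {
            Map<String, String> cache = new HashMap<>();
            for (String key : keysPerTask) {
                cache.computeIfAbsent(key, resource::lookup);
            }
        }
        return resource.requestCount();
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "a", "c", "a", "b");
        System.out.println(perRecordLookups(10, keys));   // 10 tasks
        System.out.println(perRecordLookups(100, keys));  // 100 tasks
        System.out.println(cachedLookups(100, keys));     // 100 tasks, cached
    }
}
```

The request count in the first method grows in direct proportion to the number of map tasks, which is the effect described above. Keeping the side data local to each task (for instance by reading it from a file in HDFS) takes the shared resource out of the per-record path altogether.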


On 7/16/07 6:19 AM, "novice user" <[hidden email]> wrote:

>
> Hi All,
>  I am new to hadoop and learning how to use it. I have a problem which can
> be solvable using map-reduce technique. But, in my map step, I need to
> consider some extra information which depends on  the input key,value pair.
> Can some one please help me what is the good way of taking this data? I am
> thinking of storing it in some HDFS and map code try to load it whenever it
> is processing a particular key, value pair. any other better approaches,
> please let me know.
>
>
> Thanks in advance