What objects can be passed to Mapper and Reducer classes using Configuration?

Ilya Vishnevsky
There is a method Configuration.setObject(String name, Object value). I
tried values of different classes and found that only Strings are passed
correctly. Other objects turn out to be null when the Mapper or Reducer
tries to get them using getConf().getObject(String name).
Is it possible to pass a Set, for example?

Re: What objects can be passed to Mapper and Reducer classes using Configuration?

Briggs
Ya know, that is a good question.

In theory, as long as the object is serializable, I wouldn't see a
problem.  The issue I have is that the underlying holder for the
key/value pairs is an instance of the java.util.Properties class.  Now,
since Properties subclasses Hashtable, it is able to hold any object.
But the contract of java.util.Properties states that all keys and
values are strings, even though the inherited Hashtable interface
accepts anything.  So it's an odd use of the Properties class, because
Configuration is calling Properties.put() (which is inherited from
Hashtable).  This is just another example of where subclassing is
not always a good thing.  I don't understand why Sun didn't
encapsulate Hashtable within Properties.  It also looks like
Configuration.getObject() is just a direct call to
Properties.get(), which is just the Hashtable again...

Anyway, it should work.  Anyone else?
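
The java.util.Properties behaviour described above can be checked in isolation. What follows is a minimal sketch that uses only java.util.Properties, not Hadoop's Configuration; the class name PropertiesContractDemo and the key "my.key" are made up for illustration. It shows that a non-String value stored through the inherited put() is still returned by get(), but is invisible to the String-oriented accessors.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical demo class; it only exercises java.util.Properties.
final class PropertiesContractDemo {
    private PropertiesContractDemo() {}

    // Stores the value through put(), which Properties inherits from
    // Hashtable and which therefore accepts any Object.
    static Properties withRawValue(Object value) {
        Properties props = new Properties();
        props.put("my.key", value);
        return props;
    }

    public static void main(String[] args) {
        Set<Float> floats = new HashSet<>();
        floats.add(1.5f);

        Properties withSet = withRawValue(floats);
        // The raw Hashtable lookup hands back the very same object.
        System.out.println(withSet.get("my.key") == floats);
        // getProperty() returns null for any value that is not a String.
        System.out.println(withSet.getProperty("my.key"));
        // stringPropertyNames() omits entries whose value is not a String.
        System.out.println(withSet.stringPropertyNames().isEmpty());

        // A String value is visible through the String-oriented accessor.
        Properties withString = withRawValue("1.5");
        System.out.println(withString.getProperty("my.key"));
    }
}
```

This only illustrates the Properties side of the question. It does not by itself show what happens to the value once the job configuration leaves the submitting JVM.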



On 5/30/07, Ilya Vishnevsky wrote:
> There is a method Configuration.setObject(String name, Object value). I
> tried values of different classes and found that only Strings are passed
> correctly. Other objects turn out to be null when the Mapper or Reducer
> tries to get them using getConf().getObject(String name).
> Is it possible to pass a Set, for example?
>


--
"Conscious decisions by conscious minds are what make reality real"

Re: What objects can be passed to Mapper and Reducer classes using Configuration?

Andrzej Białecki-2
In reply to this post by Ilya Vishnevsky
Ilya Vishnevsky wrote:
> There is a method Configuration.setObject(String name, Object value). I
> tried values of different classes and found that only Strings are passed
> correctly. Other objects turn out to be null when the Mapper or Reducer
> tries to get them using getConf().getObject(String name).
> Is it possible to pass a Set, for example?

The short answer is that this method is deprecated and will be removed.
Don't use it.

Longer answer - see http://issues.apache.org/jira/browse/HADOOP-1343


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


RE: What objects can be passed to Mapper and Reducer classes using Configuration?

Ilya Vishnevsky
Well, as far as I understand, the Properties object is packed into an XML
file before being sent to the Mapper, so it can contain only strings.
That's reasonable.
But I need an array (or collection) of floats to be passed to each
Mapper. Of course I can save it to DFS so that the Mapper can read it.
But won't it slow Hadoop down if every Mapper reads the same file from
DFS? Maybe it's possible to attach this array to the job jar in some
way?



-----Original Message-----
From: Andrzej Bialecki
Sent: Wednesday, May 30, 2007 8:54 PM
Subject: Re: What objects can be passed to Mapper and Reducer classes using Configuration?

> The short answer is that this method is deprecated and will be removed.
> Don't use it.
>
> Longer answer - see http://issues.apache.org/jira/browse/HADOOP-1343

Re: What objects can be passed to Mapper and Reducer classes using Configuration?

Dennis Kubes
Two options.  Depending on the size of the array, you could encode your
floats as a comma-separated list in a string that you send through the
configuration as a variable.  Or, in the configure method of the mapper,
you can read in the values from the DFS and store them in a class
variable.  The file would then be read once per map task, not once per
map entry.  A third option is using a custom MapRunner, but I think that
is overkill for this.

Dennis Kubes
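
The comma-separated option can be sketched in plain Java. This is only an illustration under assumptions: FloatListCodec and its method names are hypothetical helpers, not part of Hadoop. The idea is that the encoded string would be put into the job configuration with something like Configuration.set(String, String) and decoded once in the mapper's configure method after reading it back with get(String).

```java
import java.util.StringJoiner;

// Hypothetical helper, not a Hadoop class: turns a float array into a
// single String (the only kind of value the configuration carries
// reliably) and back again.
final class FloatListCodec {
    private FloatListCodec() {}

    // Encodes the array as a comma-separated list, one entry per float.
    static String encode(float[] values) {
        StringJoiner joiner = new StringJoiner(",");
        for (float v : values) {
            // Float.toString produces text that Float.parseFloat reads
            // back to the same float value.
            joiner.add(Float.toString(v));
        }
        return joiner.toString();
    }

    // Decodes a string produced by encode(); a missing or empty value
    // yields an empty array.
    static float[] decode(String encoded) {
        if (encoded == null || encoded.isEmpty()) {
            return new float[0];
        }
        String[] parts = encoded.split(",");
        float[] values = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Float.parseFloat(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        float[] weights = {0.5f, 1.25f, -3.0f};
        String encoded = encode(weights);   // value to store in the configuration
        float[] decoded = decode(encoded);  // what the mapper would rebuild
        System.out.println(encoded + " -> " + decoded.length + " floats");
    }
}
```

Decoding once in configure and keeping the result in a field means the parsing cost is paid per map task rather than per record. For a large array the string grows accordingly, which is where the DFS option becomes more attractive.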
