MapReduce, mail # user - Passing data via Configuration


Peter Cogan 2013-02-08, 15:15
Robert Evans 2013-02-08, 15:23
Re: Passing data via Configuration
Peter Cogan 2013-02-08, 19:51
Hi Rob,

Thanks for the explanation. I had also thought about 'cheating' by
serialising. I guess that's the way to go in my case, as the data structure
is really quite small.
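
For illustration, a minimal sketch of the serialise-then-base64 approach
described in this thread might look like the following. It uses only the
JDK: java.util.Base64 requires Java 8 or later, so on an older JVM another
base64 encoder would have to be substituted. The class name ConfigPayload
and the key "my.lookup" are hypothetical, and the Hadoop calls
(Configuration.set and Configuration.get) appear only in comments, so the
block runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashMap;

// Hypothetical helper: turns a small Serializable object into a string
// that fits in a Configuration value, and back again.
class ConfigPayload {

    // Java-serialize the object, then base64-encode the resulting bytes.
    static String encode(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse of encode: base64-decode, then deserialize.
    static Object decode(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, Integer> lookup = new HashMap<>();
        lookup.put("alpha", 1);
        lookup.put("beta", 2);

        String encoded = encode(lookup);
        // Driver side (assumed usage): conf.set("my.lookup", encoded);
        // Mapper.setup side (assumed usage):
        //   String encoded = context.getConfiguration().get("my.lookup");

        @SuppressWarnings("unchecked")
        HashMap<String, Integer> restored = (HashMap<String, Integer>) decode(encoded);
        System.out.println(restored.equals(lookup)); // prints "true"
    }
}
```

Base64 output is roughly a third larger than the serialized bytes, which
is one more reason to keep this to small structures.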

thanks!
On Fri, Feb 8, 2013 at 3:23 PM, Robert Evans <[EMAIL PROTECTED]> wrote:

> You could, but this is generally discouraged.  Pig does something like
> this: it serializes the object into a byte array and then base64-encodes
> the bytes into a string that is put in the config.
>  The problem with this is that the config can grow very large.  In the 1.0
> line of Hadoop the maximum size of the Job's config is limited to avoid
> causing the Job Tracker to run out of memory.  In V2 this is less of a
> concern because it is your own application master that has to read it all
> in.
>
> In general, if it is a very small amount of data you can play games like
> this; if it is a large amount of data, you probably want to use the
> distributed cache instead.
>
> --Bobby
>
> From: Peter Cogan <[EMAIL PROTECTED]>
> Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Date: Friday, February 8, 2013 9:15 AM
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: Passing data via Configuration
>
> Hi,
>
> I have data stored in an object that I want to pass into my Mapper.
>
> I see from Configuration that there are setters and getters for
> primitives, but is there a way of doing this with non-primitives, either
> my own classes or built-in classes (such as HashMap etc.)?
>
> thanks!
> Peter
>