Peter Cogan 2013-02-08, 15:15
Robert Evans 2013-02-08, 15:23
Re: Passing data via Configuration
Hi Rob,

Thanks for the explanation. I had also thought about 'cheating' by
serialising, and I guess that's the way to go in my case, as the data
structure is really quite small.
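
A minimal sketch of that serialise-and-encode approach, for illustration only. It assumes the object implements java.io.Serializable and that java.util.Base64 (Java 8 or later) is available. The class name ConfigSerde, the key "example.lookup", and the plain Map standing in for the job's Configuration are all hypothetical, chosen so the snippet runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: turns a small Serializable object into a string
// that can be stored under a single configuration key, and back again.
final class ConfigSerde {
    private ConfigSerde() {}

    // Java-serialize the object, then Base64-encode the resulting bytes.
    static String encode(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse of encode: Base64-decode, then deserialize.
    static Object decode(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, Integer> lookup = new HashMap<>();
        lookup.put("alpha", 1);
        lookup.put("beta", 2);

        // Stand-in for the job Configuration (an assumption of this sketch).
        Map<String, String> conf = new HashMap<>();
        conf.put("example.lookup", encode(lookup));   // driver side

        // Mapper side: read the string back and rebuild the object.
        @SuppressWarnings("unchecked")
        Map<String, Integer> restored =
                (Map<String, Integer>) decode(conf.get("example.lookup"));
        System.out.println(restored.equals(lookup));
    }
}
```

In a real job the two Map calls would presumably become Configuration.set(key, value) in the driver and context.getConfiguration().get(key) in the Mapper's setup method. Base64 inflates the payload by roughly a third, so this only makes sense for small objects.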

On Fri, Feb 8, 2013 at 3:23 PM, Robert Evans <[EMAIL PROTECTED]> wrote:

> You could, but this is generally discouraged.  Pig does something like
> this: it serializes the object into a byte array, then Base64-encodes
> the bytes into a string that is put in the config.
> The problem with this is that the config can grow very large.  In the 1.0
> line of Hadoop the maximum size of the Job's config is limited to avoid
> causing the Job Tracker to run out of memory.  In V2 this is less of a
> concern because it is your own application master that has to read it all
> in.
> In general, if it is a very small amount of data you can play games like
> this; if it is a large amount of data you probably want to use the
> distributed cache instead.
> --Bobby
> From: Peter Cogan <[EMAIL PROTECTED]>
> Date: Friday, February 8, 2013 9:15 AM
> Subject: Passing data via Configuration
> Hi,
> I have data stored in an object that I want to pass into my Mapper.
> I see from Configuration that there are setters and getters for
> primitives, but is there a way of doing this with non-primitives, either
> my own classes or built-in classes (such as HashMap)?
> Thanks!
> Peter