Re: MapReduce processing with extra (possibly non-serializable) configuration
Hazelcast is an interesting idea, but I was hoping that there is a way of
doing this in MapReduce. :-)

It did not look possible from the start, but I posted here just to make
sure I was not missing something.

So, I will serialize my data objects and use them accordingly.
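
A minimal sketch of that approach, assuming Java 8+ for java.util.Base64 and plain Java serialization. SplitConfig and SplitConfigCodec are hypothetical names standing in for the proprietary objects, not anything from Hadoop. The encoded string could then be stored under a key in the job Configuration, or written out by a custom InputSplit in its write(DataOutput) method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

// Hypothetical stand-in for one per-split configuration object.
class SplitConfig implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String sourceName;
    private final int blockSize;
    // Anything that cannot be serialized is marked transient; it is null
    // after decoding and has to be rebuilt on the task side.
    private transient Object cachedHandle;

    SplitConfig(String sourceName, int blockSize) {
        this.sourceName = sourceName;
        this.blockSize = blockSize;
    }

    String getSourceName() { return sourceName; }
    int getBlockSize() { return blockSize; }
    Object getCachedHandle() { return cachedHandle; }
}

// Hypothetical helper that turns a SplitConfig into a string and back.
final class SplitConfigCodec {
    private SplitConfigCodec() {}

    // Serialize the object graph and Base64-encode it so it fits in a string value.
    static String encode(SplitConfig config) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(config);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse of encode(): Base64-decode, then deserialize.
    static SplitConfig decode(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (SplitConfig) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        String encoded = encode(new SplitConfig("records.dat", 4096));
        SplitConfig restored = decode(encoded);
        System.out.println(restored.getSourceName() + " " + restored.getBlockSize());
    }
}
```

Every class reachable from the configuration object has to implement Serializable (or be marked transient) for this to work, which is the bulk of the effort described in the original question.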

Thanks!
On Thu, Feb 21, 2013 at 10:15 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> How do you imagine sending "data" of any kind (be it in object form,
> etc.) over the network to other nodes, without implementing or relying
> on a serialization for it? Are you looking for "easy" Java ways such
> as the distributed cache from Hazelcast, etc., where this may be taken
> care of for you automatically in some way? :)
>
> On Fri, Feb 22, 2013 at 2:40 AM, Public Network Services
> <[EMAIL PROTECTED]> wrote:
> > Hi...
> >
> > I am trying to put an existing file processing application into Hadoop and
> > need to find the best way of propagating some extra configuration per split,
> > in the form of complex and proprietary custom Java objects.
> >
> > The general idea is:
> >
> >   1. A custom InputFormat splits the input data.
> >   2. The same InputFormat prepares the appropriate configuration for each
> >      split.
> >   3. Hadoop processes each split in MapReduce, using the split itself and
> >      the corresponding configuration.
> >
> > The problem is that these configuration objects contain a lot of properties
> > and references to other complex objects, and so on, therefore it will take a
> > lot of work to cover all the possible combinations and make the whole thing
> > serializable (if it can be done in the first place).
> >
> > Most probably this is the only way forward, but if anyone has ever dealt
> > with this problem, please suggest the best approach to follow.
> >
> > Thanks!
> >
>
>
>
> --
> Harsh J
>
feng lu 2013-02-22, 06:24