MapReduce >> mail # dev >> modify data distribution in jobconf


mohak gupta 2012-01-01, 07:29
Arun C Murthy 2012-01-02, 07:26
Re: modify data distribution in jobconf
Mohak,

I assume you mean the child JVMs that are spawned by the TaskTrackers. It is
still not clear, though, what you are trying to achieve; I'd say do a little
more research.

You might want to check this out:
http://blog.imaginea.com/hadoop-a-short-guide/ (take a look at the MapReduce
part).

-P
On Mon, Jan 2, 2012 at 12:56 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:

> I'm not sure what you are trying to achieve here.
>
> Hadoop MapReduce works by *trying* to schedule tasks on nodes on which
> data is 'close', either node-local or rack-local.
>
> It doesn't try to 'start'/'stop' nodes. If that is what you are trying to
> do, you need to look for something else.
>
> Arun
>
> On Dec 31, 2011, at 11:29 PM, mohak gupta wrote:
>
> > hi
> >
> > as part of my project I need to modify the data distribution layer in
> > jobconf so as to achieve the following:
> >
> > 1) control which worker nodes should be started, based on the input data
> > given to them.
> >
> > 2) keep other worker nodes in some kind of sleep state.
> >
> > 3) based on the output emitted by the worker nodes and the data
> > distributed, allow other worker nodes to start.
> >
> > 4) Perform this in a looping structure till the output is achieved.
> >
> > basically I wish to control which worker nodes perform map and reduce
> > functions based on the data they have received.
> >
> > Could you please tell me whether this can be achieved, and what
> > tradeoffs are involved? Any help is really appreciated.
> >
> > regards
> > Mohak Gupta
>
>
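
To make Arun's point about 'close' data concrete, here is a minimal sketch of a locality preference: given the nodes that hold replicas of an input split and the nodes that currently have a free slot, prefer node-local, then rack-local, then anything else. This is not Hadoop's scheduler code. The names Node, locality and pick_node are hypothetical and exist only for this illustration, and the real scheduler takes many more factors into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a worker node: a host name plus its rack.
    name: str
    rack: str


def locality(node, replica_nodes):
    """Return 0 for node-local, 1 for rack-local, 2 for off-rack."""
    if any(node.name == r.name for r in replica_nodes):
        return 0
    if any(node.rack == r.rack for r in replica_nodes):
        return 1
    return 2


def pick_node(free_nodes, replica_nodes):
    """Pick the free node closest to the data, or None if no node is free."""
    if not free_nodes:
        return None
    # min() keeps the first of equally close candidates.
    return min(free_nodes, key=lambda n: locality(n, replica_nodes))
```

Note that the choice is only a preference: if no node-local or rack-local slot is free, the task still runs somewhere else, and at no point are nodes started or stopped.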