Pig >> mail # user >> replicated join and map tasks


Re: replicated join and map tasks
You mention 2 map tasks for the join versus 100+ in the other steps. What
are the "other" steps here?

On your second question, I think you are asking about the Map and Reduce Task
Capacity shown on the JobTracker page? That is governed by configuration
properties set before Hadoop is started on the cluster.
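
As a rough sketch of where that capacity figure comes from, assuming a
Hadoop 1.x cluster: each TaskTracker reads its slot counts from
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum at startup (both default to 2), and
the JobTracker page shows the sum across TaskTrackers. The host names and
slot numbers below are made up for illustration and are not taken from your
cluster.

```python
# Property names as used in Hadoop 1.x mapred-site.xml.
MAP_SLOTS_KEY = "mapred.tasktracker.map.tasks.maximum"
REDUCE_SLOTS_KEY = "mapred.tasktracker.reduce.tasks.maximum"
DEFAULT_SLOTS = 2  # Hadoop 1.x default for both properties


def cluster_capacity(tasktrackers):
    """Sum the configured map and reduce slots over all TaskTrackers.

    `tasktrackers` maps a host name to that host's (partial) configuration.
    """
    map_capacity = sum(
        conf.get(MAP_SLOTS_KEY, DEFAULT_SLOTS) for conf in tasktrackers.values()
    )
    reduce_capacity = sum(
        conf.get(REDUCE_SLOTS_KEY, DEFAULT_SLOTS) for conf in tasktrackers.values()
    )
    return map_capacity, reduce_capacity


# Hypothetical three-node cluster, 4 map slots and 2 reduce slots per node.
example_cluster = {
    "node1": {MAP_SLOTS_KEY: 4, REDUCE_SLOTS_KEY: 2},
    "node2": {MAP_SLOTS_KEY: 4, REDUCE_SLOTS_KEY: 2},
    "node3": {MAP_SLOTS_KEY: 4, REDUCE_SLOTS_KEY: 2},
}

map_capacity, reduce_capacity = cluster_capacity(example_cluster)
print("Map Task Capacity:", map_capacity)
print("Reduce Task Capacity:", reduce_capacity)
```

So the capacity follows the configuration rather than being decided by Hadoop
at run time; changing it means editing those properties and restarting the
TaskTrackers.
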
On Mon, Apr 30, 2012 at 7:54 AM, shan s <[EMAIL PROTECTED]> wrote:

> Sorry for the previous incomplete message.
> Here is the take 2:
>
> When I use a Replicated Join, only 2 map tasks get scheduled (compared to
> 100+ tasks for the other steps).
> What is the idea behind this? What setting do I use to override this
> behaviour?
>
>
> Also, a basic question.
> Does Hadoop decide the map task capacity, or does it simply follow the
> configuration?
>
> Map Task Capacity   Reduce Task Capacity   Avg. Tasks/Node   Blacklisted Nodes   Excluded Nodes
> 64                  20                     1.00
>
> Thanks, Prashant.
>