Pig >> mail # user >> replicated join and map tasks


replicated join and map tasks
Sorry for the previous incomplete message.
Here is take 2:

When I use a replicated join, only 2 map tasks get scheduled (compared to
100+ tasks for the other steps).
What is the idea behind this? Which setting do I use to override this
behaviour?
Also, a basic question.
Does Hadoop decide the map task capacity itself, or does it simply follow
the configuration?

Map Task Capacity | Reduce Task Capacity | Avg. Tasks/Node | Blacklisted Nodes | Excluded Nodes
64                | 20                   | 1.00            |                   |

Thanks, Prashant.
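
For context on the replicated join asked about above, here is a toy Python sketch of the general idea behind a map-side (fragment-replicate) join: each map task holds the small relation in memory and streams one split of the large relation, so no reduce phase is needed. This is only an illustration, not Pig's implementation. The function names, the sample rows, and the "one map task per split" model are assumptions made for the sketch. In this model the map task count simply follows the number of splits of the large input, which may or may not be what happens on the cluster described above.

```python
from collections import defaultdict


def replicated_join_map_task(big_split, small_relation, key=0):
    """Simulate one map task: hash the small relation, stream one big split.

    Hypothetical helper for illustration only.
    """
    # Build an in-memory lookup table from the small (replicated) relation.
    lookup = defaultdict(list)
    for row in small_relation:
        lookup[row[key]].append(row)

    # Stream the rows of this split and emit joined tuples for matching keys.
    joined = []
    for row in big_split:
        for match in lookup.get(row[key], []):
            joined.append(row + match)
    return joined


def run_replicated_join(big_splits, small_relation):
    """Run one simulated map task per split of the large input (no reducers)."""
    outputs = [replicated_join_map_task(s, small_relation) for s in big_splits]
    rows = [r for out in outputs for r in out]
    return len(outputs), rows


# Made-up sample data: the large relation arrives as two splits.
big_splits = [
    [(1, "a"), (2, "b")],
    [(3, "c"), (1, "d")],
]
small = [(1, "x"), (3, "y")]

num_map_tasks, rows = run_replicated_join(big_splits, small)
print(num_map_tasks)
print(rows)
```

In this sketch two splits of the large input give two simulated map tasks, regardless of how many map slots the cluster offers.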