Replicated join and map tasks
Sorry for the previous incomplete message.
Here is take 2:

When I use a replicated join, only 2 map tasks get scheduled (compared to
100+ tasks for the other steps).
What is the idea behind this? Which setting can I use to override this
behaviour?
Also, a basic question:
Does Hadoop decide the map task capacity itself, or does it simply follow the
configuration?

Map Task Capacity   Reduce Task Capacity   Avg. Tasks/Node   Blacklisted Nodes   Excluded Nodes
64                  20                     1.00

Thanks, Prashant.