Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
MapReduce >> mail # user >> partition as block?


Copy link to this message
-
partition as block?
Hi guys:

Im wondering - if I'm running mapreduce jobs on a cluster with large block
sizes - can i increase performance with either:

1) A custom FileInputFormat

2) A custom partitioner

3) -DnumReducers

Clearly, (3) will be an issue due to the fact that it might overload tasks
and network traffic... but maybe (1) or (2) will be a precise way to "use"
partitions as a "poor mans" block.

Just a thought - not sure if anyone has tried (1) or (2) before in order to
simulate blocks and increase locality by utilizing the partition API.

--
Jay Vyas
http://jayunit100.blogspot.com
+
Mohammad Tariq 2013-04-30, 18:56
+
Jay Vyas 2013-05-01, 00:00
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB