HDFS >> mail # user >> Dealing with stragglers in hadoop


Dealing with stragglers in hadoop
Hi,
I have a very simple use case. I have an edge list and I am trying to convert
it into an adjacency list. For example:

src  target
a    b
a    c
b    d
b    e

and so on. What I am trying to build is:

a [b,c]
b [d,e]
and so on.
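
To make the target transformation concrete, here is a minimal sketch in plain Python rather than Hadoop code. It only mimics what a reducer keyed on the source node id would produce; the function name `edges_to_adjacency` is made up for illustration.

```python
from collections import defaultdict


def edges_to_adjacency(edges):
    """Group (src, target) pairs by src, as a reducer keyed on node id would."""
    adjacency = defaultdict(list)
    for src, target in edges:
        # Every edge with the same src ends up in the same list,
        # which is why one very large node means one very large group.
        adjacency[src].append(target)
    return dict(adjacency)


# The sample edge list from the message above.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
adjacency = edges_to_adjacency(edges)

for node, targets in adjacency.items():
    print(node, targets)
```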

But every now and then I hit a super node, which has millions of edges.

Keying on just the node id therefore results in poor MR execution, because the
reducer that receives the super node becomes a straggler.

I have been trying to understand the partitioner, but I am at a loss as to how
to use it here.

How do I solve this straggler issue?
Thanks
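
One commonly used way to approach this kind of skew is key salting: instead of keying on the node id alone, key on (node id, salt) so that a super node's edges are spread over several reducers, then combine the partial lists afterwards. The sketch below is plain Python that simulates the idea, not Hadoop code and not something taken from the message above. `NUM_SALTS`, `salted_key`, `first_pass` and `second_pass` are hypothetical names, and the bucket count of 4 is an arbitrary assumption.

```python
import zlib
from collections import defaultdict

NUM_SALTS = 4  # assumption: number of buckets each node's edges are spread over


def salted_key(src, target, num_salts=NUM_SALTS):
    # Derive the salt from the target so the same edge always maps to the same key.
    salt = zlib.crc32(target.encode("utf-8")) % num_salts
    return (src, salt)


def first_pass(edges, num_salts=NUM_SALTS):
    """Simulates a job keyed on (src, salt): each group holds a partial list."""
    partial = defaultdict(list)
    for src, target in edges:
        partial[salted_key(src, target, num_salts)].append(target)
    return partial


def second_pass(partial):
    """Simulates a follow-up job that drops the salt and joins the partial lists."""
    merged = defaultdict(list)
    for (src, _salt), targets in partial.items():
        merged[src].extend(targets)
    return dict(merged)


# A hypothetical super node "hub" with 1000 edges, plus one ordinary node.
edges = [("hub", f"n{i}") for i in range(1000)] + [("a", "b"), ("a", "c")]
partial = first_pass(edges)
merged = second_pass(partial)

largest_group = max(len(targets) for targets in partial.values())
print("largest first-pass group:", largest_group)
print("edges for hub after merge:", len(merged["hub"]))
```

Note that the second pass still gathers the whole list for the super node in one place, although it only has to join a handful of partial lists. If downstream consumers can work with an adjacency list split into chunks, the second pass could be skipped altogether.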