HDFS, mail # user - Dealing with stragglers in hadoop


Dealing with stragglers in hadoop
jamal sasha 2013-11-15, 08:44
Hi,
  I have a very simple use case: I have an edge list and I am trying to
convert it into an adjacency list. For example, given

src  target
a    b
a    c
b    d
b    e

and so on, what I am trying to build is

a [b,c]
b [d,e]
and so on.
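
To make the transformation concrete, here is a minimal sketch of it in plain
Python rather than actual Hadoop code. The function names map_edge and
build_adjacency are made up for illustration; the grouping dictionary stands
in for the shuffle that delivers all values for one key to one reducer.

```python
from collections import defaultdict

def map_edge(line):
    # "Map" step: emit (src, target), keyed on the source node id.
    src, target = line.split()
    return src, target

def build_adjacency(lines):
    # Grouping by key stands in for the shuffle; the per-key list is
    # what a single reducer would receive and write out.
    grouped = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        src, target = map_edge(line)
        grouped[src].append(target)
    return dict(grouped)

edge_lines = ["a b", "a c", "b d", "b e"]
adjacency = build_adjacency(edge_lines)
for node, neighbors in adjacency.items():
    print(node, "[" + ",".join(neighbors) + "]")
```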

But every now and then I hit a super node, which has millions of edges.

Keying on just the node id therefore results in poor MR execution, because
the one reducer that receives the super node becomes a straggler.
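
The skew can be reproduced with a small simulation, sketched here in plain
Python under some assumptions: the node name "hub", the edge count of 100000,
and the 4 reducers are invented for illustration, and crc32 is only a
deterministic stand-in for a hash partitioner (Hadoop's default
HashPartitioner uses key.hashCode() modulo the number of reduce tasks).

```python
import zlib
from collections import Counter

def partition(key, num_reducers):
    # Stand-in for a hash partitioner: the same key always maps to the
    # same reducer. crc32 is used only to keep the example deterministic.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def reducer_loads(edges, num_reducers):
    # Count how many (src, target) records each reducer would receive
    # when the records are keyed on src alone.
    loads = Counter()
    for src, _target in edges:
        loads[partition(src, num_reducers)] += 1
    return loads

# A few ordinary nodes plus one hypothetical super node called "hub".
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
edges += [("hub", "n%d" % i) for i in range(100000)]

loads = reducer_loads(edges, 4)
for reducer in sorted(loads):
    print("reducer", reducer, "records", loads[reducer])
```

Whichever reducer "hub" hashes to receives all 100000 of its edges, while the
others get a handful of records at most.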

I have been trying to understand the partitioner, but I am at a loss as to
how to use it here.

How do I solve this straggler issue?
Thanks