Re: hanging context.write() with large arrays
For the timeout problem, you can use a background thread that invokes
context.progress() periodically, which acts as a "keep-alive" for the forked
Child (mapper/combiner/reducer)...
It is tricky, but it works.
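A minimal sketch of that keep-alive pattern, assuming the new (org.apache.hadoop.mapreduce) API; the class name and the 30-second interval are illustrative, not from this thread:

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch: a reducer that reports progress from a background thread while a
// long-running context.write() is in flight, so the framework does not kill
// the task for inactivity. Interval and names are illustrative assumptions.
public class KeepAliveReducer extends Reducer<Text, Text, Text, Text> {

  @Override
  protected void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {

    // Background "keep-alive" thread: calls context.progress() periodically.
    Thread keepAlive = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        context.progress();
        try {
          Thread.sleep(30_000L); // report progress every 30 seconds (assumed interval)
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    });
    keepAlive.setDaemon(true);
    keepAlive.start();

    try {
      // ... build the (potentially very large) output value here ...
      context.write(key, new Text("...")); // the slow call that used to time out
    } finally {
      keepAlive.interrupt(); // stop the keep-alive thread once the write returns
    }
  }
}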

On Sat, May 5, 2012 at 10:05 PM, Zuhair Khayyat <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am building a MapReduce application that constructs the adjacency list
> of a graph from an input edge list. I noticed that my Reduce phase always
> hangs (and eventually times out) as it calls the function
> context.write(Key_x, Value_x) when Value_x is a very large ArrayWritable
> (around 4M elements). I have increased both "mapred.task.timeout" and the
> Reducers' memory, but no luck; the reducer does not finish the job. Is there
> any other data format that supports large amounts of data, or should I use my
> own "OutputFormat" class to optimize writing such a large amount of data?
>
>
> Thank you.
> Zuhair Khayyat
>
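For reference, a minimal sketch of the reduce step the question describes: the key is a vertex and the values are its neighbours, collected into one ArrayWritable and emitted with a single context.write(). The class and variable names are illustrative assumptions, not code from the original post.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch of an adjacency-list reducer: all neighbours of a vertex are packed
// into one ArrayWritable, so a vertex with ~4M neighbours produces a single,
// very long-running context.write() call.
public class AdjacencyListReducer extends Reducer<Text, Text, Text, ArrayWritable> {

  @Override
  protected void reduce(Text vertex, Iterable<Text> neighbours, Context context)
      throws IOException, InterruptedException {
    List<Text> adjacency = new ArrayList<>();
    for (Text neighbour : neighbours) {
      adjacency.add(new Text(neighbour)); // copy: Hadoop reuses the value object
    }
    ArrayWritable value = new ArrayWritable(Text.class,
        adjacency.toArray(new Text[0]));
    context.write(vertex, value); // the call that hangs for very large arrays
  }
}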