MapReduce >> mail # user >> I am trying to run a large job and it is consistently failing with timeout - nothing happens for 600 sec


I am trying to run a large job and it is consistently failing with timeout - nothing happens for 600 sec
The map tasks fail, timing out after 600 sec.
I am processing one 9 GB file with 16,000,000 records. Each record (think
of it as a line) generates hundreds of key-value pairs.
The job is unusual in that the output of the mapper, in terms of records or
bytes, is orders of magnitude larger than the input.
I have no idea what is slowing down the job, except that the problem is in
the writes.

If I change the job to merely bypass a fraction of the context.write
statements, the job succeeds.
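
A minimal sketch of what bypassing a fraction of the writes could look like, while also timing the writes that do go through. `SampledTimedWriter` and its `BiConsumer` sink are hypothetical stand-ins for illustration, not the job's actual code and not a Hadoop API. In a real mapper the sink would wrap `context.write`, which throws checked exceptions and would need handling inside the lambda.

```java
import java.util.function.BiConsumer;

/**
 * Hypothetical helper (not part of Hadoop): forwards only every
 * keepEvery-th key/value pair to the sink and accumulates the time
 * spent inside the forwarded calls.
 */
final class SampledTimedWriter<K, V> {
    private final BiConsumer<K, V> sink; // stand-in for context.write
    private final int keepEvery;
    private long calls;        // all write attempts, forwarded or not
    private long forwarded;    // writes actually passed to the sink
    private long nanosInSink;  // cumulative time spent inside the sink

    SampledTimedWriter(BiConsumer<K, V> sink, int keepEvery) {
        if (keepEvery < 1) {
            throw new IllegalArgumentException("keepEvery must be >= 1");
        }
        this.sink = sink;
        this.keepEvery = keepEvery;
    }

    void write(K key, V value) {
        // Bypass every call except the 1st, (keepEvery+1)-th, and so on.
        if (calls++ % keepEvery != 0) {
            return;
        }
        long start = System.nanoTime();
        sink.accept(key, value);
        nanosInSink += System.nanoTime() - start;
        forwarded++;
    }

    long calls() { return calls; }
    long forwarded() { return forwarded; }
    long nanosInSink() { return nanosInSink; }
}
```

Comparing `nanosInSink()` against the task's wall-clock time would indicate whether the time is really going into the writes or into something else the mapper does.
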
These are the counters for one map task that failed and one that
succeeded. I cannot understand how a write can take so long
or what else the mapper might be doing.

JOB FAILED WITH TIMEOUT

*Parser*
  TotalProteins                   90,103
  NumberFragments             10,933,089
*FileSystemCounters*
  HDFS_BYTES_READ             67,245,605
  FILE_BYTES_WRITTEN         444,054,807
*Map-Reduce Framework*
  Combine output records      10,033,499
  Map input records               90,103
  Spilled Records             10,032,836
  Map output bytes         3,520,182,794
  Combine input records       10,844,881
  Map output records          10,933,089

Same code but fewer writes
JOB SUCCEEDED

*Parser*
  TotalProteins                   90,103
  NumberFragments            206,658,758
*FileSystemCounters*
  FILE_BYTES_READ            111,578,253
  HDFS_BYTES_READ             67,245,607
  FILE_BYTES_WRITTEN         220,169,922
*Map-Reduce Framework*
  Combine output records       4,046,128
  Map input records               90,103
  Spilled Records              4,046,128
  Map output bytes           662,354,413
  Combine input records        4,098,609
  Map output records           2,066,588

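
The ratios implied by the two counter sets can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the figures reported above; the class and method names are made up for illustration. Both runs come out at roughly 320 bytes per map output record, so what differs between them is the number of records written per input record (about 121 versus about 23), not the record size.

```java
/** Hypothetical helper for sanity-checking the reported task counters. */
final class CounterRatios {
    /** Average size in bytes of one map output record. */
    static double bytesPerRecord(long mapOutputBytes, long mapOutputRecords) {
        return (double) mapOutputBytes / mapOutputRecords;
    }

    /** Average number of map output records per map input record. */
    static double recordsPerInput(long mapOutputRecords, long mapInputRecords) {
        return (double) mapOutputRecords / mapInputRecords;
    }

    public static void main(String[] args) {
        // Counters copied from the failed task.
        long failedBytes = 3_520_182_794L;
        long failedRecords = 10_933_089L;
        // Counters copied from the task that succeeded with fewer writes.
        long okBytes = 662_354_413L;
        long okRecords = 2_066_588L;
        // Both tasks report the same number of map input records.
        long inputRecords = 90_103L;

        System.out.printf("failed:    %.1f bytes/record, %.1f records/input%n",
                bytesPerRecord(failedBytes, failedRecords),
                recordsPerInput(failedRecords, inputRecords));
        System.out.printf("succeeded: %.1f bytes/record, %.1f records/input%n",
                bytesPerRecord(okBytes, okRecords),
                recordsPerInput(okRecords, inputRecords));
    }
}
```
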
Any bright ideas?
--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com