Writing large output kills job with timeout - need ideas
I am running a mapper job which generates a large number of output records
for every input record: about 32,000,000,000 output records from about 150
mappers, each record about 200 bytes. The job is failing with timeouts.

When I alter the code to do exactly what it did before but emit only 1 in
100 output records, it runs to completion with no difficulty.

I believe I am saturating some local resource on the mapper, but this gets
WAY beyond my knowledge of what is going on internally. Any bright ideas?
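
For reference, here is a minimal sketch of the kind of fan-out mapper
described above (not the poster's actual code; the class name and fan-out
counts are hypothetical), written against the new org.apache.hadoop.mapreduce
API. Hadoop kills a task that does not report status within
mapred.task.timeout (10 minutes by default), and a commonly suggested
mitigation for long stretches of pure output writing is to call
context.progress() periodically so the framework knows the task is alive:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that emits many output records per input record.
public class FanOutMapper extends Mapper<LongWritable, Text, Text, Text> {

    private static final int RECORDS_PER_INPUT = 1000000; // illustrative fan-out
    private static final int PROGRESS_INTERVAL = 10000;

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (int i = 0; i < RECORDS_PER_INPUT; i++) {
            context.write(new Text(value.toString() + "-" + i), value);
            // Report liveness so the framework does not kill the task
            // while it spends a long time doing nothing but writes.
            if (i % PROGRESS_INTERVAL == 0) {
                context.progress();
            }
        }
    }
}

Raising mapred.task.timeout in the job configuration is the blunter
alternative, if the writes turn out to be healthy but simply slow.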
--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com