MapReduce, mail # user - Writing large output kills job with timeout _ need ideas
-
Writing large output kills job with timeout _ need ideas
Steve Lewis 2012-01-18, 17:49
I am running a mapper job that generates a large number of output records for every input record: about 32,000,000,000 output records from about 150 mappers, each record about 200 bytes. The job is failing with timeouts. When I alter the code to do exactly what it did previously but output only 1 in 100 records, it runs to completion with no difficulty. I believe I am saturating some local resource on the mapper, but this goes well beyond my knowledge of what is going on internally. Any bright ideas?
--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
-
Re: Writing large output kills job with timeout _ need ideas
Radim Kolar 2012-01-26, 15:15
> Any bright ideas?
Call a status update or progress() at least once every 600 seconds.
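A minimal sketch of that suggestion, not a definitive implementation: it pings a progress callback after a fixed number of written records or after a time interval, whichever comes first, so the task reports well inside the 600-second window. `ProgressSink`, `ProgressThrottle`, and `ProgressDemo` are hypothetical names introduced only for illustration. In a real Hadoop mapper the callback would presumably wrap `context.progress()` (new API) or `reporter.progress()` (old API); the stand-in interface is used here so the sketch runs without Hadoop on the classpath.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a Hadoop progress call such as context.progress(). */
interface ProgressSink {
    void progress();
}

/** Calls the sink every `everyRecords` writes or every `everyMillis` ms, whichever comes first. */
final class ProgressThrottle {
    private final ProgressSink sink;
    private final long everyRecords;
    private final long everyNanos;
    private long sinceLast = 0;
    private long lastPing = System.nanoTime();

    ProgressThrottle(ProgressSink sink, long everyRecords, long everyMillis) {
        this.sink = sink;
        this.everyRecords = everyRecords;
        this.everyNanos = TimeUnit.MILLISECONDS.toNanos(everyMillis);
    }

    /** Call once per output record, right after the record is written. */
    void recordWritten() {
        sinceLast++;
        long now = System.nanoTime();
        if (sinceLast >= everyRecords || now - lastPing >= everyNanos) {
            sink.progress();   // tell the framework the task is still alive
            sinceLast = 0;
            lastPing = now;
        }
    }
}

final class ProgressDemo {
    public static void main(String[] args) {
        long[] pings = {0};
        // Assumed thresholds: every 10,000 records or every 60 seconds.
        ProgressThrottle throttle = new ProgressThrottle(() -> pings[0]++, 10_000, 60_000);
        for (long i = 0; i < 25_000; i++) {
            // In a real mapper this is where the output record would be written.
            throttle.recordWritten();
        }
        System.out.println("progress calls: " + pings[0]);
    }
}
```

Throttling by record count keeps the per-record overhead to a counter increment and a clock read; the time-based check covers the case where individual records are slow to produce.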
-
Re: Writing large output kills job with timeout _ need ideas
Harsh J 2012-01-26, 16:22
An earlier reply at http://search-hadoop.com/m/e9dM3rw9IP1 may help you get over the idle task issue, if you're idle due to processing and not a real freeze.
On Thu, Jan 26, 2012 at 8:45 PM, Radim Kolar <[EMAIL PROTECTED]> wrote:
> Any bright ideas?
>
> call status update or Progress every 600 seconds or less
--
Harsh J
Customer Ops. Engineer, Cloudera