MapReduce, mail # user - Hadoop 2.2.0 MR tasks failing


Re: Hadoop 2.2.0 MR tasks failing
Robert Dyer 2013-11-02, 05:46
So does anyone have any ideas how to track this down?

Is it perhaps an exception somewhere in an output committer that is being
swallowed and not showing up in the logs?
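
One way to narrow this down, as a rough sketch only: scan the NodeManager log for containers that exited with a non-zero code, then read that container's own logs. The regular expression below assumes the DefaultContainerExecutor warning format that appears in the NodeManager excerpt quoted further down; the function names are made up for illustration. The first container of an application attempt (the ID ending in _000001) is usually the ApplicationMaster, which is where the job-level commit runs, so a failure there would be consistent with a commit-time problem.

```python
import re

# Assumed format, copied from the NodeManager warning quoted in this thread:
#   "... Exit code from container container_<ts>_<app>_<attempt>_<n> is : <code>"
EXIT_RE = re.compile(
    r"Exit code from container (container_\d+_\d+_\d+_\d+) is : (\d+)"
)


def failed_containers(lines):
    """Return (container_id, exit_code) pairs for every non-zero exit code."""
    failures = []
    for line in lines:
        match = EXIT_RE.search(line)
        if match and int(match.group(2)) != 0:
            failures.append((match.group(1), int(match.group(2))))
    return failures


def is_first_container(container_id):
    """True if this is the first container of its attempt (usually the AM)."""
    return container_id.endswith("_000001")
```

Calling failed_containers(open(path)) on a NodeManager log file, where path is wherever that log lives on the node, lists the containers worth inspecting first.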

On Tue, Oct 22, 2013 at 2:19 AM, Robert Dyer <[EMAIL PROTECTED]> wrote:

> The logs for the maps and reduces show nothing useful.  There are a ton of
> warnings about deprecated and final config values, but the task runs and
> seems to finish without error.  The only errors I've found in logs are the
> ones I posted above, which were in the NodeManager log files.
>
> Here's an example map log:
>
> 2013-10-21 23:14:57,241 INFO [main] org.apache.hadoop.mapred.MapTask: Map
> output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 2013-10-21 23:14:57,337 INFO [main] org.apache.hadoop.mapred.MapTask:
> (EQUATOR) 0 kvi 26214396(104857584)
> 2013-10-21 23:14:57,337 INFO [main] org.apache.hadoop.mapred.MapTask:
> mapreduce.task.io.sort.mb: 100
> 2013-10-21 23:14:57,337 INFO [main] org.apache.hadoop.mapred.MapTask: soft
> limit at 83886080
> 2013-10-21 23:14:57,337 INFO [main] org.apache.hadoop.mapred.MapTask:
> bufstart = 0; bufvoid = 104857600
> 2013-10-21 23:14:57,337 INFO [main] org.apache.hadoop.mapred.MapTask:
> kvstart = 26214396; length = 6553600
> 2013-10-21 23:14:57,392 INFO [main]
> org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded &
> initialized native-zlib library
> 2013-10-21 23:14:57,392 INFO [main]
> org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> [.deflate]
> 2013-10-21 23:15:08,610 INFO [main] org.apache.hadoop.mapred.MapTask:
> Starting flush of map output
> 2013-10-21 23:15:08,610 INFO [main] org.apache.hadoop.mapred.MapTask:
> Spilling map output
> 2013-10-21 23:15:08,611 INFO [main] org.apache.hadoop.mapred.MapTask:
> bufstart = 0; bufend = 204512; bufvoid = 104857600
> 2013-10-21 23:15:08,611 INFO [main] org.apache.hadoop.mapred.MapTask:
> kvstart = 26214396(104857584); kvend = 26182336(104729344); length = 32061/6553600
> 2013-10-21 23:15:08,722 INFO [main]
> org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy]
> 2013-10-21 23:15:08,856 INFO [main] org.apache.hadoop.mapred.MapTask:
> Finished spill 0
> 2013-10-21 23:15:08,859 INFO [main] org.apache.hadoop.mapred.Task:
> Task:attempt_1382415258498_0001_m_000014_0 is done. And is in the process
> of committing
> 2013-10-21 23:15:08,896 INFO [main] org.apache.hadoop.mapred.Task: Task
> 'attempt_1382415258498_0001_m_000014_0' done.
>
>
>
> On Tue, Oct 22, 2013 at 12:16 AM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
>
>> If you follow the links on the web-ui to the logs of the map/reduce
>> tasks, what do you see there?
>>
>> Arun
>>
>> On Oct 21, 2013, at 9:55 PM, Robert Dyer <[EMAIL PROTECTED]> wrote:
>>
>> I recently set up a 2.2.0 test cluster.  For some reason, all of my MR
>> jobs are failing.  The maps and reduces all run to completion without any
>> errors, yet the app is marked failed and there is no final output.  Any
>> ideas?
>>
>> Application Type: MAPREDUCE
>> State: FINISHED
>> FinalStatus: FAILED
>> Diagnostics: We crashed durring a commit
>>
>> I noticed this in the logs (but I'm not sure what to make of it):
>>
>> 2013-10-21 23:42:41,379 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 789 for container-id container_1382415258498_0002_01_000001: 250.4 MB of 2 GB physical memory used; 2.0 GB of 6 GB virtual memory used
>> 2013-10-21 23:42:41,743 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1382415258498_0002_01_000001 is : 255
>> 2013-10-21 23:42:41,744 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1382415258498_0002_01_000001 and exit code: 255
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>
>> 2013-10-21 23:42:41,746 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: