MapReduce >> mail # user >> Re: tasktracker keep recevied KillJobAction and then delete unknown job while using hive


Re: tasktracker keep recevied KillJobAction and then delete unknown job while using hive
How many NameNode handlers (dfs.namenode.handler.count) have you defined for your cluster?

- Alex

--
Alexander Lorenz
http://mapredit.blogspot.com
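
One way to answer the handler-count question is to read the property straight from the cluster's HDFS configuration file. The following is a minimal sketch, not part of the original thread: the helper name read_hadoop_property and the sample XML (including the value 10) are illustrative assumptions, and the real file location depends on the installation.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(conf_xml, name, default=None):
    """Return the value of a named <property> in Hadoop-style configuration XML.

    Hypothetical helper for illustration; returns `default` if the property
    is not set in the given XML text.
    """
    root = ET.fromstring(conf_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


# Assumed sample content of an hdfs-site.xml; the value is only an example.
SAMPLE_CONF = """
<configuration>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
  </property>
</configuration>
"""

handler_count = read_hadoop_property(SAMPLE_CONF, "dfs.namenode.handler.count")
print("dfs.namenode.handler.count =", handler_count)
```

To check a real cluster, the text of its own configuration file would be passed in place of SAMPLE_CONF. If the property is absent, the function returns the default argument, which means the cluster is using whatever built-in default that Hadoop version ships with.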

On Feb 1, 2012, at 12:25 PM, Xiaobin She wrote:

>
> Hi Alex,
>
> I'm using JRE 1.6.0_24
>
> with Hadoop 0.20.0
> Hive 0.80
>
> Thanks
>
>
> 2012/2/1 alo alt <[EMAIL PROTECTED]>
> Hi,
>
> + hdfs-user (bcc'd)
>
> Which JRE version do you use?
>
> - Alex
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
> On Feb 1, 2012, at 8:16 AM, Xiaobin She wrote:
>
>> Hi,
>>
>> I'm using Hive to do some log analysis, and I have encountered a problem.
>>
>> My cluster has 3 nodes: one for the NameNode/JobTracker and the other two for DataNode/TaskTracker.
>>
>> One of the TaskTrackers repeatedly receives KillJobAction and then deletes unknown jobs.
>>
>> The logs look like this:
>>
>> 2012-01-31 00:35:37,640 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0381
>> 2012-01-31 00:35:37,640 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0381 being deleted.
>> 2012-01-31 00:36:22,697 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0383
>> 2012-01-31 00:36:22,698 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0383 being deleted.
>> 2012-01-31 01:05:34,108 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0384
>> 2012-01-31 01:05:34,108 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0384 being deleted.
>> 2012-01-31 01:07:43,280 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0385
>> 2012-01-31 01:07:43,280 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0385 being deleted.
>>
>> This happens occasionally. When it does, this TaskTracker does nothing but keep receiving KillJobAction and deleting unknown jobs, so performance drops.
>>
>> To solve this problem, I have to restart the cluster, but obviously this is not a good solution.
>>
>> These jobs eventually run on the other TaskTracker, where they run well and succeed.
>>
>> Has anybody encountered this problem, and can you give me some advice?
>>
>> Occasionally there are also some error logs like:
>>
>> 2012-01-31 13:11:40,183 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 55837: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
>> java.io.IOException: Connection reset by peer
>>        at sun.nio.ch.FileDispatcher.read0(Native Method)
>>        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>>        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
>>        at sun.nio.ch.IOUtil.read(IOUtil.java:175)
>>        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
>>        at org.apache.hadoop.ipc.Server.channelRead(Server.java:1211)
>>        at org.apache.hadoop.ipc.Server.access$2300(Server.java:77)
>>        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:799)
>>        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:419)
>>        at org.apache.hadoop.ipc.Server$Listener.run(Server.java:328)
>> 2012-01-31 13:11:40,211 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201201311041_0071_r_-1096994286 exited. Number of tasks it ran: 0
>> 2012-01-31 13:11:40,214 INFO org.apache.hadoop.mapred.TaskTracker: Killing unknown JVM jvm_201201311041_0071_r_-386575334
>> 2012-01-31 13:11:40,221 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 55837: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
>> java.io.IOException: Connection reset by peer
>>        at sun.nio.ch.FileDispatcher.read0(Native Method)
>>        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>>        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
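
To see how often the KillJobAction / unknown-job pattern quoted above occurs, the TaskTracker log can be scanned for job IDs that appear in both kinds of message. This is a rough sketch under stated assumptions: the function name unknown_killed_jobs is hypothetical, and the regular expressions are based only on the log lines quoted in this thread, so other Hadoop versions may format these messages differently.

```python
import re

# Patterns derived from the log lines quoted in the thread (assumed format).
KILL_RE = re.compile(r"Received 'KillJobAction' for job: (job_\d+_\d+)")
UNKNOWN_RE = re.compile(r"Unknown job (job_\d+_\d+) being deleted")


def unknown_killed_jobs(lines):
    """Return sorted job IDs that were both killed and reported as unknown."""
    killed, unknown = set(), set()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            killed.add(match.group(1))
            continue
        match = UNKNOWN_RE.search(line)
        if match:
            unknown.add(match.group(1))
    return sorted(killed & unknown)


# Sample lines copied from the quoted TaskTracker log.
SAMPLE_LOG = [
    "2012-01-31 00:35:37,640 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0381",
    "2012-01-31 00:35:37,640 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0381 being deleted.",
    "2012-01-31 00:36:22,697 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0383",
    "2012-01-31 00:36:22,698 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0383 being deleted.",
]

affected = unknown_killed_jobs(SAMPLE_LOG)
print(len(affected), "affected jobs:", affected)
```

Run against a full log file (for example by passing an open file object as lines), the result gives a quick list of the jobs involved, which can then be compared with the JobTracker's view of where those jobs actually ran.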