|
Neil Guo
2012-12-19, 16:27
Nitin Pawar
2012-12-19, 17:03
Neil Guo
2012-12-19, 17:27
Nitin Pawar
2012-12-19, 17:47
Neil Guo
2012-12-19, 18:15
Nitin Pawar
2012-12-19, 18:58
Neil Guo
2012-12-20, 04:40
Neil Guo
2012-12-20, 07:21
Neil Guo
2012-12-20, 10:10
Mark Grover
2012-12-20, 17:29
Neil Guo
2012-12-21, 06:22
Neil Guo
2012-12-21, 07:53
|
-
hive jobs pending so longNeil Guo 2012-12-19, 16:27
hi,
My hive jobs became very slow from yesterday, it keeps about 5 minutes on pending, but it was only 40sec before. I didn't modify any configuration and there's no other jobs on hadoop cluster. My envirment, hadoop-0.20.203.0 hive-0.8.1 [neil@host logs@master]$ hive -e 'select count(*) from neiltest;' WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. Logging initialized using configuration in jar:file:/opt/hadoop/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties Hive history file=/tmp/neil/hive_job_log_neil_201212192350_640804332.txt Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapred.reduce.tasks=<number> Starting Job = job_201212191724_0012, Tracking URL http://localhost:50030/jobdetails.jsp?jobid=job_201212191724_0012 Kill Command = /opt/hadoop/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201212191724_0012 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2012-12-19 23:53:21,894 Stage-1 map = 0%, reduce = 0% 2012-12-19 23:53:27,940 Stage-1 map = 100%, reduce = 0% 2012-12-19 23:53:37,001 Stage-1 map = 100%, reduce = 33% 2012-12-19 23:53:40,026 Stage-1 map = 100%, reduce = 100% Ended Job = job_201212191724_0012 MapReduce Jobs Launched: Job 0: Map: 1 Reduce: 1 HDFS Read: 22131 HDFS Write: 4 SUCESS Total MapReduce CPU Time Spent: 0 msec OK 100 Time taken: 305.199 seconds When the job was running, the datanode log file got this, 2012-12-20 00:00:18,408 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-5694413837496700253_48905 2012-12-20 00:01:11,408 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_1151904922466051083_48776 2012-12-20 00:01:51,855 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 39205 blocks took 7578 msec to generate and 124 msecs for RPC and NN processing 2012-12-20 00:02:04,608 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-7641590199699424252_48775 2012-12-20 00:02:57,607 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-7818849592980221590_48773 2012-12-20 00:03:50,608 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_9074790179047774257_48905 2012-12-20 00:04:43,608 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-3172554843058932436_17003 2012-12-20 00:05:36,608 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8421435495571059078_48776 2012-12-20 00:06:29,808 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-4414438232543819004_48775 2012-12-20 00:07:22,808 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-6640428384205363780_48773 2012-12-20 00:08:15,808 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-5620587770589188815_48905 2012-12-20 00:09:08,808 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8043462665698224618_48776 2012-12-20 00:10:02,008 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8925335995905328229_16987 Does this slow dows the hive jobs or any other suggestion ? Thanks, Neil
-
Re: hive jobs pending so longNitin Pawar 2012-12-19, 17:03
Did you retire/remove few datanodes from your cluster in hurry ?
On Wed, Dec 19, 2012 at 9:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > hi, > > My hive jobs became very slow from yesterday, it keeps about 5 minutes on > pending, but it was only 40sec before. I didn't modify any configuration > and there's no other jobs on hadoop cluster. > > > My envirment, > hadoop-0.20.203.0 > hive-0.8.1 > > [neil@host logs@master]$ hive -e 'select count(*) from neiltest;' > > WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please > use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties > files. > Logging initialized using configuration in > jar:file:/opt/hadoop/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties > Hive history file=/tmp/neil/hive_job_log_neil_201212192350_640804332.txt > Total MapReduce jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks determined at compile time: 1 > In order to change the average load for a reducer (in bytes): > set hive.exec.reducers.bytes.per.reducer=<number> > In order to limit the maximum number of reducers: > set hive.exec.reducers.max=<number> > In order to set a constant number of reducers: > set mapred.reduce.tasks=<number> > Starting Job = job_201212191724_0012, Tracking URL > http://localhost:50030/jobdetails.jsp?jobid=job_201212191724_0012 > Kill Command = /opt/hadoop/hadoop/bin/../bin/hadoop job > -Dmapred.job.tracker=localhost:9001 -kill job_201212191724_0012 > Hadoop job information for Stage-1: number of mappers: 1; number of > reducers: 1 > 2012-12-19 23:53:21,894 Stage-1 map = 0%, reduce = 0% > 2012-12-19 23:53:27,940 Stage-1 map = 100%, reduce = 0% > 2012-12-19 23:53:37,001 Stage-1 map = 100%, reduce = 33% > 2012-12-19 23:53:40,026 Stage-1 map = 100%, reduce = 100% > Ended Job = job_201212191724_0012 > MapReduce Jobs Launched: > Job 0: Map: 1 Reduce: 1 HDFS Read: 22131 HDFS Write: 4 SUCESS > Total MapReduce CPU Time Spent: 0 msec > OK > 100 > Time taken: 305.199 seconds > > > When the job was running, the datanode log file got this, > > 2012-12-20 00:00:18,408 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-5694413837496700253_48905 > 2012-12-20 00:01:11,408 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_1151904922466051083_48776 > 2012-12-20 00:01:51,855 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 39205 > blocks took 7578 msec to generate and 124 msecs for RPC and NN processing > 2012-12-20 00:02:04,608 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-7641590199699424252_48775 > 2012-12-20 00:02:57,607 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-7818849592980221590_48773 > 2012-12-20 00:03:50,608 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_9074790179047774257_48905 > 2012-12-20 00:04:43,608 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-3172554843058932436_17003 > 2012-12-20 00:05:36,608 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_8421435495571059078_48776 > 2012-12-20 00:06:29,808 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-4414438232543819004_48775 > 2012-12-20 00:07:22,808 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-6640428384205363780_48773 > 2012-12-20 00:08:15,808 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-5620587770589188815_48905 > 2012-12-20 00:09:08,808 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_8043462665698224618_48776 > 2012-12-20 00:10:02,008 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification Nitin Pawar
-
Re: hive jobs pending so longNeil Guo 2012-12-19, 17:27
No, actually there's only one datanode.
On Thu, Dec 20, 2012 at 1:03 AM, Nitin Pawar <[EMAIL PROTECTED]>wrote: > Did you retire/remove few datanodes from your cluster in hurry ? > > > On Wed, Dec 19, 2012 at 9:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > >> hi, >> >> My hive jobs became very slow from yesterday, it keeps about 5 minutes >> on pending, but it was only 40sec before. I didn't modify any >> configuration and there's no other jobs on hadoop cluster. >> >> >> My envirment, >> hadoop-0.20.203.0 >> hive-0.8.1 >> >> [neil@host logs@master]$ hive -e 'select count(*) from neiltest;' >> >> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please >> use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties >> files. >> Logging initialized using configuration in >> jar:file:/opt/hadoop/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties >> Hive history file=/tmp/neil/hive_job_log_neil_201212192350_640804332.txt >> Total MapReduce jobs = 1 >> Launching Job 1 out of 1 >> Number of reduce tasks determined at compile time: 1 >> In order to change the average load for a reducer (in bytes): >> set hive.exec.reducers.bytes.per.reducer=<number> >> In order to limit the maximum number of reducers: >> set hive.exec.reducers.max=<number> >> In order to set a constant number of reducers: >> set mapred.reduce.tasks=<number> >> Starting Job = job_201212191724_0012, Tracking URL >> http://localhost:50030/jobdetails.jsp?jobid=job_201212191724_0012 >> Kill Command = /opt/hadoop/hadoop/bin/../bin/hadoop job >> -Dmapred.job.tracker=localhost:9001 -kill job_201212191724_0012 >> Hadoop job information for Stage-1: number of mappers: 1; number of >> reducers: 1 >> 2012-12-19 23:53:21,894 Stage-1 map = 0%, reduce = 0% >> 2012-12-19 23:53:27,940 Stage-1 map = 100%, reduce = 0% >> 2012-12-19 23:53:37,001 Stage-1 map = 100%, reduce = 33% >> 2012-12-19 23:53:40,026 Stage-1 map = 100%, reduce = 100% >> Ended Job = job_201212191724_0012 >> MapReduce Jobs Launched: >> Job 0: Map: 1 Reduce: 1 HDFS Read: 22131 HDFS Write: 4 SUCESS >> Total MapReduce CPU Time Spent: 0 msec >> OK >> 100 >> Time taken: 305.199 seconds >> >> >> When the job was running, the datanode log file got this, >> >> 2012-12-20 00:00:18,408 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-5694413837496700253_48905 >> 2012-12-20 00:01:11,408 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_1151904922466051083_48776 >> 2012-12-20 00:01:51,855 INFO >> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 39205 >> blocks took 7578 msec to generate and 124 msecs for RPC and NN processing >> 2012-12-20 00:02:04,608 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-7641590199699424252_48775 >> 2012-12-20 00:02:57,607 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-7818849592980221590_48773 >> 2012-12-20 00:03:50,608 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_9074790179047774257_48905 >> 2012-12-20 00:04:43,608 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-3172554843058932436_17003 >> 2012-12-20 00:05:36,608 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_8421435495571059078_48776 >> 2012-12-20 00:06:29,808 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-4414438232543819004_48775 >> 2012-12-20 00:07:22,808 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-6640428384205363780_48773 >> 2012-12-20 00:08:15,808 INFO >> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> succeeded for blk_-5620587770589188815_48905 >> 2012-12-20 00:09:08,808 INFO
-
Re: hive jobs pending so longNitin Pawar 2012-12-19, 17:47
i hope hadoop replication factor is set to 1
apart from this can you check how much disk space is empty on the datanode? how are the memory stats on JT and NN? On Wed, Dec 19, 2012 at 10:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > No, actually there's only one datanode. > > On Thu, Dec 20, 2012 at 1:03 AM, Nitin Pawar <[EMAIL PROTECTED]>wrote: > >> Did you retire/remove few datanodes from your cluster in hurry ? >> >> >> On Wed, Dec 19, 2012 at 9:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: >> >>> hi, >>> >>> My hive jobs became very slow from yesterday, it keeps about 5 minutes >>> on pending, but it was only 40sec before. I didn't modify any >>> configuration and there's no other jobs on hadoop cluster. >>> >>> >>> My envirment, >>> hadoop-0.20.203.0 >>> hive-0.8.1 >>> >>> [neil@host logs@master]$ hive -e 'select count(*) from neiltest;' >>> >>> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. >>> Please use org.apache.hadoop.log.metrics.EventCounter in all the >>> log4j.properties files. >>> Logging initialized using configuration in >>> jar:file:/opt/hadoop/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties >>> Hive history file=/tmp/neil/hive_job_log_neil_201212192350_640804332.txt >>> Total MapReduce jobs = 1 >>> Launching Job 1 out of 1 >>> Number of reduce tasks determined at compile time: 1 >>> In order to change the average load for a reducer (in bytes): >>> set hive.exec.reducers.bytes.per.reducer=<number> >>> In order to limit the maximum number of reducers: >>> set hive.exec.reducers.max=<number> >>> In order to set a constant number of reducers: >>> set mapred.reduce.tasks=<number> >>> Starting Job = job_201212191724_0012, Tracking URL >>> http://localhost:50030/jobdetails.jsp?jobid=job_201212191724_0012 >>> Kill Command = /opt/hadoop/hadoop/bin/../bin/hadoop job >>> -Dmapred.job.tracker=localhost:9001 -kill job_201212191724_0012 >>> Hadoop job information for Stage-1: number of mappers: 1; number of >>> reducers: 1 >>> 2012-12-19 23:53:21,894 Stage-1 map = 0%, reduce = 0% >>> 2012-12-19 23:53:27,940 Stage-1 map = 100%, reduce = 0% >>> 2012-12-19 23:53:37,001 Stage-1 map = 100%, reduce = 33% >>> 2012-12-19 23:53:40,026 Stage-1 map = 100%, reduce = 100% >>> Ended Job = job_201212191724_0012 >>> MapReduce Jobs Launched: >>> Job 0: Map: 1 Reduce: 1 HDFS Read: 22131 HDFS Write: 4 SUCESS >>> Total MapReduce CPU Time Spent: 0 msec >>> OK >>> 100 >>> Time taken: 305.199 seconds >>> >>> >>> When the job was running, the datanode log file got this, >>> >>> 2012-12-20 00:00:18,408 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_-5694413837496700253_48905 >>> 2012-12-20 00:01:11,408 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_1151904922466051083_48776 >>> 2012-12-20 00:01:51,855 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 39205 >>> blocks took 7578 msec to generate and 124 msecs for RPC and NN processing >>> 2012-12-20 00:02:04,608 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_-7641590199699424252_48775 >>> 2012-12-20 00:02:57,607 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_-7818849592980221590_48773 >>> 2012-12-20 00:03:50,608 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_9074790179047774257_48905 >>> 2012-12-20 00:04:43,608 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_-3172554843058932436_17003 >>> 2012-12-20 00:05:36,608 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_8421435495571059078_48776 >>> 2012-12-20 00:06:29,808 INFO >>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >>> succeeded for blk_-4414438232543819004_48775 >>> 2012-12-20 00:07:22,808 INFO Nitin Pawar
-
Re: hive jobs pending so longNeil Guo 2012-12-19, 18:15
Yes, hadoop replication factor is already set to 1
Space on the datenode, *11291 files and directories, 39205 blocks = 50496 total. Heap Size is 858.94 MB / 2.6 GB (32%) * Configured Capacity : 9.92 TB DFS Used : 2.12 TB Non DFS Used : 6.35 TB DFS Remaining : 1.45 TB DFS Used% : 21.39 % DFS Remaining% : 14.58 % Live Nodes : 1 Dead Nodes : 0 Decommissioning Nodes : 0 Number of Under-Replicated Blocks : 164 Result of jmap NN, Attaching to process ID 15949, please wait... Debugger attached successfully. Server compiler detected. JVM version is 19.0-b09 using thread-local object allocation. Parallel GC with 13 thread(s) Heap Configuration: MinHeapFreeRatio = 40 MaxHeapFreeRatio = 70 MaxHeapSize = 3145728000 (3000.0MB) NewSize = 1310720 (1.25MB) MaxNewSize = 17592186044415 MB OldSize = 5439488 (5.1875MB) NewRatio = 2 SurvivorRatio = 8 PermSize = 21757952 (20.75MB) MaxPermSize = 174063616 (166.0MB) Heap Usage: PS Young Generation Eden Space: capacity = 527171584 (502.75MB) used = 138223192 (131.81990814208984MB) free = 388948392 (370.93009185791016MB) 26.219772877591218% used >From Space: capacity = 21954560 (20.9375MB) used = 8190336 (7.8109130859375MB) free = 13764224 (13.1265869140625MB) 37.30585354477612% used To Space: capacity = 21954560 (20.9375MB) used = 0 (0.0MB) free = 21954560 (20.9375MB) 0.0% used PS Old Generation capacity = 351535104 (335.25MB) used = 33554448 (32.00001525878906MB) free = 317980656 (303.24998474121094MB) 9.545120136849832% used PS Perm Generation capacity = 21757952 (20.75MB) used = 20201176 (19.265342712402344MB) free = 1556776 (1.4846572875976562MB) 92.8450251200113% used Resutes of JT, Attaching to process ID 16342, please wait... Debugger attached successfully. Server compiler detected. JVM version is 19.0-b09 using thread-local object allocation. Parallel GC with 13 thread(s) Heap Configuration: MinHeapFreeRatio = 40 MaxHeapFreeRatio = 70 MaxHeapSize = 3145728000 (3000.0MB) NewSize = 1310720 (1.25MB) MaxNewSize = 17592186044415 MB OldSize = 5439488 (5.1875MB) NewRatio = 2 SurvivorRatio = 8 PermSize = 21757952 (20.75MB) MaxPermSize = 174063616 (166.0MB) Heap Usage: PS Young Generation Eden Space: capacity = 131792896 (125.6875MB) used = 72740936 (69.37116241455078MB) free = 59051960 (56.31633758544922MB) 55.1933664163507% used >From Space: capacity = 21954560 (20.9375MB) used = 4555392 (4.3443603515625MB) free = 17399168 (16.5931396484375MB) 20.749183768656717% used To Space: capacity = 21954560 (20.9375MB) used = 0 (0.0MB) free = 21954560 (20.9375MB) 0.0% used PS Old Generation capacity = 351535104 (335.25MB) used = 0 (0.0MB) free = 351535104 (335.25MB) 0.0% used PS Perm Generation capacity = 24313856 (23.1875MB) used = 24162376 (23.04303741455078MB) free = 151480 (0.14446258544921875MB) 99.37698076356132% used Not very familiar with Java, if I miss something, please just let me know. Thanks. On Thu, Dec 20, 2012 at 1:47 AM, Nitin Pawar <[EMAIL PROTECTED]>wrote: > i hope hadoop replication factor is set to 1 > apart from this can you check how much disk space is empty on the datanode? > > how are the memory stats on JT and NN? > > > > On Wed, Dec 19, 2012 at 10:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > >> No, actually there's only one datanode. >> >> On Thu, Dec 20, 2012 at 1:03 AM, Nitin Pawar <[EMAIL PROTECTED]>wrote: >> >>> Did you retire/remove few datanodes from your cluster in hurry ? >>> >>> >>> On Wed, Dec 19, 2012 at 9:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: >>> >>>> hi, >>>> >>>> My hive jobs became very slow from yesterday, it keeps about 5 minutes >>>> on pending, but it was only 40sec before. I didn't modify any
-
Re: hive jobs pending so longNitin Pawar 2012-12-19, 18:58
does not look anything odd in here
do you see any task failures in your job on the jobtracker UI ? On Wed, Dec 19, 2012 at 11:45 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > > Yes, hadoop replication factor is already set to 1 > > Space on the datenode, > > *11291 files and directories, 39205 blocks = 50496 total. Heap Size is > 858.94 MB / 2.6 GB (32%) > * > Configured Capacity : 9.92 TB DFS Used : 2.12 TB Non DFS Used : 6.35 TBDFS Remaining:1.45 TBDFS Used%:21.39 %DFS Remaining%:14.58 %Live Nodes > : 1 Dead Nodes : 0 Decommissioning Nodes : 0 Number of Under-Replicated > Blocks : 164 > Result of jmap NN, > > Attaching to process ID 15949, please wait... > Debugger attached successfully. > Server compiler detected. > JVM version is 19.0-b09 > > using thread-local object allocation. > Parallel GC with 13 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 3145728000 (3000.0MB) > NewSize = 1310720 (1.25MB) > MaxNewSize = 17592186044415 MB > OldSize = 5439488 (5.1875MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 21757952 (20.75MB) > MaxPermSize = 174063616 (166.0MB) > > Heap Usage: > PS Young Generation > Eden Space: > capacity = 527171584 (502.75MB) > used = 138223192 (131.81990814208984MB) > free = 388948392 (370.93009185791016MB) > 26.219772877591218% used > From Space: > capacity = 21954560 (20.9375MB) > used = 8190336 (7.8109130859375MB) > free = 13764224 (13.1265869140625MB) > 37.30585354477612% used > To Space: > capacity = 21954560 (20.9375MB) > used = 0 (0.0MB) > free = 21954560 (20.9375MB) > 0.0% used > PS Old Generation > capacity = 351535104 (335.25MB) > used = 33554448 (32.00001525878906MB) > free = 317980656 (303.24998474121094MB) > 9.545120136849832% used > PS Perm Generation > capacity = 21757952 (20.75MB) > used = 20201176 (19.265342712402344MB) > free = 1556776 (1.4846572875976562MB) > 92.8450251200113% used > > > Resutes of JT, > > Attaching to process ID 16342, please wait... > Debugger attached successfully. > Server compiler detected. > JVM version is 19.0-b09 > > using thread-local object allocation. > Parallel GC with 13 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 3145728000 (3000.0MB) > NewSize = 1310720 (1.25MB) > MaxNewSize = 17592186044415 MB > OldSize = 5439488 (5.1875MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 21757952 (20.75MB) > MaxPermSize = 174063616 (166.0MB) > > Heap Usage: > PS Young Generation > Eden Space: > capacity = 131792896 (125.6875MB) > used = 72740936 (69.37116241455078MB) > free = 59051960 (56.31633758544922MB) > 55.1933664163507% used > From Space: > capacity = 21954560 (20.9375MB) > used = 4555392 (4.3443603515625MB) > free = 17399168 (16.5931396484375MB) > 20.749183768656717% used > To Space: > capacity = 21954560 (20.9375MB) > used = 0 (0.0MB) > free = 21954560 (20.9375MB) > 0.0% used > PS Old Generation > capacity = 351535104 (335.25MB) > used = 0 (0.0MB) > free = 351535104 (335.25MB) > 0.0% used > PS Perm Generation > capacity = 24313856 (23.1875MB) > used = 24162376 (23.04303741455078MB) > free = 151480 (0.14446258544921875MB) > 99.37698076356132% used > > > Not very familiar with Java, if I miss something, please just let me know. > > Thanks. > > > > > > > > On Thu, Dec 20, 2012 at 1:47 AM, Nitin Pawar <[EMAIL PROTECTED]>wrote: > >> i hope hadoop replication factor is set to 1 >> apart from this can you check how much disk space is empty on the >> datanode? >> >> how are the memory stats on JT and NN? >> >> >> >> On Wed, Dec 19, 2012 at 10:57 PM, Neil Guo <[EMAIL PROTECTED]> wrote: >> >>> No, actually there's only one datanode. Nitin Pawar
-
Re: hive jobs pending so longNeil Guo 2012-12-20, 04:40
No, there's no failure in jobtracker ui, seems that all of them are success.
Attachent is hadoop log files. Thanks.
-
Re: hive jobs pending so longNeil Guo 2012-12-20, 07:21
another log,
cd $HADOOP_LOG_DIR && tail -n 0 -f *.log |tee log_merge.log ==> hadoop-neil-datanode-Ct-bj108.log <= ==> hadoop-neil-jobtracker-Ct-bj108.log <= ==> hadoop-neil-namenode-Ct-bj108.log <= ==> hadoop-neil-secondarynamenode-Ct-bj108.log <= ==> hadoop-neil-tasktracker-Ct-bj108.log <= ==> hadoop-neil-datanode-Ct-bj108.log <=2012-12-20 15:10:10,425 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_7549818753225420678_5717 ==> hadoop-neil-namenode-Ct-bj108.log <=2012-12-20 15:10:15,265 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 1 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0 ==> hadoop-neil-datanode-Ct-bj108.log <=2012-12-20 15:10:16,415 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-6072279734308701530_134821 src: /127.0.0.1:34080 dest: /127.0.0.1:50010 ==> hadoop-neil-namenode-Ct-bj108.log <=2012-12-20 15:10:16,393 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /tmp/hive-neil/hive_2012-12-20_15-10-12_982_9050806798858515370/-mr-10003/418d928d-6cd4-485b-99b6-0355e9a20fd7. blk_-6072279734308701530_134821 ==> hadoop-neil-datanode-Ct-bj108.log <=2012-12-20 15:10:16,564 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: / 127.0.0.1:34080, dest: /127.0.0.1:50010, bytes: 57401, op: HDFS_WRITE, cliID: DFSClient_-1778955690, offset: 0, srvID: DS-1688600142-127.0.0.1-50010-1318579377533, blockid: blk_-6072279734308701530_134821, duration: 84788000 2012-12-20 15:10:16,565 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-6072279734308701530_134821 terminating 2012-12-20 15:10:16,911 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5272124058034081804_134822 src: /127.0.0.1:34082 dest: /127.0.0.1:50010 2012-12-20 15:10:17,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: / 127.0.0.1:34082, dest: /127.0.0.1:50010, bytes: 3915, op: HDFS_WRITE, cliID: DFSClient_-1778955690, offset: 0, srvID: DS-1688600142-127.0.0.1-50010-1318579377533, blockid: blk_5272124058034081804_134822, duration: 56734000 2012-12-20 15:10:17,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5272124058034081804_134822 terminating 2012-12-20 15:10:17,122 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-5406425112482671769_134823 src: /127.0.0.1:34083 dest: /127.0.0.1:50010 2012-12-20 15:10:17,313 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: / 127.0.0.1:34083, dest: /127.0.0.1:50010, bytes: 449818, op: HDFS_WRITE, cliID: DFSClient_-1778955690, offset: 0, srvID: DS-1688600142-127.0.0.1-50010-1318579377533, blockid: blk_-5406425112482671769_134823, duration: 152658000 2012-12-20 15:10:17,313 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-5406425112482671769_134823 terminating 2012-12-20 15:10:17,474 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-6418269071912761245_134824 src: /127.0.0.1:34084 dest: /127.0.0.1:50010 ==> hadoop-neil-namenode-Ct-bj108.log <=2012-12-20 15:10:16,565 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-6072279734308701530_134821 size 57401 2012-12-20 15:10:16,567 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /tmp/hive-neil/hive_2012-12-20_15-10-12_982_9050806798858515370/-mr-10003/418d928d-6cd4-485b-99b6-0355e9a20fd7 is closed by DFSClient_-1778955690 2012-12-20 15:10:16,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Increasing replication for file /tmp/hive-neil/hive_2012-12-20_15-10-12_982_9050806798858515370/-mr-10003/418d928d-6cd4-485b-99b6-0355e9a20fd7. New replication is 10 2012-12-20 15:10:16,910 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/libjars/hive-builtins-0.8.1.jar. blk_5272124058034081804_134822 2012-12-20 15:10:17,003 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_5272124058034081804_134822 size 3915 2012-12-20 15:10:17,005 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/libjars/hive-builtins-0.8.1.jar is closed by DFSClient_-1778955690 2012-12-20 15:10:17,043 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Increasing replication for file /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/libjars/hive-builtins-0.8.1.jar. New replication is 10 2012-12-20 15:10:17,121 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/libjars/protobuf-java-2.4.0a.jar. blk_-5406425112482671769_134823 2012-12-20 15:10:17,313 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-5406425112482671769_134823 size 449818 2012-12-20 15:10:17,315 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/libjars/protobuf-java-2.4.0a.jar is closed by DFSClient_-1778955690 2012-12-20 15:10:17,355 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Increasing replication for file /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/libjars/protobuf-java-2.4.0a.jar. New replication is 10 2012-12-20 15:10:17,472 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /data/disk/hadoop/mapred/staging/neil/.staging/job_201212201444_0001/job.jar. blk_-6418269071912761245_134824 ==> hadoop-neil-datanode-Ct-bj108.log <=2012-12-20 15:10:17,801 INFO org.apac
-
Re: hive jobs pending so longNeil Guo 2012-12-20, 10:10
Hive log, seems there's too many connection to jobtracker??
2012-12-20 17:35:14,615 INFO exec.Task (SessionState.java:printInfo(371)) - Kill Command = /opt/hadoop/hadoop-0.20.203.0/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201212201721_0003 2012-12-20 17:35:14,616 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #80 2012-12-20 17:35:14,617 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #80 2012-12-20 17:35:14,617 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 1 2012-12-20 17:35:15,620 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #81 2012-12-20 17:35:15,621 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #81 2012-12-20 17:35:15,622 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:15,622 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #82 2012-12-20 17:35:15,623 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #82 2012-12-20 17:35:15,624 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:16,626 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #83 2012-12-20 17:35:16,627 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #83 2012-12-20 17:35:16,628 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 1 2012-12-20 17:35:16,628 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #84 2012-12-20 17:35:16,629 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #84 2012-12-20 17:35:16,630 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:17,632 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #85 2012-12-20 17:35:17,633 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #85 2012-12-20 17:35:17,634 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:17,634 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #86 2012-12-20 17:35:17,635 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #86 2012-12-20 17:35:17,636 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:18,638 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #87 2012-12-20 17:35:18,639 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #87 2012-12-20 17:35:18,639 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 1 2012-12-20 17:35:18,640 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #88 2012-12-20 17:35:18,641 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #88 2012-12-20 17:35:18,642 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:19,644 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #89 2012-12-20 17:35:19,645 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #89 2012-12-20 17:35:19,645 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 1 2012-12-20 17:35:19,646 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #90 2012-12-20 17:35:19,647 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #90 2012-12-20 17:35:19,648 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:20,650 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #91 2012-12-20 17:35:20,651 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #91 2012-12-20 17:35:20,652 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 2 2012-12-20 17:35:20,652 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #92 2012-12-20 17:35:20,653 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #92 2012-12-20 17:35:20,653 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 1 2012-12-20 17:35:21,656 DEBUG ipc.Client (Client.java:sendParam(735)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user sending #93 2012-12-20 17:35:21,657 DEBUG ipc.Client (Client.java:receiveResponse(770)) - IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop_user got value #93 2012-12-20 17:35:21,657 DEBUG ipc.RPC (RPC.java:invoke(228)) - Call: getJobStatus 1 2012-12-20 17:35:21,658 DEBUG
-
Re: hive jobs pending so longMark Grover 2012-12-20, 17:29
Neil,
It might be a transient problem with your single-node cluster. Was something else running on the machine when you ran the long running job. Can you monitor the usage of the machine when the hive query is running. Have you tried rebooting the machine? Does the query remain slow even after reboot? Mark On Wed, Dec 19, 2012 at 11:21 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > another log, > > cd $HADOOP_LOG_DIR && tail -n 0 -f *.log |tee log_merge.log > > > ==> hadoop-neil-datanode-Ct-bj108.log <=> > ==> hadoop-neil-jobtracker-Ct-bj108.log <=> > ==> hadoop-neil-namenode-Ct-bj108.log <=> > ==> hadoop-neil-secondarynamenode-Ct-bj108.log <=> > ==> hadoop-neil-tasktracker-Ct-bj108.log <=> > ==> hadoop-neil-datanode-Ct-bj108.log <=> 2012-12-20 15:10:10,425 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_7549818753225420678_5717 > > ==> hadoop-neil-namenode-Ct-bj108.log <=> 2012-12-20 15:10:15,265 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: > 1 Total time for transactions(ms): 0Number of transactions batched in Syncs: > 0 Number of syncs: 0 SyncTimes(ms): 0 > > ==> hadoop-neil-datanode-Ct-bj108.log <=> 2012-12-20 15:10:16,415 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-6072279734308701530_134821 src: /127.0.0.1:34080 dest: /127.0.0.1:50010 > > ==> hadoop-neil-namenode-Ct-bj108.log <=> 2012-12-20 15:10:16,393 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.allocateBlock: > /tmp/hive-neil/hive_2012-12-20_15-10-12_982_9050806798858515370/-mr-10003/418d928d-6cd4-485b-99b6-0355e9a20fd7. > blk_-6072279734308701530_134821 > > ==> hadoop-neil-datanode-Ct-bj108.log <=> 2012-12-20 15:10:16,564 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /127.0.0.1:34080, dest: /127.0.0.1:50010, bytes: 57401, op: HDFS_WRITE, > cliID: DFSClient_-1778955690, offset: 0, srvID: > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: > blk_-6072279734308701530_134821, duration: 84788000 > 2012-12-20 15:10:16,565 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block > blk_-6072279734308701530_134821 terminating > 2012-12-20 15:10:16,911 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_5272124058034081804_134822 src: /127.0.0.1:34082 dest: /127.0.0.1:50010 > 2012-12-20 15:10:17,003 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /127.0.0.1:34082, dest: /127.0.0.1:50010, bytes: 3915, op: HDFS_WRITE, > cliID: DFSClient_-1778955690, offset: 0, srvID: > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: > blk_5272124058034081804_134822, duration: 56734000 > 2012-12-20 15:10:17,003 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block > blk_5272124058034081804_134822 terminating > 2012-12-20 15:10:17,122 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-5406425112482671769_134823 src: /127.0.0.1:34083 dest: /127.0.0.1:50010 > 2012-12-20 15:10:17,313 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /127.0.0.1:34083, dest: /127.0.0.1:50010, bytes: 449818, op: HDFS_WRITE, > cliID: DFSClient_-1778955690, offset: 0, srvID: > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: > blk_-5406425112482671769_134823, duration: 152658000 > 2012-12-20 15:10:17,313 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block > blk_-5406425112482671769_134823 terminating > 2012-12-20 15:10:17,474 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-6418269071912761245_134824 src: /127.0.0.1:34084 dest: /127.0.0.1:50010 > > ==> hadoop-neil-namenode-Ct-bj108.log <=> 2012-12-20 15:10:16,565 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to > blk_-6072279734308701530_134821 size 57401 > 2012-12-20 15:10:16,567 INFO org.apache.hadoop.hdfs.StateChange: DIR*
-
Re: hive jobs pending so longNeil Guo 2012-12-21, 06:22
Hi Mark,
I've reboot the server today, but the problem is still there. The hive job is the only job on the machine. :( On Fri, Dec 21, 2012 at 1:29 AM, Mark Grover <[EMAIL PROTECTED]>wrote: > Neil, > It might be a transient problem with your single-node cluster. Was > something else running on the machine when you ran the long running > job. Can you monitor the usage of the machine when the hive query is > running. Have you tried rebooting the machine? Does the query remain > slow even after reboot? > > Mark > > On Wed, Dec 19, 2012 at 11:21 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > > another log, > > > > cd $HADOOP_LOG_DIR && tail -n 0 -f *.log |tee log_merge.log > > > > > > ==> hadoop-neil-datanode-Ct-bj108.log <=> > > > ==> hadoop-neil-jobtracker-Ct-bj108.log <=> > > > ==> hadoop-neil-namenode-Ct-bj108.log <=> > > > ==> hadoop-neil-secondarynamenode-Ct-bj108.log <=> > > > ==> hadoop-neil-tasktracker-Ct-bj108.log <=> > > > ==> hadoop-neil-datanode-Ct-bj108.log <=> > 2012-12-20 15:10:10,425 INFO > > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > > succeeded for blk_7549818753225420678_5717 > > > > ==> hadoop-neil-namenode-Ct-bj108.log <=> > 2012-12-20 15:10:15,265 INFO > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of > transactions: > > 1 Total time for transactions(ms): 0Number of transactions batched in > Syncs: > > 0 Number of syncs: 0 SyncTimes(ms): 0 > > > > ==> hadoop-neil-datanode-Ct-bj108.log <=> > 2012-12-20 15:10:16,415 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > > blk_-6072279734308701530_134821 src: /127.0.0.1:34080 dest: / > 127.0.0.1:50010 > > > > ==> hadoop-neil-namenode-Ct-bj108.log <=> > 2012-12-20 15:10:16,393 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > > NameSystem.allocateBlock: > > > /tmp/hive-neil/hive_2012-12-20_15-10-12_982_9050806798858515370/-mr-10003/418d928d-6cd4-485b-99b6-0355e9a20fd7. > > blk_-6072279734308701530_134821 > > > > ==> hadoop-neil-datanode-Ct-bj108.log <=> > 2012-12-20 15:10:16,564 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > > /127.0.0.1:34080, dest: /127.0.0.1:50010, bytes: 57401, op: HDFS_WRITE, > > cliID: DFSClient_-1778955690, offset: 0, srvID: > > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: > > blk_-6072279734308701530_134821, duration: 84788000 > > 2012-12-20 15:10:16,565 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > > blk_-6072279734308701530_134821 terminating > > 2012-12-20 15:10:16,911 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > > blk_5272124058034081804_134822 src: /127.0.0.1:34082 dest: / > 127.0.0.1:50010 > > 2012-12-20 15:10:17,003 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > > /127.0.0.1:34082, dest: /127.0.0.1:50010, bytes: 3915, op: HDFS_WRITE, > > cliID: DFSClient_-1778955690, offset: 0, srvID: > > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: > > blk_5272124058034081804_134822, duration: 56734000 > > 2012-12-20 15:10:17,003 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > > blk_5272124058034081804_134822 terminating > > 2012-12-20 15:10:17,122 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > > blk_-5406425112482671769_134823 src: /127.0.0.1:34083 dest: / > 127.0.0.1:50010 > > 2012-12-20 15:10:17,313 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > > /127.0.0.1:34083, dest: /127.0.0.1:50010, bytes: 449818, op: HDFS_WRITE, > > cliID: DFSClient_-1778955690, offset: 0, srvID: > > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: > > blk_-5406425112482671769_134823, duration: 152658000 > > 2012-12-20 15:10:17,313 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > > blk_-5406425112482671769_134823 terminating > > 2012-12-20 15:10:17,474 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
-
Re: hive jobs pending so longNeil Guo 2012-12-21, 07:53
Well, the hadoop.tmp.dir was on a remote disk before, access by nfs. When I
change it to local disk, the problem is gone. But still don't know why it works before Dec 19. On Fri, Dec 21, 2012 at 2:22 PM, Neil Guo <[EMAIL PROTECTED]> wrote: > Hi Mark, > > I've reboot the server today, but the problem is still there. The hive > job is the only job on the machine. > > :( > > > On Fri, Dec 21, 2012 at 1:29 AM, Mark Grover <[EMAIL PROTECTED]>wrote: > >> Neil, >> It might be a transient problem with your single-node cluster. Was >> something else running on the machine when you ran the long running >> job. Can you monitor the usage of the machine when the hive query is >> running. Have you tried rebooting the machine? Does the query remain >> slow even after reboot? >> >> Mark >> >> On Wed, Dec 19, 2012 at 11:21 PM, Neil Guo <[EMAIL PROTECTED]> wrote: >> > another log, >> > >> > cd $HADOOP_LOG_DIR && tail -n 0 -f *.log |tee log_merge.log >> > >> > >> > ==> hadoop-neil-datanode-Ct-bj108.log <=>> > >> > ==> hadoop-neil-jobtracker-Ct-bj108.log <=>> > >> > ==> hadoop-neil-namenode-Ct-bj108.log <=>> > >> > ==> hadoop-neil-secondarynamenode-Ct-bj108.log <=>> > >> > ==> hadoop-neil-tasktracker-Ct-bj108.log <=>> > >> > ==> hadoop-neil-datanode-Ct-bj108.log <=>> > 2012-12-20 15:10:10,425 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification >> > succeeded for blk_7549818753225420678_5717 >> > >> > ==> hadoop-neil-namenode-Ct-bj108.log <=>> > 2012-12-20 15:10:15,265 INFO >> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of >> transactions: >> > 1 Total time for transactions(ms): 0Number of transactions batched in >> Syncs: >> > 0 Number of syncs: 0 SyncTimes(ms): 0 >> > >> > ==> hadoop-neil-datanode-Ct-bj108.log <=>> > 2012-12-20 15:10:16,415 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block >> > blk_-6072279734308701530_134821 src: /127.0.0.1:34080 dest: / >> 127.0.0.1:50010 >> > >> > ==> hadoop-neil-namenode-Ct-bj108.log <=>> > 2012-12-20 15:10:16,393 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >> > NameSystem.allocateBlock: >> > >> /tmp/hive-neil/hive_2012-12-20_15-10-12_982_9050806798858515370/-mr-10003/418d928d-6cd4-485b-99b6-0355e9a20fd7. >> > blk_-6072279734308701530_134821 >> > >> > ==> hadoop-neil-datanode-Ct-bj108.log <=>> > 2012-12-20 15:10:16,564 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: >> > /127.0.0.1:34080, dest: /127.0.0.1:50010, bytes: 57401, op: HDFS_WRITE, >> > cliID: DFSClient_-1778955690, offset: 0, srvID: >> > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: >> > blk_-6072279734308701530_134821, duration: 84788000 >> > 2012-12-20 15:10:16,565 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for >> block >> > blk_-6072279734308701530_134821 terminating >> > 2012-12-20 15:10:16,911 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block >> > blk_5272124058034081804_134822 src: /127.0.0.1:34082 dest: / >> 127.0.0.1:50010 >> > 2012-12-20 15:10:17,003 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: >> > /127.0.0.1:34082, dest: /127.0.0.1:50010, bytes: 3915, op: HDFS_WRITE, >> > cliID: DFSClient_-1778955690, offset: 0, srvID: >> > DS-1688600142-127.0.0.1-50010-1318579377533, blockid: >> > blk_5272124058034081804_134822, duration: 56734000 >> > 2012-12-20 15:10:17,003 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for >> block >> > blk_5272124058034081804_134822 terminating >> > 2012-12-20 15:10:17,122 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block >> > blk_-5406425112482671769_134823 src: /127.0.0.1:34083 dest: / >> 127.0.0.1:50010 >> > 2012-12-20 15:10:17,313 INFO >> > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: >> > /127.0.0.1:34083, dest: /127.0.0.1:50010, bytes: 449818, op: >> HDFS_WRITE, >> > cliID: DFSClient_-1778955690, offset: 0, srvID: |