Hive, mail # user - Hive Query


RE: Hive Query
yogesh.kumar13@... 2012-07-24, 10:17
Thanks Bejoy :-),

You are correct.

I ran two commands:

1) select site_id from site where id = 2;

It returns the result; the details are:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201207241536_0003, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201207241536_0003
Kill Command = /HADOOP/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201207241536_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2012-07-24 15:42:09,251 Stage-1 map = 0%,  reduce = 0%
2012-07-24 15:42:12,332 Stage-1 map = 100%,  reduce = 0%
2012-07-24 15:42:15,391 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201207241536_0003
MapReduce Jobs Launched:
Job 0: Map: 1   HDFS Read: 24 HDFS Write: 3 SUCESS
Total MapReduce CPU Time Spent: 0 msec
OK
12
Time taken: 17.696 seconds

When there is no reduce operator it works fine, but in the second case:
2) select count(*) from site;

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201207241536_0002, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201207241536_0002
Kill Command = /HADOOP/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201207241536_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2012-07-24 15:37:53,818 Stage-1 map = 0%,  reduce = 0%
2012-07-24 15:38:00,176 Stage-1 map = 100%,  reduce = 0%
2012-07-24 15:39:01,068 Stage-1 map = 100%,  reduce = 0%
2012-07-24 15:39:03,307 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201207241536_0002 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201207241536_0002_m_000002 (and more) from job job_201207241536_0002
Exception in thread "Thread-40" java.lang.RuntimeException: Error while reading from task log url
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.IOException: Server returned HTTP response code: 407 for URL: http://10.203.33.81:50060/tasklog?taskid=attempt_201207241536_0002_r_000000_0&start=-8193
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at java.net.URL.openStream(URL.java:1010)
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
    ... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   HDFS Read: 24 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

When there is a reduce operator, the job fails.
I have restarted the cluster and the TaskTracker, but I am not sure what to do with mapred.reduce.parallel.copies.
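From what I gather, the property can be overridden per session from the Hive CLI before re-running the query; a minimal sketch with a placeholder value (I do not know what number would actually make sense here) would be:

-- sketch only: 10 is just a placeholder value for this session, not a recommendation
set mapred.reduce.parallel.copies=10;
select count(*) from site;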

Please suggest.

Thanks & Regards
Yogesh Kumar

________________________________
From: Bejoy Ks [[EMAIL PROTECTED]]
Sent: Tuesday, July 24, 2012 3:10 PM
To: [EMAIL PROTECTED]
Subject: Re: Hive Query

Hi Yogesh

I'm not exactly sure of the real root cause of the error.
From the error log and the nature of the occurrence, I suspect it happens when the reduce task is unable to reach the map task node and fetch the map output, something close to fetch failures. Can you try the following and see whether it makes a difference (a sample mapred-site.xml snippet is sketched below)?
1. Increase the value of tasktracker.http.threads (this needs to be set at the TaskTracker level, not per job, and requires a TaskTracker restart)
2. Adjust mapred.reduce.parallel.copies
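For example, something along these lines in mapred-site.xml on the TaskTracker nodes could be a starting point; the values below are only illustrative, not tuned recommendations:

<!-- sketch only: illustrative values, adjust for your cluster -->
<property>
  <name>tasktracker.http.threads</name>
  <value>80</value>
</property>
<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>10</value>
</property>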
I just tested the query in my local environment; it works fine and returns the desired output. It looks like the root cause at your end is some Hadoop misconfiguration, since issues like this are mostly with MapReduce jobs.

Regards
Bejoy KS
________________________________
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, July 24, 2012 2:56 PM
Subject: RE: Hive Query

Thanks Bejoy :-)

I am getting an error with

select count(*) from table;

It throws this error:

2012-07-24 13:39:25,181 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201207231123_0011 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201207231123_0011_m_000002 (and more) from job job_201207231123_0011
Exception in thread "Thread-93" java.lang.RuntimeException: Error while reading from task log url
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.IOException: Server returned HTTP response code: 407 for URL: http://10.203.33.81:50060/tasklog?taskid=attempt_201207231123_0011_r_000000_0&start=-8193
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at java.net.URL.openStream(URL.java:1010)
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
    ... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   HDFS Read: 24 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

and I run the query

SELECT count(*),sub.name FROM (Select * FROM sitealias JOIN site O