-
Hive job fails on hive client even though all map-red stages finish but succeeds on hive server
Anurag Tangri 2012-08-11, 03:37
Hi, We are facing an issue when running a Hive job over a large input (~6 TB).
We run it from the Hive client, and the Hive metastore server is on another machine. With smaller input the job succeeds, but with the input size above it fails with this error:
2012-08-11 01:34:01,722 Stage-1 map = 100%, reduce = 100%
2012-08-11 01:35:02,195 Stage-1 map = 100%, reduce = 100%
2012-08-11 01:36:02,682 Stage-1 map = 100%, reduce = 100%
2012-08-11 01:37:03,215 Stage-1 map = 100%, reduce = 100%
2012-08-11 01:38:03,719 Stage-1 map = 100%, reduce = 100%
2012-08-11 01:39:04,311 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201207072204_34432
Loading data to table default.atangri_test_1
Failed with exception Unable to fetch table atangri_test_1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
With smaller input (~2 TB) the job succeeds. We have set hive.metastore.client.socket.timeout to a large value (86400), but the job still fails after about 8-9 hours.
Has anyone faced the same issue, or does anyone have any pointers?
The job succeeds if it is run directly on the Hive server.
Thanks, Anurag Tangri
-
Re: Hive job fails on hive client even though all map-red stages finish but succeeds on hive server
Vinod Singh 2012-08-11, 07:39
We run Hive jobs on 20+ TB data without any issues.
Thanks, Vinod
-
Re: Hive job fails on hive client even though all map-red stages finish but succeeds on hive server
Jagat Singh 2012-08-11, 07:49
Hi Anurag,
How much space is available in the /user and /tmp directories on the client?
Did you check that? Is there anything that might stop the move task from finishing?
-
Re: Hive job fails on hive client even though all map-red stages finish but succeeds on hive server
Anurag Tangri 2012-08-11, 14:19
I see an exception like:
Moving data to: hdfs://../hive/atangri_test_1
FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection timed out
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
There is enough space in /user and /tmp.
Thanks, Anurag Tangri
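The `Connection timed out` in the trace above is raised by the Thrift transport that the client uses to talk to the metastore. As a first check, it can help to confirm that the client machine can open a TCP connection to the metastore at all. The following is a minimal sketch under stated assumptions, not something taken from the thread: the helper name `can_connect`, the host name, and port 9083 (a commonly used metastore port) are illustrative and should be replaced with the values from `hive.metastore.uris`.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical usage; host and port are assumptions, not values from the thread:
# print(can_connect("metastore.example.com", 9083))
```

This only shows whether a new connection can be opened. It does not reproduce a connection that sits idle for hours while the map-reduce stages run and is then reused for the final metadata call.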
-
Re: Hive job fails on hive client even though all map-red stages finish but succeeds on hive server
Mapred Learn 2012-08-11, 15:20
Hi Vinod, do you use a remote server configuration?
What version of Hive do you use? We are using 0.7.1, where we see this issue.