Hive >> mail # user >> Skew join failure


David Morel 2012-11-30, 10:10
Re: Skew join failure
Hi David,
It seems like Hive is unable to find the skewed keys on HDFS.
Did you set the hive.skewjoin.key property? If so, to what value?
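
For reference, a minimal sketch of how the settings in question could be inspected and set from the Hive CLI. In the CLI, `SET <name>;` with no value prints the current setting. The threshold shown is the documented default for hive.skewjoin.key, not a value taken from this thread; the right value for this workload is an assumption to tune.

```sql
-- Print the current values of the skew join settings.
SET hive.optimize.skewjoin;
SET hive.skewjoin.key;

-- Enable skew join handling and set the threshold explicitly.
-- 100000 is the documented default; treat it as a starting point only.
SET hive.optimize.skewjoin=true;
SET hive.skewjoin.key=100000;
```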

Mark

On Fri, Nov 30, 2012 at 2:10 AM, David Morel <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am trying to solve the "last reducer hangs because of GC because of
> truckloads of data" issue that I have on some queries, by using SET
> hive.optimize.skewjoin=true; Unfortunately, every time I try this, I
> encounter an error of the form:
> ...
> 2012-11-30 10:42:39,181 Stage-10 map = 100%,  reduce = 100%, Cumulative CPU 406984.1 sec
> MapReduce Total cumulative CPU time: 4 days 17 hours 3 minutes 4 seconds 100 msec
> Ended Job = job_201211281801_0463
> java.io.FileNotFoundException: File hdfs://nameservice1/tmp/hive-dmorel/hive_2012-11-30_09-23-00_375_8178040921995939301/-mr-10014/hive_skew_join_bigkeys_0 does not exist.
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:365)
>         at org.apache.hadoop.hive.ql.plan.ConditionalResolverSkewJoin.getTasks(ConditionalResolverSkewJoin.java:96)
>         at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
> ...
>
> Googling didn't give me any indication on how to debug/solve this, so I'd
> be glad if I could get any indication where to start looking.
>
> I'm using CMF4.0 currently, so Hive 0.8.1.
>
> Thanks a lot!
>
> David Morel
>
David Morel 2012-12-03, 20:25
Mark Grover 2012-12-04, 06:01