Map join small table filesize not working on Hive 0.12 with Hadoop 2.2.0
Hi,

I have a development cluster with Hive 0.12 running on top of
Hadoop 2.2.0. I have set the following property values in the .hiverc
file, and also when running queries from the Hive command line:

 set hive.exec.mode.local.auto=true;
 set hive.auto.convert.join=true;
 set hive.mapjoin.smalltable.filesize=25000000;

From what I see, Hive is not taking the value of the
hive.mapjoin.smalltable.filesize property into account. Even when I set it
to a very small number such as 3, Hive still converts the join to a local
map join, and the query fails due to memory exhaustion with the following error:

2013-11-24 09:45:15,098 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(323)) - Hive Runtime Error: Map local work exhausted memory
org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2013-11-24 09:45:14    Processing rows:        1000000 Hashtable size: 999999  Memory usage:   1031866880      percentage:     0.968
         at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
         at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:249)
         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
         at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:136)
         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
...
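
One way to rule out the .hiverc values simply not being applied is to print
the effective settings from the same CLI session: in the Hive CLI, `set`
followed by only a property name echoes that property's current value. The
sketch below does this for the properties listed above. The last two
properties are an assumption, not something confirmed in this message: they
control unconditional map-join conversion in Hive 0.11 and later and carry
their own size threshold, so they may be what triggers the conversion here
rather than hive.mapjoin.smalltable.filesize.

```sql
-- Print the effective value of each property in the current session.
set hive.exec.mode.local.auto;
set hive.auto.convert.join;
set hive.mapjoin.smalltable.filesize;

-- Assumption: these may govern the conversion instead of smalltable.filesize.
set hive.auto.convert.join.noconditionaltask;
set hive.auto.convert.join.noconditionaltask.size;
```

If the printed values differ from what .hiverc sets, the file is probably not
being picked up by the session that runs the query.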

Any ideas about what might cause this error or how to fix this?