Hive >> mail # user >> Map join small table filesize not working on Hive 0.12 with Hadoop 2.2.0


Map join small table filesize not working on Hive 0.12 with Hadoop 2.2.0
Hi,

I have a development cluster installed with Hive 0.12 running on top of
Hadoop 2.2.0. I have set the following property values in the .hiverc
file, and also when running queries from the Hive command line:

 set hive.exec.mode.local.auto=true;
 set hive.auto.convert.join=true;
 set hive.mapjoin.smalltable.filesize=25000000;
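
One way to confirm that these values are actually in effect for the session
is to ask the Hive CLI to print them back: `set` followed by a key and no
value prints the current setting. This is only a minimal sketch of that
check, assuming it is run in the same session as the failing query; it does
not explain why the threshold appears to be ignored.

```sql
-- Print the effective value of each property set above.
-- "set <key>;" with no "=value" only reads the setting; it changes nothing.
set hive.exec.mode.local.auto;
set hive.auto.convert.join;
set hive.mapjoin.smalltable.filesize;
```

Each statement should echo a line of the form key=value, which can be
compared against what was written in .hiverc.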

From what I can see, Hive is not taking the value of the
hive.mapjoin.smalltable.filesize property into account. Even when I set it
to a very small number such as 3, Hive still converts the join to a local
map join, and the query fails due to memory exhaustion with the following
error:

2013-11-24 09:45:15,098 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(323)) - Hive Runtime Error: Map local work exhausted memory
org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2013-11-24 09:45:14    Processing rows:        1000000 Hashtable size: 999999  Memory usage:   1031866880      percentage:     0.968
         at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
         at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:249)
         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
         at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:136)
         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
...

Any ideas about what might cause this error or how to fix this?