Restrictions on Tables in Map side join in Hive
Viraj Bhat 2010-06-24, 23:32
Hi,
I have 2 tables:

hive> describe extended idtablerc;
id      string  from deserializer

Detailed Table Information      Table(tableName:idtablerc, dbName:default, owner:viraj, createTime:1277418576, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:yuid, type:string, comment:null)], location:hdfs://nn1/projects/idtablerc, inputFormat:org.apache.hadoop.hive.ql.io.RCFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.RCFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], parameters:{transient_lastDdlTime=1277418576}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.414 seconds

hive> describe extended t2;
OK
c1      int
c2      string

Detailed Table Information      Table(tableName:t2, dbName:default, owner:viraj, createTime:1277334757, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:c1, type:int, comment:null), FieldSchema(name:c2, type:string, comment:null)], location:hdfs://nn1/user/viraj/t2table, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], parameters:{EXTERNAL=TRUE,transient_lastDdlTime=1277334757}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)
Time taken: 0.244 seconds

I try to join these tables as follows:

This works:

select /*+ MAPJOIN(t2) */ t2.c1, t2.c2 from t2 join idtablerc on (t2.c2 = idtablerc.id);

This fails:

Caused by: java.io.IOException: java.io.EOFException
        at org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:109)

select /*+ MAPJOIN(idtablerc) */ t2.c1, t2.c2 from t2 join idtablerc on (t2.c2 = idtablerc.id);

Both tables are less than 1 MB in size. Are there some restrictions on table types?

Viraj