Restrictions on Tables in Map side join in Hive
Hi,

 I have 2 tables:

hive> describe extended idtablerc;
id      string  from deserializer

Detailed Table Information      Table(tableName:idtablerc,
dbName:default, owner:viraj, createTime:1277418576, lastAccessTime:0,
retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:yuid,
type:string, comment:null)], location:hdfs://nn1/projects/idtablerc,
inputFormat:org.apache.hadoop.hive.ql.io.RCFileInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.RCFileOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe,
parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
parameters:{}), partitionKeys:[],
parameters:{transient_lastDdlTime=1277418576}, viewOriginalText:null,
viewExpandedText:null, tableType:MANAGED_TABLE)

Time taken: 0.414 seconds

hive> describe extended t2;
OK
c1      int
c2      string

Detailed Table Information      Table(tableName:t2, dbName:default,
owner:viraj, createTime:1277334757, lastAccessTime:0, retention:0,
sd:StorageDescriptor(cols:[FieldSchema(name:c1, type:int, comment:null),
FieldSchema(name:c2, type:string, comment:null)],
location:hdfs://nn1/user/viraj/t2table,
inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
parameters:{}), partitionKeys:[],
parameters:{EXTERNAL=TRUE,transient_lastDdlTime=1277334757},
viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)

Time taken: 0.244 seconds
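
For reference, tables of this shape could be created with DDL along the following lines. This is only a sketch reconstructed from the describe output above, not the exact statements that were run. The column name id follows the describe listing and the join condition, and default field delimiters are assumed for t2.

```sql
-- Sketch only: reconstructed from the "describe extended" output, not the original DDL.

-- RCFile-backed managed table using the columnar SerDe.
CREATE TABLE idtablerc (id STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
STORED AS RCFILE
LOCATION 'hdfs://nn1/projects/idtablerc';

-- Plain-text external table; default delimiters are an assumption.
CREATE EXTERNAL TABLE t2 (c1 INT, c2 STRING)
STORED AS TEXTFILE
LOCATION 'hdfs://nn1/user/viraj/t2table';
```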

 

 

I tried to join these tables in two ways.

This works:

select /*+ MAPJOIN(t2) */ t2.c1, t2.c2 from t2 join idtablerc on (t2.c2 = idtablerc.id);

 

 

This fails:

select /*+ MAPJOIN(idtablerc) */ t2.c1, t2.c2 from t2 join idtablerc on (t2.c2 = idtablerc.id);

with the following error:

Caused by: java.io.IOException: java.io.EOFException
            at org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:109)

 

Both tables are less than 1 MB in size.

Are there restrictions on which table types (for example, RCFile-backed tables such as idtablerc) can be named in the MAPJOIN hint?

 

 

Viraj