Hive, mail # user - Hive Index


Hive Index
Santosh Achhra 2013-01-11, 06:19
Hello Hive Users,

I have created an index on a table with deferred rebuild.

1) After the index is created, I see two tables:
           a) MAIN_TABLE
           b) MAIN_TABLE_INDEX

Now, when I run EXPLAIN on a query against the table, I see that it does a
table scan and not an index scan:

        MAIN_TABLE
*          TableScan*
            alias: MAIN_TABLE
            Filter Operator
              predicate:
                  expr: (trim(F1) = 'V1')
                  type: boolean

Is this expected? Are there any additional steps that should
be done after the index is created?

I found the text below in the archives; it is from Mark:

INSERT OVERWRITE DIRECTORY '/tmp/indexes/x'
SELECT `_bucketname`, `_offsets` FROM default__t_x__ where j='and';

(The name default__t_x__ can be found in the output of step 2. Also,
/tmp/indexes directory needs to exist in HDFS. You can substitute this
to be any pre-existing directory in HDFS)

SET hive.index.compact.file=/tmp/indexes/x;
SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
SELECT a, count(*) from t where j='and' group by a;
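
For context, the quoted steps appear to assume that the index was already created and rebuilt. A minimal sketch of what those earlier steps might look like follows. The index name x, table t, and column j are only inferred from the name default__t_x__ and the queries above, so this is an assumption and not necessarily what Mark ran. The last setting is an optional alternative, again only a sketch, that asks the optimizer to apply the index to filters automatically.

```sql
-- Hypothetical reconstruction: table t, column j, and index x are inferred
-- from the index table name default__t_x__ quoted above.
CREATE INDEX x ON TABLE t (j)
AS 'COMPACT'
WITH DEFERRED REBUILD;

-- With DEFERRED REBUILD the index table stays empty until it is rebuilt.
ALTER INDEX x ON t REBUILD;

-- Optional: let the optimizer use the index for filter predicates
-- instead of setting hive.index.compact.file by hand.
SET hive.optimize.index.filter=true;
```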

Also, do join keys use the indexes that were created?

Good wishes, always!
Santosh