Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive, mail # user - Index not working


+
Hamza Asad 2013-06-13, 10:12
Copy link to this message
-
Re: Index not working
Hamza Asad 2013-06-14, 07:04
Please help me out. Am i doing something wrong? OR suggest me another
document which explains index implementation and its effective use
completely.
On Thu, Jun 13, 2013 at 3:12 PM, Hamza Asad <[EMAIL PROTECTED]> wrote:

> I have created simple table as follow
> *CREATE TABLE events_details(
>   id int,
>   event_id int,
>   user_id BIGINT,
>   event_date string,
>   intval_1 int ,
>   intval_2 int,
>   intval_3 int,
>   intval_4 int,
>   intval_5 int,
>   intval_6 int,
>   intval_7 int,
>   intval_8 int,
>   intval_9 int,
>   intval_10 int,
>   intval_11 int,
>   intval_12 int,
>   intval_13 int,
>   intval_14 int,
>   intval_15 int,
>   intval_16 int,
>   intval_17 int,
>   intval_18 int,
>   intval_19 int,
>   intval_20 int,
>   intval_21 int,
>   intval_22 int,
>   intval_23 int,
>   intval_24 int,
>   intval_25 int,
>   intval_26 int)
> ROW FORMAT DELIMITED
>  FIELDS TERMINATED BY ','
> STORED AS TEXTFILE;*
> *
> *
> and populate it using sqoop. Then i create index on it which is as follow
>
> * create INDEX idx_event_date ON TABLE events_details (event_date) as
> 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'  WITH
> DEFERRED REBUILD;
> *
> * *and
> *ALTER INDEX  idx_event_date ON events_details REBUILD; *
> after index created successfully i executed following command according to
> a document
> *INSERT OVERWRITE DIRECTORY "/tmp/idx_event_date" SELECT `_bucketname` ,
> `_offsets` FROM nydus.nydus__events_details_idx_event_date__ where
> to_date(event_date) >='2012-06-24' AND to_date(event_date) <= '2012-06-29' ;
> *
> * *when this executes completely, i  executes following commands
> *SET hive.index.compact.file=/tmp/idx_event_date;
> SET
> hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
> *
> now i run same old query ****SELECT COUNT(DISTINCT ed.user_id) as
> Value,min(to_date(event_date)) as event_date FROM nydus.events_details ed
> where intval_14 = 3 AND event_id = 1  AND ( to_date(event_date)
> >='2012-06-24' AND to_date(event_date) <= '2012-06-29' )  GROUP BY
> weekofyear(event_date) **** but its execution time have not optimized
> (same 770 sec as before). What am i doing wrong? Please help me out
>
> --
> *Muhammad Hamza Asad*
>

--
*Muhammad Hamza Asad*