Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> Index not working


+
Hamza Asad 2013-06-13, 10:12
Copy link to this message
-
Re: Index not working
Please help me out. Am i doing something wrong? OR suggest me another
document which explains index implementation and its effective use
completely.
On Thu, Jun 13, 2013 at 3:12 PM, Hamza Asad <[EMAIL PROTECTED]> wrote:

> I have created simple table as follow
> *CREATE TABLE events_details(
>   id int,
>   event_id int,
>   user_id BIGINT,
>   event_date string,
>   intval_1 int ,
>   intval_2 int,
>   intval_3 int,
>   intval_4 int,
>   intval_5 int,
>   intval_6 int,
>   intval_7 int,
>   intval_8 int,
>   intval_9 int,
>   intval_10 int,
>   intval_11 int,
>   intval_12 int,
>   intval_13 int,
>   intval_14 int,
>   intval_15 int,
>   intval_16 int,
>   intval_17 int,
>   intval_18 int,
>   intval_19 int,
>   intval_20 int,
>   intval_21 int,
>   intval_22 int,
>   intval_23 int,
>   intval_24 int,
>   intval_25 int,
>   intval_26 int)
> ROW FORMAT DELIMITED
>  FIELDS TERMINATED BY ','
> STORED AS TEXTFILE;*
> *
> *
> and populate it using sqoop. Then i create index on it which is as follow
>
> * create INDEX idx_event_date ON TABLE events_details (event_date) as
> 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'  WITH
> DEFERRED REBUILD;
> *
> * *and
> *ALTER INDEX  idx_event_date ON events_details REBUILD; *
> after index created successfully i executed following command according to
> a document
> *INSERT OVERWRITE DIRECTORY "/tmp/idx_event_date" SELECT `_bucketname` ,
> `_offsets` FROM nydus.nydus__events_details_idx_event_date__ where
> to_date(event_date) >='2012-06-24' AND to_date(event_date) <= '2012-06-29' ;
> *
> * *when this executes completely, i  executes following commands
> *SET hive.index.compact.file=/tmp/idx_event_date;
> SET
> hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
> *
> now i run same old query ****SELECT COUNT(DISTINCT ed.user_id) as
> Value,min(to_date(event_date)) as event_date FROM nydus.events_details ed
> where intval_14 = 3 AND event_id = 1  AND ( to_date(event_date)
> >='2012-06-24' AND to_date(event_date) <= '2012-06-29' )  GROUP BY
> weekofyear(event_date) **** but its execution time have not optimized
> (same 770 sec as before). What am i doing wrong? Please help me out
>
> --
> *Muhammad Hamza Asad*
>

--
*Muhammad Hamza Asad*
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB