-
Problem with Hive Indexing
Ablimit Aji 2012-07-30, 15:12
I have written a custom index handler and wanted to test it, but Hive is not using it. So I tested with the simple table (pokes (int foo, string bar)) that ships with the Hive distribution for testing purposes. I created a compact index and set hive.optimize.index.filter=true; however, judging from the log output, Hive is still not using the index. What is the problem?
The query I issued is as follows: select foo from pokes WHERE foo=498;
Below is the log info I got after issuing the query.
12/07/26 12:25:17 INFO index.IndexWhereProcessor: Processing predicate for index optimization
12/07/26 12:25:17 INFO index.IndexWhereProcessor: (foo = 498)
12/07/26 12:25:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=pokes_idx
12/07/26 12:25:17 INFO hive.log: DDL: struct pokes_idx { i32 foo, string _bucketname, list _offsets}
12/07/26 12:25:17 INFO index.IndexWhereProcessor: checking index staleness...
12/07/26 12:25:17 INFO index.IndexWhereProcessor: 1342465077455
12/07/26 12:25:17 INFO index.IndexWhereProcessor: 1342465077455
12/07/26 12:25:17 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/07/26 12:25:17 WARN snappy.LoadSnappy: Snappy native library not loaded
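For anyone trying to reproduce the setup described in this post, it could be sketched roughly as follows. This is a reconstruction under assumptions, not the poster's actual commands: the index name pokes_foo_idx is hypothetical, and the IN TABLE clause is only inferred from the pokes_idx table name that appears in the log.

```sql
-- Hypothetical reconstruction of the setup described in the post.
-- The index name is made up; IN TABLE pokes_idx is inferred from the log.
CREATE INDEX pokes_foo_idx ON TABLE pokes (foo)
AS 'COMPACT' WITH DEFERRED REBUILD
IN TABLE pokes_idx;

-- Populate the index table (required after WITH DEFERRED REBUILD).
ALTER INDEX pokes_foo_idx ON pokes REBUILD;

-- Ask the optimizer to consider indexes for WHERE-clause filters.
SET hive.optimize.index.filter=true;

-- The query from the post.
SELECT foo FROM pokes WHERE foo = 498;
```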
-
Re: Problem with Hive Indexing
Mahsa Mofidpoor 2012-08-16, 16:32
Hi,
The table size must be at least 5 GB for the index to be used for filter pushdown. Otherwise you have to comment out the checkQuerySize method.
Cheers, Mahsa
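The 5 GB limit mentioned in this reply appears to match the default value of the hive.optimize.index.filter.compact.minsize property (5368709120 bytes). Assuming a Hive build that exposes this property and an index handler that honors it (a custom handler may apply its own size check), lowering the threshold for the session might be a lighter alternative to editing checkQuerySize. A minimal sketch:

```sql
-- Enable index-based filter optimization for this session.
SET hive.optimize.index.filter=true;

-- Assumption: the compact index handler skips the index when the query input
-- is smaller than this many bytes; the default is 5368709120 (5 GB).
-- Setting it to 0 removes the lower bound so small test tables qualify.
SET hive.optimize.index.filter.compact.minsize=0;

-- Re-run the query from the original post.
SELECT foo FROM pokes WHERE foo = 498;
```

Running EXPLAIN on the same query before and after changing the setting is one way to check whether the plan changes.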
-
Re: Problem with Hive Indexing
Ablimit Aji 2012-08-16, 16:43
Thanks, Mahsa! I didn't know there was such a constraint.
Best, Ablimit