Bucketing is certainly helpful when, within a partitioned table, a column
other than the partition column has a finite number of values.
Bucketing does mean, though, that loading data into the table can't be a
straightforward LOAD DATA INPATH; you will need to run it via Hive
queries (which does not seem to be a problem, at least from the look of

The number of buckets for clustering used to be a power of 2, like 2, 4,
8, 16 etc. Not sure if that has changed now.
Also, while loading data into a bucketed table, it's advised that you run
set hive.enforce.bucketing = true;
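
A minimal sketch of how that load could look, assuming a hypothetical staging table and a hypothetical input path (the table names, columns, path, and the bucket count of 16 are all made up for illustration, not taken from your setup):

```sql
-- Hypothetical plain staging table; LOAD DATA INPATH works fine here
-- because the table is not bucketed.
CREATE TABLE events_staging (user_id BIGINT, event STRING, dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/tmp/events.tsv' INTO TABLE events_staging;

-- Hypothetical target table: partitioned by dt, bucketed on user_id.
CREATE TABLE events_bucketed (user_id BIGINT, event STRING)
PARTITIONED BY (dt STRING)
CLUSTERED BY (user_id) INTO 16 BUCKETS;

-- Ask Hive to honour the bucket definition when writing.
SET hive.enforce.bucketing = true;

-- Populate the bucketed table through a query rather than a direct file load.
INSERT OVERWRITE TABLE events_bucketed PARTITION (dt = '2014-03-25')
SELECT user_id, event
FROM events_staging
WHERE dt = '2014-03-25';
```

The point of the extra step is that the insert query lets Hive hash each row on the clustering column and write it to the right bucket file, which a plain file load would not do.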

I have rarely used indexing in Hive, but I do remember that Hive indexes
used to provide better data access for certain queries, and that the
storage layout helps in improving search and lookup of the data.
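
For reference, a rough sketch of the index syntax from Hive releases of this era, assuming a hypothetical table and index name (I have not measured whether it helps for your queries):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE page_views (user_id BIGINT, url STRING);

-- Define a compact index on user_id; it is not built until REBUILD is run.
CREATE INDEX page_views_user_idx
ON TABLE page_views (user_id)
AS 'COMPACT'
WITH DEFERRED REBUILD;

-- Build (or refresh) the index data.
ALTER INDEX page_views_user_idx ON page_views REBUILD;
```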

It would be really helpful if you could note down the performance you get
after fine-tuning the parameters.

On Tue, Mar 25, 2014 at 10:37 PM, Saumitra Shahapure (Vizury) <

Nitin Pawar
