Hive >> mail # user >> Handling hierarchical data in Hive

Saumitra Shahapure 2014-03-25, 10:52
Nitin Pawar 2014-03-25, 10:56
Saumitra Shahapure 2014-03-25, 11:26
Nitin Pawar 2014-03-25, 11:52
Prasan Samtani 2014-03-25, 14:14
Saumitra Shahapure 2014-03-25, 17:08
Re: Handling hierarchical data in Hive
Bucketing is certainly helpful when you have a finite number of values in a
column other than the partition column.
Bucketing does mean, though, that loading data into the table can't be a
straightforward LOAD DATA INPATH; you will need to run it via
hive queries (which does not seem to be a problem at least from the look of

The number of buckets used to be a power of 2, like 2, 4, 8, 16, etc. Not
sure if that has changed now.
Also, when loading data into a bucketed table, it's advised that you set
hive.enforce.bucketing = true;
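
A rough sketch of what that could look like. The table names (events_staging, events_bucketed), the columns, the partition value, and the bucket count of 16 are all made up for illustration and are not from this thread:

```sql
-- Hypothetical table and column names, for illustration only.
-- Ask Hive to honour the table's bucket definition when writing.
SET hive.enforce.bucketing = true;

CREATE TABLE events_bucketed (
  user_id  BIGINT,
  category STRING,
  payload  STRING
)
PARTITIONED BY (dt STRING)
CLUSTERED BY (category) INTO 16 BUCKETS;

-- A plain LOAD DATA INPATH only moves files into place and does not
-- split rows into buckets, so populate the table through a query.
INSERT OVERWRITE TABLE events_bucketed PARTITION (dt = '2014-03-25')
SELECT user_id, category, payload
FROM events_staging
WHERE dt = '2014-03-25';
```

The staging table here is assumed to be an ordinary table that the raw files were loaded into first.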

I have rarely used indexing in Hive, but I do remember that Hive indexes
used to provide faster data access for certain queries, and that the storage
layout helps improve search and lookup of the data.
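
For reference, a minimal sketch of the index DDL, assuming a hypothetical table events with a category column (neither name is from this thread) and a Hive version that supports indexes:

```sql
-- Hypothetical table (events) and column (category), for illustration only.
CREATE INDEX idx_events_category
ON TABLE events (category)
AS 'COMPACT'
WITH DEFERRED REBUILD;

-- With DEFERRED REBUILD the index starts out empty and has to be
-- built explicitly before it can help any query.
ALTER INDEX idx_events_category ON events REBUILD;
```

Whether this speeds anything up depends on the queries, so it is worth measuring before and after.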

It would be really helpful if you could note down the performance you get
after fine-tuning the parameters.

On Tue, Mar 25, 2014 at 10:37 PM, Saumitra Shahapure (Vizury) <

Nitin Pawar