I have an external table in Hive backed by a huge HBase table. What are
the best practices for partitioning my data so that my queries do not
always have to do a full-table scan?
A quick search turned up approaches where the partitions have to be created
first and the data then loaded into them, or where dynamic partitions are
used instead.
Is there any way to limit the scans based on the start and stop keys? Also,
if I decide to go with dynamic partitions, how do I keep the data up to
date in my partitioned tables?
Thanks for any help.