Hive >> mail # user >> Partitions on hive hbase table


Re: Partitions on hive hbase table
Thanks for the reply, bharath. It was helpful.

Looking into HIVE-1643 to support range scans: is the patch ready to be
consumed, or will it still undergo some modifications?

Thanks,

On Mon, Oct 15, 2012 at 8:56 PM, bharath vissapragada <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> In your queries, if you have select predicates of the form row-key > x,
> row-key < y, or row-key < x and row-key > y, they are used to build the
> correct Scan object with those values. For example, the query would look
> something like this:
>
> select <something> from <hbase-hive-table> where row-key < x and row-key
> > y and ... <so-on>   (just include such predicates on the HBase row key)
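
A concrete version of the placeholder query above might look like the following. The table and column names here are purely illustrative, but the storage-handler class and the `:key` mapping are the standard Hive HBaseStorageHandler setup:

```sql
-- Hypothetical external table backed by HBase; names are illustrative.
-- ':key' maps the first Hive column (event_id) to the HBase row key.
CREATE EXTERNAL TABLE events (
  event_id STRING,   -- mapped to the HBase row key
  payload  STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:payload')
TBLPROPERTIES ('hbase.table.name' = 'events');

-- Predicates on the row-key column can become the Scan's start/stop
-- keys, so the query need not touch the whole HBase table:
SELECT payload
FROM events
WHERE event_id > '2012-10-01' AND event_id < '2012-10-15';
```

Whether the range is actually pushed down to the Scan (rather than filtered after a full scan) depends on the HIVE-1643/HIVE-2815 work discussed in this thread.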
>
> Thanks
>
>
>
> On Mon, Oct 15, 2012 at 10:27 PM, [EMAIL PROTECTED] <
> [EMAIL PROTECTED]> wrote:
>
>> Thanks bharath.
>>
>> Do you have an example of what such a query would look like?
>>
>> Thanks,
>>
>>
>> On Mon, Oct 15, 2012 at 11:48 AM, bharath vissapragada <
>> [EMAIL PROTECTED]> wrote:
>>
>>>
>>> I'm not sure about partitioning, but scans are currently limited
>>> based on start and stop keys (if predicates on row keys are provided in
>>> the query).
>>>
>>> See the HIVE-1643 and HIVE-2815 JIRAs!
>>>
>>> On Mon, Oct 15, 2012 at 10:09 PM, [EMAIL PROTECTED] <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> All,
>>>>
>>>> So, I have an external table in Hive backed by a huge HBase table. I
>>>> was wondering what the best practices are for partitioning my data, so
>>>> that my queries do not always have to do a full-table scan.
>>>>
>>>> Some quick research on this yielded approaches where the partitions
>>>> would need to be created and the data then loaded into them, or where
>>>> dynamic partitions are used.
>>>>
>>>> Is there any way to limit the scans based on the start and stop keys?
>>>> Also, if I decide to go with dynamic partitions, how do I keep the data up
>>>> to date in my partitioned tables?
>>>>
>>>> Thanks for any help.
>>>>
>>>> --
>>>> Swarnim
>>>>
>>>> --
>>>> Regards,
>>>> Bharath .V
>>>> w:http://researchweb.iiit.ac.in/~bharath.v
>>>>
>>>>
>>
>>
>> --
>> Swarnim
>>
>
>
>
> --
> Regards,
> Bharath .V
> w:http://researchweb.iiit.ac.in/~bharath.v
>

--
Swarnim