HBase >> mail # user >> What is the best hbase table schema for following json data?


AnilKumar B 2013-05-30, 03:47
Re: What is the best hbase table schema for following json data?
bq. 1) Suppose I want to search on a key of a click; it will be a full scan

You can use MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
up the scan.
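
As a rough illustration of why these filters can help (a plain-Python model, not the HBase client API): both filters select columns by qualifier prefix, so they only pay off when the click key sits at the front of the column qualifier. The sketch assumes a hypothetical qualifier layout `<key>:<click index>` in the "clicks" family; `flatten_clicks` and `column_prefix_filter` are made-up helper names that merely imitate the filters' selection rule.

```python
import json

def flatten_clicks(session):
    """Turn every click field into one cell of the 'clicks' family.

    The qualifier layout '<key>:<click index>' is an assumption made for
    this sketch; it is not taken from the thread.
    """
    cells = {}
    for i, entry in enumerate(session["Session"]["clicks"]):
        for key, value in entry["click"].items():
            cells[f"{key}:{i:04d}"] = value
    return cells

def column_prefix_filter(cells, prefixes):
    """Imitate ColumnPrefixFilter / MultipleColumnPrefixFilter on one row:
    keep only the columns whose qualifier starts with one of the prefixes."""
    wanted = tuple(prefixes)
    return {q: v for q, v in sorted(cells.items()) if q.startswith(wanted)}

# Sample document shaped like the JSON in the question (values are placeholders).
doc = json.loads("""
{"Session": {"Header": {"key1": "value1"},
 "clicks": [{"click": {"key1": "value1", "key2": "value2"}},
            {"click": {"key1": "value1", "key3": "value3"}}]}}
""")

cells = flatten_clicks(doc)
# The trailing ':' keeps a search for key1 from also matching key10, key11, ...
key1_cells = column_prefix_filter(cells, ["key1:"])
```

Note that a column filter narrows what is returned from each row; unless the scan is also restricted by row key, every row is still visited.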

How many key/value pairs does each 'click' have? Among these pairs, are
you going to search for a subset of keys?

Cheers

On Wed, May 29, 2013 at 8:47 PM, AnilKumar B <[EMAIL PROTECTED]> wrote:

> Hi,
>
> What is the best HBase table schema for the following JSON data?
> I need to store the following JSON data in HBase.
> {"Session": {"Header":
> {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....},
> "clicks": [{"click": {"key1":"value1","key2":"value2",
> "key3":"value3"....}}, {"click": {"key1":"value1","key2":"value2",
> ....}}]}}
>
> I have created the schema as below, but there seem to be some issues.
> rowkey -> compositeKey of session fields
> ColumnFamily 1 -> "Header", which consists of the following columns
> 1) Header:HeaderFields, which stores "{"Header" :
> {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....}}" in
> one cell
> 2) other columns
>
> ColumnFamily 2 -> "clicks" and each "click" will be one column
>
> The problems here are:
> 1) Suppose I want to search on a key of a click; it will be a full scan. How
> can I optimize my schema for such a search requirement?
> 2) If I want to provide a secondary index on the keys of clicks, how can
> I implement it?
>
> Thanks,
> B Anil Kumar.
>
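
On question 2, one common workaround (not something proposed in this thread) is a manually maintained index table whose row keys begin with the indexed click key and value and end with the session row key, so a lookup becomes a short prefix range scan instead of a full scan. The sketch below models that idea in plain Python with a sorted list standing in for the index table; the separator, the row-key layout, and the helper names `index_rows` and `lookup` are all assumptions for illustration.

```python
import bisect

SEP = "\x00"  # assumed separator; must not occur in keys, values, or row keys

def index_rows(session_rowkey, clicks):
    """Build the index-table row keys for one session.

    Assumed layout: <click key> SEP <click value> SEP <session row key>.
    """
    rows = set()
    for click in clicks:
        for key, value in click.items():
            rows.add(SEP.join((key, value, session_rowkey)))
    return rows

def lookup(index, key, value):
    """Emulate a prefix range scan over the sorted index row keys and
    return the session row keys that contain the given click key/value."""
    prefix = key + SEP + value + SEP
    start = bisect.bisect_left(index, prefix)
    hits = []
    for row in index[start:]:
        if not row.startswith(prefix):
            break  # sorted order: no later row can match
        hits.append(row[len(prefix):])
    return hits

# Placeholder sessions; row keys and values are made up for the example.
index = sorted(
    index_rows("session-A", [{"key1": "value1"}, {"key2": "value2"}])
    | index_rows("session-B", [{"key1": "value1"}])
)
sessions = lookup(index, "key1", "value1")
```

The cost of this design is that the application has to write the index rows alongside the session row and keep the two consistent itself.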
AnilKumar B 2013-05-30, 06:13
Ted Yu 2013-05-30, 16:48
Michael Segel 2013-05-30, 19:09
AnilKumar B 2013-06-01, 14:36