

AnilKumar B 2013-05-30, 03:47
Ted Yu 2013-05-30, 04:12
AnilKumar B 2013-05-30, 06:13
Re: What is the best hbase table schema for following json data?
bq. Will ColumnPrefixFilter still work in this case?

Probably not: ColumnPrefixFilter matches on the column qualifier, not on the
contents of a cell. Can you group the subset of keys at the beginning of the
column (assuming the subset of keys is known and doesn't change)?
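
To illustrate the grouping idea, here is a minimal Python sketch. It is not
HBase client code. It assumes each key-value pair of a click is stored as its
own column in the "clicks" family, using a hypothetical qualifier layout
"<key>:event<N>" so that the searchable key sits at the start of the
qualifier. The prefix_match helper only mimics the qualifier-prefix
comparison that ColumnPrefixFilter performs; flatten_clicks, the sample
values and the event numbering are all assumptions made for illustration.

```python
import json


def flatten_clicks(session_json):
    """Turn the "clicks" array into {qualifier: value} for one row.

    Qualifier layout (an assumption): "<key>:event<N>", key first, so that
    a prefix on "<key>:" selects that key across all clicks.
    """
    session = json.loads(session_json)["Session"]
    cells = {}
    for n, entry in enumerate(session["clicks"], start=1):
        for key, value in entry["click"].items():
            cells[f"{key}:event{n}"] = value
    return cells


def prefix_match(cells, prefixes):
    """Keep only columns whose qualifier starts with one of the prefixes."""
    return {q: v for q, v in cells.items() if q.startswith(tuple(prefixes))}


doc = """{"Session": {"Header": {"key1": "value1"},
  "clicks": [{"click": {"key1": "a", "key2": "b"}},
             {"click": {"key1": "c", "key3": "d"}}]}}"""

# One column per key-value pair: a prefix on the key finds it in every click.
cells = flatten_clicks(doc)
print(prefix_match(cells, ["key1:"]))   # {'key1:event1': 'a', 'key1:event2': 'c'}

# One cell per click: the qualifier carries no key, so the prefix finds nothing.
blob_cells = {"event1": '{"key1": "a", "key2": "b"}'}
print(prefix_match(blob_cells, ["key1:"]))   # {}
```

If only a known subset of keys is ever searched, only that subset would need
its own columns; the remaining pairs could stay in a single cell per click.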

bq. I am storing each click (a set of key-value pairs) in one cell, say
"clicks:event1". Is this OK?

This should be okay.

On Wed, May 29, 2013 at 11:13 PM, AnilKumar B <[EMAIL PROTECTED]> wrote:

> Hi Ted,
>
> @You can utilize MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
> up scan.
> [Anil] Thanks for the info. But I am storing all the key-value pairs
> corresponding to one click in one column. Will ColumnPrefixFilter still
> work in this case?
>
> @How many key / value pairs does each 'click' have ?
> [Anil] The number of key-value pairs is not fixed. It can vary from 20 to 200.
>
> @Among these pairs, are you going to search for a subset of keys ?
> [Anil] Yes.
>
>
>
> In my schema, I am storing each click (a set of key-value pairs) in one cell,
> say "clicks:event1". Is this OK? Or do I need to change the schema design so
> that each key-value pair is stored as its own column? What is the better way
> to store JSON data?
>
>
> Thanks,
> B Anil Kumar.
>
>
> On Thu, May 30, 2013 at 9:42 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > bq. 1) Suppose I want to search on a key of a click; it will be a full scan
> >
> > You can utilize MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
> > up scan.
> >
> > How many key / value pairs does each 'click' have? Among these pairs, are
> > you going to search for a subset of keys?
> >
> > Cheers
> >
> > On Wed, May 29, 2013 at 8:47 PM, AnilKumar B <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Hi,
> > >
> > > What is the best HBase table schema for the following JSON data?
> > > I need to store the following JSON data in HBase.
> > > {"Session":{"Header" :
> > > {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....},
> > > "clicks" : [{"click" : {"key1":"value1","key2":"value2",
> > > "key3":"value3"....}}, {"click" : {"key1":"value1", "key2":"value2",
> > > ....}}]}}
> > >
> > > I have created the schema as below, but there seem to be some issues.
> > > rowkey -> composite key of session fields
> > > ColumnFamily 1 -> "Header", which consists of the following columns:
> > > 1) Header:HeaderFields, which stores {"Header" :
> > > {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....}}
> > > in one cell
> > > 2) other columns
> > >
> > > ColumnFamily 2 -> "clicks", where each "click" will be one column
> > >
> > > The problems here are:
> > > 1) Suppose I want to search on a key of a click. It will be a full scan;
> > > how can I optimize my schema for such a search requirement?
> > > 2) If I want to provide a secondary index for keys of clicks, how can I
> > > implement it?
> > >
> > > Thanks,
> > > B Anil Kumar.
> > >
> >
>
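
On question 2) in the original post (a secondary index for keys of clicks),
one common approach is an index table that the application maintains itself,
writing to it alongside the data table. The Python sketch below only models
that idea with dictionaries and does not call HBase. The index key layout
"<key>=<value>", the per-pair qualifier layout "<key>:<eventId>", and the
names put_click and find_sessions are assumptions for illustration, not
something proposed in this thread.

```python
from collections import defaultdict

data_table = {}                  # session row key -> {qualifier: value}
index_table = defaultdict(set)   # "<key>=<value>" -> set of session row keys


def put_click(row_key, event_id, click):
    """Store one click and add one index entry per key-value pair."""
    row = data_table.setdefault(row_key, {})
    for key, value in click.items():
        row[f"{key}:{event_id}"] = value
        index_table[f"{key}={value}"].add(row_key)


def find_sessions(key, value):
    """Look up sessions through the index instead of scanning data_table."""
    return sorted(index_table.get(f"{key}={value}", set()))


put_click("session-A", "event1", {"key1": "value1", "key2": "value2"})
put_click("session-B", "event1", {"key1": "value1"})
put_click("session-B", "event2", {"key3": "value3"})

print(find_sessions("key1", "value1"))   # ['session-A', 'session-B']
print(find_sessions("key3", "value3"))   # ['session-B']
```

In HBase the index would presumably be a second table whose row key starts
with "<key>=<value>", so that a lookup becomes a short prefix scan. Keeping
the two tables consistent on every write is left to the application in this
sketch.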
Michael Segel 2013-05-30, 19:09
AnilKumar B 2013-06-01, 14:36