HBase, mail # user - Rowkey design and presplit table


Earlier messages in this thread:
Lukáš Drbal 2013-03-04, 10:48
Jilal Oussama 2013-03-04, 11:01
Lukáš Drbal 2013-03-04, 20:55
Ted Yu 2013-03-04, 21:06
Lukáš Drbal 2013-03-04, 21:27

Re: Rowkey design and presplit table
Ted Yu 2013-03-04, 21:32
Glad I was able to help.

If you have ideas for enhancements based on your use case, please share them.

On Mon, Mar 4, 2013 at 1:27 PM, Lukáš Drbal <[EMAIL PROTECTED]> wrote:

> Hi Ted,
>
> thanks a lot for this. It's exactly what I need.
>
> Lukas
>
> 2013/3/4 Ted Yu <[EMAIL PROTECTED]>
>
> > What HBase version are you planning to use?
> >
> > In 0.94, you can refer to:
> >
> > src/main/java/org/apache/hadoop/hbase/regionserver/KeyPrefixRegionSplitPolicy.java
> >
> > You can write a policy which splits along category boundaries.
> >
> > There are other split policies in case you're interested:
> >
> > ./src/main/java/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.java
> > ./src/main/java/org/apache/hadoop/hbase/regionserver/DelimitedKeyPrefixRegionSplitPolicy.java
> > ./src/main/java/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.java
> >
> > Cheers
> >
> > On Mon, Mar 4, 2013 at 12:55 PM, Lukáš Drbal <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Hi Jilal,
> > > thanks for the response, but could you please give me a link or explain it
> > > in more detail?
> > > I don't know what you mean by regular expression splitting. My data is
> > > not fixed and will grow over time.
> > >
> > > Thanks.
> > >
> > > Regards
> > >
> > > Lukas Drbal
> > >
> > >
> > > 2013/3/4 Jilal Oussama <[EMAIL PROTECTED]>
> > >
> > > > You can split in your application using a regular expression on the
> > > > underscore character, if the language supports them (like splitting the
> > > > data of a CSV file).
> > > >
> > > >
> > > > 2013/3/4 Lukáš Drbal <[EMAIL PROTECTED]>
> > > >
> > > > > Hi,
> > > > >
> > > > > > I have a question about rowkey design and presplitting a table.
> > > > >
> > > > > > My use case:
> > > > > > I need to store a lot of comments, where each comment belongs to one
> > > > > > article and each article has one category.
> > > > >
> > > > > > What I need:
> > > > > > 1) read one comment by id (where I know commentId, articleId and
> > > > > > categoryId)
> > > > > > 2) read all comments for an article (I know categoryId and articleId)
> > > > > > 3) read all comments for a category (I know categoryId)
> > > > >
> > > > > > From these read patterns I see one good rowkey:
> > > > > <categoryId>_<articleId>_<commentId>
> > > > >
> > > > > > But here the rowkey does not have a fixed size, so I don't know how to
> > > > > > define the split pattern. How can this be solved?
> > > > > > These ids come from an external system and grow very fast, so adding
> > > > > > something like "padding" to each part is hard.
> > > > >
> > > > > > Maybe I can use a hash function for each part:
> > > > > > md5(<categoryId>)_md5(<articleId>)_md5(<commentId>), but this rowkey is
> > > > > > very long (3*32+2 bytes), and I don't have experience with rowkeys this
> > > > > > long.
> > > > >
> > > > > > Can someone give me some suggestions, please?
> > > > >
> > > > > Regards
> > > > >
> > > > > Lukas Drbal
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > > Save The World - http://www.worldcommunitygrid.org/
> > > > http://www.worldcommunitygrid.org/stat/viewMemberInfo.do?userName=LesTR
> > >
> > > LesTR
> > >
> >
>
>
>
> --
> Save The World - http://www.worldcommunitygrid.org/
> http://www.worldcommunitygrid.org/stat/viewMemberInfo.do?userName=LesTR
>
> LesTR
>
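
The rowkey layout and the three read patterns discussed in this thread can be sketched in plain Java, without any HBase dependency. This is only an illustration under stated assumptions: the class and method names (RowKeySketch, rowKey, categoryPrefix, articlePrefix, stopRowForPrefix, hashedRowKey) are made up for this example, ids are assumed never to contain the '_' separator, and keys are compared as unsigned bytes, which is how HBase orders rows. The sketch suggests that a prefix ending in the separator selects exactly one category or article even when ids have different lengths, and that the hashed variant from the original question comes out at 3*32+2 = 98 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of HBase.
class RowKeySketch {
    static final char SEP = '_'; // assumption: ids never contain this character

    // Composite key from the thread: <categoryId>_<articleId>_<commentId>
    static String rowKey(String categoryId, String articleId, String commentId) {
        return categoryId + SEP + articleId + SEP + commentId;
    }

    // Prefix shared by all comments of one category (read pattern 3).
    static String categoryPrefix(String categoryId) {
        return categoryId + SEP;
    }

    // Prefix shared by all comments of one article (read pattern 2).
    static String articlePrefix(String categoryId, String articleId) {
        return categoryId + SEP + articleId + SEP;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Exclusive upper bound for a prefix scan: drop trailing 0xFF bytes,
    // then increment the last remaining byte. An empty array means "no bound".
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True if key lies in the half-open range [prefix, stopRowForPrefix(prefix)).
    static boolean inPrefixRange(String key, String prefix) {
        byte[] k = bytes(key);
        byte[] start = bytes(prefix);
        byte[] stop = stopRowForPrefix(start);
        return compareBytes(k, start) >= 0
                && (stop.length == 0 || compareBytes(k, stop) < 0);
    }

    // Lowercase hex MD5 digest, always 32 characters.
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(bytes(s))) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Hashed variant from the question: md5(cat)_md5(article)_md5(comment).
    static String hashedRowKey(String categoryId, String articleId, String commentId) {
        return md5Hex(categoryId) + SEP + md5Hex(articleId) + SEP + md5Hex(commentId);
    }

    public static void main(String[] args) {
        System.out.println(rowKey("1", "5", "9"));                              // 1_5_9
        System.out.println(inPrefixRange("1_5_9", categoryPrefix("1")));        // true
        System.out.println(inPrefixRange("12_5_9", categoryPrefix("1")));       // false
        System.out.println(inPrefixRange("1_55_9", articlePrefix("1", "5")));   // false
        System.out.println(hashedRowKey("1", "5", "9").length());               // 98
    }
}
```

With the HBase client, the prefix bytes and the computed stop bytes would serve as the start and stop rows of a scan; that wiring is left out so the sketch stays self-contained. Note that the hashed variant keeps these prefix reads working (the prefix becomes md5(<categoryId>)_), but rows are then ordered by hash value rather than by id.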

Later messages in this thread:
Asaf Mesika 2013-03-07, 07:42
James Taylor 2013-03-07, 08:42
Lukáš Drbal 2013-03-07, 22:32