HBase, mail # user - Limit number of columns in column family


Re: Limit number of columns in column family
Dave Latham 2013-09-23, 21:32
What about having all columns in the column family use the same qualifier
and then setting the max versions for that column family to limit it?
http://hbase.apache.org/book.html#schema.versions

It would only work if you never need to update a cell without knowing its
timestamp; otherwise the update is written as a new version and counts
against the total number of cells to keep around.
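
To make the max-versions idea concrete, here is a minimal plain-Java sketch. It does not call HBase; the class `VersionLimitedCell` and its methods are hypothetical names used only to model one cell (a single qualifier) that keeps the newest N timestamped values. It also trims eagerly on every write, which is a simplification of mine rather than a description of how HBase itself discards old versions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/**
 * Hypothetical model (not HBase API): one cell that keeps at most
 * maxVersions values, dropping the oldest timestamps first.
 */
public class VersionLimitedCell {
    private final int maxVersions;
    // Sorted newest timestamp first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    public VersionLimitedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    /** Writing at an existing timestamp overwrites; a new timestamp adds a version. */
    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            // Under reverse ordering the last entry has the smallest timestamp.
            versions.pollLastEntry();
        }
    }

    /** Values ordered from newest to oldest. */
    public List<String> values() {
        return new ArrayList<>(versions.values());
    }

    public static void main(String[] args) {
        VersionLimitedCell cell = new VersionLimitedCell(3);
        for (long ts = 1; ts <= 5; ts++) {
            cell.put(ts, "v" + ts);
        }
        System.out.println(cell.values()); // prints [v5, v4, v3]
    }
}
```

The caveat above shows up in `put`: changing an existing value in place requires writing at its exact timestamp, and any write at a new timestamp pushes the oldest value out.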

On Wed, Sep 18, 2013 at 11:23 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:

> Don't worry about the language ;)
>
> I don't think there is any mechanism today to limit the number of columns
> in a column family.
>
> There might be multiple options, but they will all have some drawbacks.
>
> One option is to have a daily MapReduce job look at each row and do
> the cleanup. This can work if you don't have millions of huge columns,
> because you will have to keep track of all of them to see how many you have
> and how many you need to remove...
>
> There might be some other options, like keeping the index in the column name
> so you know you need to remove all columns with name < XXX, where XXX is the
> last index value minus the number of columns you want to keep.
>
> etc.
>
> JM
>
>
> 2013/9/18 M. BagherEsmaeily <[EMAIL PROTECTED]>
>
> > any cell in the same row.
> > Sorry for my poor language!
> >
> >
> > On Thu, Sep 19, 2013 at 9:28 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Hi MBE,
> > >
> > > When you say "cells with least timestamp being removed", do you mean
> > > versions of the same cell? Or any cell in the same row/cf?
> > >
> > > JM
> > >
> > >
> > > 2013/9/18 M. BagherEsmaeily <[EMAIL PROTECTED]>
> > >
> > > > Hi,
> > > > I have a column family in which I want the number of columns to have a
> > > > specific limit, and when this number becomes greater than the limit,
> > > > the cells with the least timestamp should be removed, like a TTL on
> > > > count, not time.
> > > > Please guide me to the best optimized way to do this.
> > > >
> > > > Thanks.
> > > > MBE
> > > >
> > >
> >
>
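
The index-in-column-name option from JM's quoted message can be sketched in plain Java as follows. Nothing here calls HBase; `IndexedColumnTrim`, `qualifier`, and `qualifiersToDelete` are hypothetical names, and the sketch assumes column indexes are contiguous integers that are zero-padded so names sort in numeric order. The exact cutoff is my choice: indexes up to and including `lastIndex - keep` are removed so that exactly `keep` columns remain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not HBase API) for the index-in-column-name idea. */
public class IndexedColumnTrim {

    /** Zero-padded name so lexicographic order matches numeric order. */
    public static String qualifier(long index) {
        return String.format(Locale.ROOT, "%010d", index);
    }

    /**
     * Names of the columns to delete so that only the newest `keep`
     * indexes in firstIndex..lastIndex remain.
     */
    public static List<String> qualifiersToDelete(long firstIndex, long lastIndex, int keep) {
        List<String> toDelete = new ArrayList<>();
        long cutoff = lastIndex - keep; // indexes <= cutoff exceed the limit
        for (long i = firstIndex; i <= cutoff; i++) {
            toDelete.add(qualifier(i));
        }
        return toDelete;
    }

    public static void main(String[] args) {
        // 12 columns written so far, limit of 10: the two oldest are removed.
        System.out.println(qualifiersToDelete(1, 12, 10));
        // prints [0000000001, 0000000002]
    }
}
```

In a real table the returned names would feed whatever delete call the client uses, and the writer would need to know the last index for the row, which this sketch takes as a given.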