HBase, mail # user - How to rename table's family name


Re: How to rename table's family name
Andrey Stepachev 2011-01-10, 23:02
Stack, I don't think I know a good solution, but I see one possible way
this could be implemented.

I think the rename should be atomic (with the schema in ZooKeeper, that is
possible).
The application developer then has to take care of how the application
handles the rename. And because the schema will live in ZooKeeper, the
application can listen for changes and take the needed actions.

If we still store r/cf/q/v tuples in the hfile, we can consider simple logic:
1. all cfs should have uuids to identify renames like a->b->a
2. when a region server sees a cf rename, it begins renaming all results
read from not-yet-converted hfiles that still carry the old cf
3. if an hfile is already converted, no renames are needed at all
4. the actual keyvalue rewrite is performed on compaction
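
The steps above can be modelled in a few lines. This is only a rough sketch in Python, not HBase code: `StoreFile`, `Region`, the `converted` flag and the `pending` map are all hypothetical names invented for illustration, and a real region server would have to persist this state and coordinate it through the schema.

```python
from dataclasses import dataclass, field


@dataclass
class StoreFile:
    """Hypothetical stand-in for an hfile: a list of (row, cf, qualifier, value)."""
    cells: list
    converted: bool = True  # False while the file may still hold an old cf name


@dataclass
class Region:
    files: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)  # stored cf name -> current cf name

    def rename_family(self, old, new):
        # Compose with earlier renames that have not been compacted away yet,
        # so a name stored before a->b->c still resolves to c.
        for stored, current in self.pending.items():
            if current == old:
                self.pending[stored] = new
        self.pending[old] = new
        # Every file that exists right now may still carry the old name.
        for f in self.files:
            f.converted = False

    def scan(self):
        # Read path: translate names on the fly, only for unconverted files.
        for f in self.files:
            for row, cf, q, v in f.cells:
                if not f.converted:
                    cf = self.pending.get(cf, cf)
                yield (row, cf, q, v)

    def compact(self):
        # Major compaction: rewrite all key-values under the current name.
        merged = StoreFile(cells=list(self.scan()), converted=True)
        self.files = [merged]
        self.pending.clear()
```

A scan after `rename_family("a", "b")` already returns `b`, while the stored cells keep `a` until `compact()` rewrites them.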

If some denormalization is implemented (like not storing the cf in each
keyvalue in the hfile at all),
then nothing is needed: the cf name is always reconstructed on Result
instantiation.
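
A minimal sketch of that denormalized variant, again as a Python model rather than HBase code: the stored cell keeps only a family id, and the name is looked up when the result is built. `Schema`, `StoredCell` and `to_result` are hypothetical names, and the use of a uuid as the family id is an assumption.

```python
import uuid
from dataclasses import dataclass


class Schema:
    """Hypothetical schema holder: family id -> current family name."""

    def __init__(self):
        self.names = {}

    def add_family(self, name):
        family_id = uuid.uuid4().hex
        self.names[family_id] = name
        return family_id

    def rename(self, family_id, new_name):
        # A rename touches only the schema, never the stored cells.
        self.names[family_id] = new_name


@dataclass(frozen=True)
class StoredCell:
    row: str
    family_id: str  # the id is stored, not the family name
    qualifier: str
    value: str


def to_result(schema, cells):
    # The family name is reconstructed at result-construction time.
    return [(c.row, schema.names[c.family_id], c.qualifier, c.value) for c in cells]
```

With this layout a rename is a single schema update, and no compaction is needed for readers to see the new name.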

To exclude the possibility of races like a->b->a, we must store the history
of all renames to
help the RS identify whether or not a given cf was renamed.
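
One way such a history could look, sketched as a Python model under stated assumptions: every schema event is appended to a log, each hfile is assumed to remember the log length (the "schema version") at the time it was written, and family names are assumed unique at any single version. `FamilyHistory` and its methods are hypothetical, not an existing HBase API.

```python
import uuid


class FamilyHistory:
    """Append-only log of schema events; its length is the schema version."""

    def __init__(self):
        # Each event (family_uuid, name) means: this family is called
        # `name` from this point on (covers both create and rename).
        self.events = []

    @property
    def version(self):
        return len(self.events)

    def create(self, name):
        family_uuid = uuid.uuid4().hex
        self.events.append((family_uuid, name))
        return family_uuid

    def rename(self, family_uuid, new_name):
        self.events.append((family_uuid, new_name))

    def _names_at(self, version):
        # Replay the log up to `version`: family uuid -> name at that time.
        names = {}
        for family_uuid, name in self.events[:version]:
            names[family_uuid] = name
        return names

    def current_name(self, name, written_at):
        """Resolve a cf name found in a file written at version `written_at`."""
        by_name = {n: u for u, n in self._names_at(written_at).items()}
        family_uuid = by_name[name]
        return self._names_at(self.version)[family_uuid]
```

Because the lookup goes through the uuid, an old file holding `a` and a newer file holding `a` can resolve to different families if the name was reused in between.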

2011/1/10 Stack <[EMAIL PROTECTED]>

> Want to make a new issue then Andrey?  Thanks.  We could knock out
> hbase-68 at same time.  When would we do the replacement of code with
> actual columnfamily name?  Over on client or on server before result
> is sent the client?
> St.Ack
>
> On Sun, Jan 9, 2011 at 12:29 PM, Andrey Stepachev <[EMAIL PROTECTED]>
> wrote:
> > 2011/1/9 Stack <[EMAIL PROTECTED]>
> >
> >>
> >> To respond to Andrey, you seem to be asking about the old issue
> >> HBASE-68.  It hasn't seen much attention of late.  Its more about
> >> saving space rather than facilitating rename (though the latter is a
> >> legit case).
> >>
> >
> > Thanks. But saving space is not an issue. Really renaming with going
> > offline (rename table) or reload data (rename cf) is issue (especially in
> > case of live data migration on version upgrade on production cluster).
> >
> >
> >>
> >> St.Ack
> >>
> >>
> >> On Sat, Jan 8, 2011 at 12:10 PM, M. C. Srivas <[EMAIL PROTECTED]> wrote:
> >> > In general, there's need for a loose "schema" to allow not only renames of
> >> > columns and column-families, but efficient delete of entire columns or CFs.
> >> > (eg, mark this C as deleted in the "schema" and remove it during the next
> >> > major compaction). But implementing the master-coordination for this (for
> >> > instance, all RS's should delete 'atomically') will be interesting ....
> >> >
> >> >
> >> >
> >> > On Sat, Jan 8, 2011 at 11:36 AM, Andrey Stepachev <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> 2011/1/8 Stack <[EMAIL PROTECTED]>
> >> >>
> >> >> >
> >> >> >
> >> >>
> >> >> > Perhaps we should consider
> >> >> > detaching CF name from whats stored?
> >> >> >
> >> >>
> >> >> Yes! Are there any jira? I'll vote for it.
> >> >>
> >> >>
> >> >> >
> >> >> > St.Ack
> >> >> >
> >> >>
> >> >
> >>
> >
>