HBase, mail # user - Schema Updates: what do you do today?


Re: Schema Updates: what do you do today?
Ian Varley 2012-04-09, 19:39
Thanks, Andy. Yeah, a tool that compares a schema definition with a running cluster, and gives you a way to apply changes (without offlining, where possible), would be pretty sweet.

Anybody else think so? Or, do you have tools you've already written for this? Seems like a common need (we also need that, and have started tools for it internally).

Ian

On Apr 9, 2012, at 11:56 AM, Andrew Purtell wrote:

Manual schema changes via one-off shell scripts.

What I would like to do is write code that gets the HTD, checks whether all of the schema structure and features are as they should be, and, if not, makes the necessary modifications without taking the table offline. (I typically write code like that, but it does the offlining first. In practice, it creates the table if it is missing in some test environment; later it is disabled.) It could be possible to update HTD and HCD attributes without offlining, possibly even to add CFs. I wouldn't expect that all admin actions could be accomplished without offlining.

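
To make the "get the HTD, check it, apply what is missing" idea concrete, here is a minimal sketch of just the comparison step. It is an illustration under stated assumptions, not an existing tool: the schema is modeled as plain dicts of column family name to attributes, and the function name `plan_changes` and the action labels (`add_family`, `modify_family`, `delete_family`) are hypothetical rather than HBase API calls. It does not decide which steps could run without offlining.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of one table's schema:
# {column_family_name: {attribute_name: value}}
Schema = Dict[str, Dict[str, str]]
Step = Tuple[str, str, Dict[str, str]]


def plan_changes(desired: Schema, actual: Schema) -> List[Step]:
    """Return (action, family, attributes) steps that turn `actual` into `desired`.

    The action labels are made up for this sketch; a real tool would map
    them onto whatever admin calls its HBase version provides.
    """
    steps: List[Step] = []
    for family, want in sorted(desired.items()):
        have = actual.get(family)
        if have is None:
            # Family is defined in the schema file but missing on the cluster.
            steps.append(("add_family", family, dict(want)))
            continue
        # Keep only the attributes whose values differ from the cluster's.
        changed = {k: v for k, v in want.items() if have.get(k) != v}
        if changed:
            steps.append(("modify_family", family, changed))
    # Families on the cluster that the schema definition no longer lists.
    for family in sorted(set(actual) - set(desired)):
        steps.append(("delete_family", family, {}))
    return steps


# Example inputs (invented values, for illustration only).
desired = {"d": {"VERSIONS": "3", "COMPRESSION": "SNAPPY"}, "m": {"VERSIONS": "1"}}
actual = {"d": {"VERSIONS": "1", "COMPRESSION": "SNAPPY"}, "old": {"VERSIONS": "1"}}

for step in plan_changes(desired, actual):
    print(step)
```

Producing a plan first, and applying it separately, would let a tool show the pending changes and flag the ones that still need the table disabled before touching the cluster.
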
Best regards,
    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

----- Original Message -----
From: Ian Varley <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc:
Sent: Monday, April 9, 2012 9:08 AM
Subject: Schema Updates: what do you do today?

All:

I'm doing a little research into various ways to apply schema modifications
to an HBase cluster. Anybody care to share with the list what you currently do?
E.g.

- Connect via the HBase shell and manually issue commands ("create",
"disable", "alter", etc.)
- Write one-off scripts that do the above
- Write tools that read from a static schema definition and then apply changes
to a cluster (e.g. using HBaseAdmin)

etc. My supposition is that some additional tooling in this area, to consolidate
stuff everybody already does on their own, might be helpful. In light of recent
discussions on the dev list about various ways to alter the schema on a running
cluster, it seems like this area is still a bit of a "wild west" in
the HBase community, both in how HBase works and in what people do in practice.

What do you do today for schema changes, and what would you like to do in an
ideal world?

Thanks,
Ian