HBase >> mail # dev >> Pull instant schema updating out?

Re: Pull instant schema updating out?
I agree having two implementations seems like a big mess. Any sense
which one is closer to working for the average user (i.e., without
provisions like "don't split or crash servers mid-alter")?

On Mon, Apr 2, 2012 at 4:56 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> Hi nifty devs,
> After encountering HBASE-5702, I started playing with instant schema
> updating (HBASE-4213) a bit more and I must say that it's a bit rough
> which makes me wonder... should we pull that code out?
> We're in this "interesting" situation in 0.94 where we have two
> different ways to alter tables without disabling them and I don't
> trust either. I'm pretty sure most of the devs don't even know which
> one takes precedence over the other when both are enabled without
> looking at the code. Well, right now hbase.online.schema.update.enable
> needs to be enabled in order to have
> hbase.instant.schema.alter.enabled working. If only the former is
> enabled the master handles the alter, else if both are enabled then
> it's going to be done via ZK although the master still keeps track of
> it.
> So the differences between both IIUC:
>  - "Online schema update" is a rolling close/open of all the regions
> so that they pick up the new HTD. It's handled by the master and has
> been in since 0.92. I've used it quite a bit when running other tests
> and it's ok as long as regions are not splitting and RS are not
> shutting down. We also enabled it on our clusters here since our
> regions don't tend to move that much.
>  - "Instant schema alter" is instant in the sense that all the regions
> are asked to close from the get-go but effectively the region servers
> can only close one region at a time. The state of the alter is kept in
> ZK and the master has a bunch of watches and logs the progress. It's
> new since 0.94 and I'm not sure if anyone is using it. I've tested it
> a bit and at the moment I can say that the MonitoredTasks handling needs
> to be redone, it logs way too much information in the log, but the few
> alters I ran worked... It's just a bit hard to know when they're done.
> FWIW we could pull either or both out, but instant schema alter hasn't
> been in a released version yet so it's unlikely it'll bother someone
> while the other is already in use (like here).
> Opinions?
> Thanks,
> J-D

Todd Lipcon
Software Engineer, Cloudera
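
For readers unsure which mechanism wins, the precedence J-D describes between the two flags can be sketched roughly as below. This is only an illustration of the description in the quoted message, not the actual HBase master code: the class, enum, and method names are hypothetical, the flags are read from a plain Map rather than an HBase Configuration, and treating an unset flag as false (and falling back to a disable-first alter when online update is off) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the flag precedence described in the thread.
// Not HBase code: names here are illustrative only.
public class AlterPathSketch {

    // The three outcomes implied by the message.
    enum AlterPath {
        DISABLE_REQUIRED, // assumption: neither mechanism applies, table must be disabled first
        MASTER_ROLLING,   // "online schema update": master-driven rolling close/open
        ZK_INSTANT        // "instant schema alter": state kept in ZK, master tracks it
    }

    // Property names as written in the quoted message.
    static final String ONLINE_KEY = "hbase.online.schema.update.enable";
    static final String INSTANT_KEY = "hbase.instant.schema.alter.enabled";

    static AlterPath resolve(Map<String, String> conf) {
        // Assumption: an unset flag is treated as false.
        boolean online = Boolean.parseBoolean(conf.getOrDefault(ONLINE_KEY, "false"));
        boolean instant = Boolean.parseBoolean(conf.getOrDefault(INSTANT_KEY, "false"));

        if (!online) {
            // Instant alter only works when online update is also enabled,
            // so the instant flag alone changes nothing.
            return AlterPath.DISABLE_REQUIRED;
        }
        // Both enabled: the alter goes through ZK. Only the former: the master handles it.
        return instant ? AlterPath.ZK_INSTANT : AlterPath.MASTER_ROLLING;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ONLINE_KEY, "true");
        System.out.println(resolve(conf));
        conf.put(INSTANT_KEY, "true");
        System.out.println(resolve(conf));
    }
}
```

The point of the sketch is that the instant flag is subordinate to the online flag: enabling it on its own has no effect, which is the kind of coupling that is hard to see without reading the code.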