HBase >> mail # dev >> Pull instant schema updating out?


Re: Pull instant schema updating out?
I agree having two implementations seems like a big mess. Any sense
which one is closer to working for the average user (i.e., without
provisions like "don't split or crash servers mid-alter")?

On Mon, Apr 2, 2012 at 4:56 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> Hi nifty devs,
>
> After encountering HBASE-5702, I started playing with instant schema
> updating (HBASE-4213) a bit more and I must say that it's a bit rough
> which makes me wonder... should we pull that code out?
>
> We're in this "interesting" situation in 0.94 where we have two
> different ways to alter tables without disabling them and I don't
> trust either. I'm pretty sure most of the devs don't even know which
> one takes precedence over the other when both are enabled without
> looking at the code. Well, right now hbase.online.schema.update.enable
> needs to be enabled in order to have
> hbase.instant.schema.alter.enabled working. If only the former is
> enabled the master handles the alter, else if both are enabled then
> it's going to be done via ZK although the master still keeps track of
> it.
>
> So the differences between both IIUC:
>
>  - "Online schema update" is a rolling close/open of all the regions
> so that they pick up the new HTD. It's handled by the master and has
> been in since 0.92. I've used it quite a bit when running other tests
> and it's ok as long as regions are not splitting and RS are not
> shutting down. We also enabled it on our clusters here since our
> regions don't tend to move that much.
>
>  - "Instant schema alter" is instant in the sense that all the regions
> are asked to close from the get-go but effectively the region servers
> can only close one region at a time. The state of the alter is kept in
> ZK and the master has a bunch of watches and logs the progress. It's
> new since 0.94 and I'm not sure if anyone is using it. I've tested it
> a bit and at the moment I can say that the MonitoredTask handling needs
> to be redone; it writes way too much information to the log, but the few
> alters I ran worked... It's just a bit hard to know when they're done.
>
> FWIW we could pull either or both out, but instant schema alter hasn't
> been in a released version yet so it's unlikely it'll bother someone
> while the other is already in use (like here).
>
> Opinions?
>
> Thanks,
>
> J-D
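
For reference, the flag precedence J-D describes above can be sketched roughly as below. This is only an illustration of the decision as stated in the quoted message, not the actual master code. The class, enum, and method names are made up for the example. Treating both flags as defaulting to false, and falling back to a disable-first alter when online update is off, are assumptions.

```java
import java.util.Map;

// Hypothetical sketch; none of these names come from the HBase code base.
final class AlterPathSketch {

    enum AlterPath {
        DISABLE_FIRST,          // assumption: no online path, table must be disabled
        MASTER_ROLLING_REOPEN,  // "online schema update", handled by the master
        ZK_INSTANT_ALTER        // "instant schema alter", state kept in ZK
    }

    // Property names as given in the thread.
    static final String ONLINE = "hbase.online.schema.update.enable";
    static final String INSTANT = "hbase.instant.schema.alter.enabled";

    // Picks the alter path from a plain key/value view of the configuration.
    static AlterPath choose(Map<String, String> conf) {
        // Assumption: an unset flag is treated as false.
        boolean online = Boolean.parseBoolean(conf.getOrDefault(ONLINE, "false"));
        boolean instant = Boolean.parseBoolean(conf.getOrDefault(INSTANT, "false"));

        // Instant alter only takes effect when online update is also enabled.
        if (!online) {
            return AlterPath.DISABLE_FIRST;
        }
        return instant ? AlterPath.ZK_INSTANT_ALTER : AlterPath.MASTER_ROLLING_REOPEN;
    }
}
```

In this sketch, enabling only hbase.instant.schema.alter.enabled changes nothing. The ZK path is chosen only when both flags are set.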

--
Todd Lipcon
Software Engineer, Cloudera