HBase >> mail # dev >> Hbase Assignments in trunk.


Re: Hbase Assignments in trunk.
+1 on rethinking the assignment + splitting code paths, and using zk as a
transactional database. Just my 2 cents w/o spending a lot of time on the
details, but maybe we should stop keeping the metadata in master and RS
memory, keep region assignments in zk instead, and have HM and RS just keep
a consistent in-memory cache.

Enis

On Mon, Sep 10, 2012 at 3:29 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> I said a while ago that we should require ZK 3.4.x for 0.96+.
>
> Distributed consensus without a "transaction" option always rang a bit
> weird to me.
>
> Maybe switch in 0.98+?
>
> -- Lars
>
>
> ----- Original Message -----
> From: n keywal <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc:
> Sent: Thursday, September 6, 2012 12:53 AM
> Subject: Re: Hbase Assignments in trunk.
>
> On the async vs. sync: there are 3 different ways to write multiple znodes
> in ZK, and huge differences in performance between them:
>
> 1) for loop sync
> 2) for loop async
> 3) multi
>
> Async will be 20 to 100 times faster than sync. Multi will be 2 to 4 times
> faster than async (that is, 40 to 400 times faster than sync).
>
> Multi was not available before ZK 3.4. It has several obvious advantages
> over async imho: it's faster, it's synchronous and it's a transaction. That
> usually simplifies the user code.
>
> It has other advantages:
> - async and sync will typically send one or more packets per znode (Nagle
> is not activated iirc), while there will be only a few packets for all the
> znodes with multi
> - you can expect async to write multiple times to the disk, while multi
> should write only once. This is usually better for i/o performance.
>
> In a serious recovery situation, with all the regions moving all over the
> place, saving disk and network i/o for ZooKeeper is important.
>
> Disadvantage: it's new.
>
> On Thu, Sep 6, 2012 at 7:49 AM, Stack <[EMAIL PROTECTED]> wrote:
>
> > On Wed, Sep 5, 2012 at 5:17 PM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
> > > Here's a link to the pdf/picture.
> > >
> > > http://people.apache.org/~jmhsieh/hbase/120905-hbase-assignment.pdf
> > >
> >
> > Pretty picture.  Not a pretty story.
> >
> > What you thinking?
> >
> > St.Ack
> >
>
>
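
For anyone following the sync-loop versus multi comparison quoted above, here is a minimal sketch of how the two differ in request count and in what is left behind after a failure. It is a toy in-memory model in plain Java, not the ZooKeeper client: `FakeStore`, `writeWithLoop`, `writeWithMulti`, `sampleOps` and the `/region-*` paths are hypothetical names made up for illustration. In real ZooKeeper 3.4+ the batched call is `ZooKeeper.multi(Iterable<Op>)`. The async loop is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory stand-in for a znode tree. NOT the ZooKeeper client API.
class ZnodeWriteSketch {

    // Hypothetical store: path -> data, plus a count of client requests.
    static final class FakeStore {
        final Map<String, String> znodes = new TreeMap<>();
        int requests = 0;

        // One request per znode, as in a synchronous for loop.
        void create(String path, String data) {
            requests++;
            applyCreate(znodes, path, data);
        }

        // One request for the whole batch, applied all-or-nothing.
        void multi(List<String[]> ops) {
            requests++;
            Map<String, String> staged = new TreeMap<>(znodes);
            for (String[] op : ops) {
                applyCreate(staged, op[0], op[1]); // a failure here commits nothing
            }
            znodes.putAll(staged);
        }

        private static void applyCreate(Map<String, String> target, String path, String data) {
            if (target.containsKey(path)) {
                throw new IllegalStateException("node exists: " + path);
            }
            target.put(path, data);
        }
    }

    // Returns false on failure; creates made before the failure stay in place.
    static boolean writeWithLoop(FakeStore store, List<String[]> ops) {
        try {
            for (String[] op : ops) {
                store.create(op[0], op[1]);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Returns false on failure; the store is left unchanged in that case.
    static boolean writeWithMulti(FakeStore store, List<String[]> ops) {
        try {
            store.multi(ops);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Three illustrative ops; the third repeats a path so that it fails.
    static List<String[]> sampleOps() {
        List<String[]> ops = new ArrayList<>();
        ops.add(new String[] {"/region-a", "data"});
        ops.add(new String[] {"/region-b", "data"});
        ops.add(new String[] {"/region-a", "data"});
        return ops;
    }

    public static void main(String[] args) {
        FakeStore loopStore = new FakeStore();
        FakeStore multiStore = new FakeStore();
        writeWithLoop(loopStore, sampleOps());
        writeWithMulti(multiStore, sampleOps());
        System.out.println("loop:  requests=" + loopStore.requests + " znodes=" + loopStore.znodes.keySet());
        System.out.println("multi: requests=" + multiStore.requests + " znodes=" + multiStore.znodes.keySet());
    }
}
```

With the three ops in `sampleOps()`, the loop issues three requests and leaves two znodes behind when the third create fails, while the multi issues one request and leaves none. The caller of the loop version has to clean up the partial state itself, which is the kind of user code a transaction removes. The model says nothing about real timings.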