HBase >> mail # user >> How practical is it to add a timestamp oracle on Zookeeper


yun peng 2013-04-16, 12:14
Jean-Marc Spaggiari 2013-04-16, 12:19
yun peng 2013-04-16, 12:40
Jean-Marc Spaggiari 2013-04-16, 13:31
PG 2013-04-21, 15:10
kishore g 2013-04-21, 16:22
Jimmy Xiang 2013-04-21, 16:33
Bijieshan 2013-04-16, 12:23
Ted Yu 2013-04-16, 12:37
Michel Segel 2013-04-21, 16:36
Re: How practical is it to add a timestamp oracle on Zookeeper
Hi,

I presume you have read the Percolator paper. The design there uses a
single TS oracle, with Bigtable itself as the transaction manager. Omid
also has a TS oracle, but I do not know how scalable it is. Using ZK as
the TS oracle would not work, though: ZK can scale up to 40-50K requests
per second, while an HBase cluster, depending on its size, should be
handling much more than that, especially since every client doing reads
and writes has to obtain a TS. What you want instead is a TS oracle that
can scale to millions of requests per second. This can be achieved with
the technique in the Percolator paper: preallocate a range of timestamps
by persisting its upper bound to disk, then serve requests from memory
over an extremely lightweight RPC. I do not know whether Omid provides
this. There is also a Twitter project,
https://github.com/twitter/snowflake, that you might want to look at.
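
To make the range-preallocation idea concrete, here is a minimal Python
sketch. It is not Percolator's or Omid's actual code: the class name
TimestampOracle, the one-number state file, and the batch_size parameter
are all assumptions made up for illustration, and the RPC layer is left
out entirely. It only shows that one durable write can cover a whole
batch of timestamps, and that a restart skips ahead instead of reissuing
anything.

```python
import os
import tempfile
import threading


class TimestampOracle:
    """Hypothetical oracle: issues strictly increasing integers.

    Only the upper bound of the currently reserved range is persisted,
    so most requests are served from memory without touching disk.
    """

    def __init__(self, path, batch_size=1_000_000):
        self._path = path
        self._batch_size = batch_size
        self._lock = threading.Lock()
        # On (re)start, resume from the last persisted upper bound. Any
        # timestamps left unused below it are skipped, never reused.
        self._next = self._read_limit()
        self._limit = self._next
        self.disk_writes = 0  # counter used only by the demo below

    def _read_limit(self):
        try:
            with open(self._path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def _persist_limit(self, limit):
        # Write to a temp file, fsync, then atomically replace the old file.
        tmp = self._path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(limit))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path)
        self.disk_writes += 1

    def next_timestamp(self):
        with self._lock:
            if self._next >= self._limit:
                # Range exhausted: make the new bound durable *before*
                # handing out any timestamp from the new range.
                new_limit = self._limit + self._batch_size
                self._persist_limit(new_limit)
                self._limit = new_limit
            ts = self._next
            self._next += 1
            return ts


if __name__ == "__main__":
    state_file = os.path.join(tempfile.mkdtemp(), "ts_oracle_limit")
    oracle = TimestampOracle(state_file, batch_size=1000)
    issued = [oracle.next_timestamp() for _ in range(2500)]
    print(issued[0], issued[-1], oracle.disk_writes)  # 0 2499 3

    # Simulate a crash and restart: the new instance starts at the
    # persisted bound (3000), so 2500-2999 are simply never issued.
    restarted = TimestampOracle(state_file, batch_size=1000)
    print(restarted.next_timestamp())  # 3000
```

With a batch of 1000, the 2500 requests above cost three disk writes; a
larger batch trades fewer writes for a bigger gap in the sequence after
a crash.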

Hope this helps.

Enis
On Sun, Apr 21, 2013 at 9:36 AM, Michel Segel <[EMAIL PROTECTED]> wrote:

> Time is relative.
> What does the timestamp mean?
>
> Sounds like a simple question, but it's not. Is it the time your
> application says it wrote to HBase? Is it the time HBase first gets the
> row? Or is it the time that the row was written to the memstore?
>
> Each RS has its own clock, and so does your app server.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 16, 2013, at 7:14 AM, yun peng <[EMAIL PROTECTED]> wrote:
>
> > Hi, All,
> > I'd like to add a global timestamp oracle on ZooKeeper to assign a
> > globally unique timestamp to each Put/Get issued from an HBase cluster.
> > The reason I put it on ZooKeeper is that each Put/Get needs to go
> > through it, and a unique timestamp needs some global centralised
> > facility to produce it. But I am asking how practical this scheme is:
> > has anyone used it in practice?
> >
> > Also, how difficult is it to extend ZooKeeper, or to inject code into
> > the code path of HBase inside ZooKeeper? I know HBase has Coprocessors
> > on the region server to let programmers extend it without recompiling
> > HBase itself. Does ZK allow such extensibility? Thanks.
> >
> > Regards
> > Yun
>