Zookeeper, mail # user - Zookeeper/Hbase storage type on EC2


RE: Zookeeper/Hbase storage type on EC2
Laxman 2011-07-22, 11:17
Hi Pat,

Actually, HBase uses both ephemeral and persistent znodes. Ephemeral znodes
are used for coordination. Persistent znodes are used for storing
metadata. So a restart of the ZK cluster does no harm either.
>> Do this two directories need to be on a persistent storage
>> which survives a node crash? Or does an ephemeral storage device suffice
>> since a failed node which is restarted is being synchronized with the
>> other two nodes anyway?

Yves, what exactly do you mean by ephemeral storage and persistent storage
here?

ZK supports these two types of znodes, and each type is used for a
different purpose, as explained above. It's up to the application to
decide which to use. Both types of znodes are persisted to the
local disk.
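
For reference, the on-disk locations are the two directories set in
zoo.cfg: dataDir holds the snapshots and dataLogDir holds the transaction
logs. A minimal sketch of that part of the config, assuming both sit on a
volume that survives an instance stop (the /mnt/persistent mount point is
hypothetical, not something from this thread):

```properties
# zoo.cfg fragment (sketch); paths are assumed, adjust to your own mounts
# Snapshots of the in-memory znode tree are written here
dataDir=/mnt/persistent/zookeeper/data
# Transaction logs are written here; a separate device is often used for this
dataLogDir=/mnt/persistent/zookeeper/datalog
```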

Hope this clarifies some of your questions.

--
Thanks,
Laxman
-----Original Message-----
From: Patrick Hunt [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 22, 2011 4:27 AM
To: [EMAIL PROTECTED]
Subject: Re: Zookeeper/Hbase storage type on EC2

On Thu, Jul 21, 2011 at 1:21 AM, Yves Langisch <[EMAIL PROTECTED]> wrote:
> I just need a statement if it makes sense to use ephemeral storage for ZK
> at all (in conjunction with Hbase if the answer depends on the use case)?
>
> Any help is appreciated.
>
>
> On 19.07.2011 19:37, Yves Langisch wrote:

>> I plan to setup a HBase installation on EC2. As recommended I therefore
>> want to setup a zookeeper ensemble with 3 nodes but I'm not sure what
>> kind of storage I've to choose for the two zk directories (dataDir and
>> dataLogDir). Do this two directories need to be on a persistent storage
>> which survives a node crash? Or does an ephemeral storage device suffice
>> since a failed node which is restarted is being synchronized with the
>> other two nodes anyway? And what happens when I restart the whole zk
>> ensemble with ephemeral storage which means there is no zk data available
>> anymore after booting up? Any impact on the Hbase cluster?

I don't think you want to use ephemeral storage given that HBase would
lose information if the zk cluster was restarted. But really that's a
better question for the hbase team, I don't know exactly how they are
using ZK and the effects of such a loss on their application.

Regards,

Patrick