HBase >> mail # user >> how many severs in a hbase cluster


Re: how many severs in a hbase cluster
You HAVE TO run ZK3. Otherwise there is no point in having ZK2: with
only two ZK servers, any single ZK failure will be an issue. You need
to have an odd number of ZK servers...
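
As a rough sketch of the arithmetic behind this (plain Python, not a ZooKeeper API; the function name zk_quorum is made up for illustration): ZooKeeper needs a strict majority of the ensemble to be up, so an even-sized ensemble tolerates no more failures than the odd-sized one just below it.

```python
def zk_quorum(ensemble_size):
    """Return (majority_needed, failures_tolerated) for a ZooKeeper ensemble.

    Hypothetical helper for illustration only; it just does the
    strict-majority arithmetic.
    """
    if ensemble_size < 1:
        raise ValueError("ensemble needs at least one server")
    majority = ensemble_size // 2 + 1           # strict majority of the ensemble
    return majority, ensemble_size - majority   # servers that may fail


if __name__ == "__main__":
    for n in range(1, 6):
        majority, tolerated = zk_quorum(n)
        print(f"{n} ZK server(s): majority={majority}, "
              f"tolerated failures={tolerated}")
```

Under this arithmetic, ZK1+ZK2 alone tolerate zero failures, the same as a single ZK server, while adding ZK3 lets the ensemble survive one failure.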

Also, if you don't run MR jobs, you don't need the TT and JT... Otherwise,
everything below is correct. But there are many other options; it all
depends on your needs and the hardware you have ;)

JM

2013/6/22 Mohammad Tariq <[EMAIL PROTECTED]>:
> With 8 machines you can do something like this :
>
> Machine 1 - NN+JT
> Machine 2 - SNN+ZK1
> Machine 3 - HM+ZK2
> Machine 4-8 - DN+TT+RS
> (You can run ZK3 on a slave node with some additional memory).
>
> DN and RS run on the same machine. Although RSs are said to hold the data,
> the data is actually stored in DNs. Replication is managed at HDFS level.
> You don't have to worry about that.
>
> You can visit this link <http://hbase.apache.org/book/perf.writing.html> to
> see how to write efficiently into HBase. With a small field there should
> not be any problem except storage and increased metadata, as you'll have
> many small cells. If possible club several small fields into one and put
> them together in one cell.
>
> HTH
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Sat, Jun 22, 2013 at 8:31 PM, myhbase <[EMAIL PROTECTED]> wrote:
>
>> Thanks for your response.
>>
>> Now if 5 servers are enough, how can I install and configure my nodes? If
>> I need 3 replicas in case of data loss, I should have at least 3 datanodes.
>> We still have the namenode, region server, HMaster and ZooKeeper nodes, so
>> some of them must be installed on the same machine. The datanode seems to
>> be disk-IO sensitive while the region server is memory sensitive; can I
>> install them on the same machine? Any suggestions on the deployment plan?
>>
>> My business requirement is that writes far outnumber reads (7:3). Another
>> concern is that I have a field that will be 8~15KB in size, and I am not
>> sure whether this will cause any problem in HBase when it runs compactions
>> and region splits.
>>
>>  Oh, you already have heavyweight's input :).
>>>
>>> Thanks JM.
>>>
>>> Warm Regards,
>>> Tariq
>>> cloudfront.blogspot.com
>>>
>>>
>>> On Sat, Jun 22, 2013 at 8:05 PM, Mohammad Tariq <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>  Hello there,
>>>>
>>>> IMHO, 5-8 servers are sufficient to start with. But it's all relative
>>>> to the data you have and the intensity of your reads/writes. You should
>>>> have different strategies though, based on whether the load is mostly
>>>> 'read' or 'write'. You actually can't define 'big' in absolute terms. My
>>>> cluster might be big for me, but for someone else it might still not be
>>>> big enough, or for someone it might be very big. Long story short, it
>>>> depends on your needs. If you are able to achieve your goal with 5-8
>>>> RSs, then having more machines will be a waste, I think.
>>>>
>>>> But you should always keep in mind that HBase is kinda greedy when it
>>>> comes to memory. For a decent load 4G is sufficient, IMHO. But it again
>>>> depends on the operations you are gonna perform. If you have large
>>>> clusters where you are planning to run MR jobs frequently, you are
>>>> better off with an additional 2G.
>>>>
>>>>
>>>> Warm Regards,
>>>> Tariq
>>>> cloudfront.blogspot.com
>>>>
>>>>
>>>> On Sat, Jun 22, 2013 at 7:51 PM, myhbase <[EMAIL PROTECTED]> wrote:
>>>>
>>>>  Hello All,
>>>>>
>>>>> I have learned HBase mostly from papers and books. According to my
>>>>> understanding, HBase is the kind of architecture that is more applicable
>>>>> to a big cluster. We should have many HDFS nodes and many HBase (region
>>>>> server) nodes. If we only have several servers (5-8), it seems HBase is
>>>>> not a good choice; please correct me if I am wrong. In addition, at how
>>>>> many nodes can we usually start to consider HBase, and what about the
>>>>> physical memory size and other hardware resources in each node? Any
>>>>> reference documents or case studies? Thanks.
>>>>>
>>>>> --Ning
>>>>>
>>>>>
>>>>>
>>
>>