myhbase 2013-06-22, 14:21
Jean-Marc Spaggiari 2013-06-22, 14:29
Mohammad Tariq 2013-06-22, 14:35
Mohammad Tariq 2013-06-22, 14:37
myhbase 2013-06-22, 15:01
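With 8 machines you can do something like this:
Machine 1 - NN+JT (NameNode + JobTracker)
Machine 2 - SNN+ZK1 (Secondary NameNode + ZooKeeper)
Machine 3 - HM+ZK2 (HBase Master + ZooKeeper)
Machines 4-8 - DN+TT+RS (DataNode + TaskTracker + RegionServer)
(You can run ZK3 on one of the slave nodes, given some additional memory.)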
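To make the layout easier to sanity-check, here is a rough Python sketch that writes it down as data and tests a few rules of thumb: at least as many DataNodes as the HDFS replication factor, an odd number of ZooKeeper nodes, and a RegionServer on every DataNode host. The machine names, the choice of machine4 for ZK3, and the helper functions are all made up for illustration; nothing here talks to a real cluster.

```python
# Roles per machine, following the 8-machine layout above.
# Machine names are hypothetical; putting ZK3 on machine4 is an assumption
# (the layout only says "a slave node").
LAYOUT = {
    "machine1": ["NN", "JT"],
    "machine2": ["SNN", "ZK"],
    "machine3": ["HM", "ZK"],
    "machine4": ["DN", "TT", "RS", "ZK"],
    "machine5": ["DN", "TT", "RS"],
    "machine6": ["DN", "TT", "RS"],
    "machine7": ["DN", "TT", "RS"],
    "machine8": ["DN", "TT", "RS"],
}


def count_role(layout, role):
    """Return how many machines run the given role."""
    return sum(1 for roles in layout.values() if role in roles)


def check_layout(layout, replication=3):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    datanodes = count_role(layout, "DN")
    zookeepers = count_role(layout, "ZK")
    if datanodes < replication:
        problems.append(
            f"{datanodes} DataNodes cannot hold {replication} replicas"
        )
    if zookeepers % 2 == 0:
        problems.append(f"{zookeepers} ZooKeeper nodes; use an odd number")
    for name, roles in layout.items():
        # DN and RS are meant to be colocated so RegionServers read local data.
        if ("DN" in roles) != ("RS" in roles):
            problems.append(f"{name}: DN and RS are not colocated")
    return problems


if __name__ == "__main__":
    for problem in check_layout(LAYOUT) or ["layout looks consistent"]:
        print(problem)
```

The same check can be rerun against a 5-machine variant by editing LAYOUT, which is a quick way to see which roles have to share a box.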
DN and RS run on the same machine. Although RSs are said to hold the data,
the data is actually stored in DNs. Replication is managed at HDFS level.
You don't have to worry about that.
You can visit this link <http://hbase.apache.org/book/perf.writing.html> to
see how to write efficiently into HBase. Small fields should not cause any
problem beyond the extra storage and metadata overhead that comes with
having many small cells. If possible, combine several small fields and
store them together in one cell.
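As a rough illustration of combining small fields, the sketch below packs a few of them into a single byte string that could be written as one cell value. JSON is just one possible encoding and is my assumption here, as are the field names; the code deliberately makes no HBase client calls, so the actual put is left out.

```python
import json


def pack_fields(fields):
    """Serialize several small fields into one bytes value for a single cell.

    Keys are sorted so the same fields always produce the same bytes.
    """
    text = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def unpack_fields(value):
    """Decode a cell value produced by pack_fields back into a dict."""
    return json.loads(value.decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical small fields that would otherwise be three separate cells.
    row_fields = {"name": "alice", "age": 30, "city": "Pune"}
    cell_value = pack_fields(row_fields)
    print(len(cell_value), "bytes in one cell")
    print(unpack_fields(cell_value))
```

The trade-off is that a packed cell has to be read and rewritten as a whole, so this suits fields that are normally fetched and updated together.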
On Sat, Jun 22, 2013 at 8:31 PM, myhbase <[EMAIL PROTECTED]> wrote:
> Thanks for your response.
> Now if 5 servers are enough, how can I install and configure my nodes? If
> I need 3 replicas in case of data loss, I should have at least 3 datanodes.
> We still have the namenode, regionserver, HMaster and zookeeper nodes, so
> some of them must be installed on the same machine. The datanode seems to be
> the disk IO sensitive node while the region server is memory sensitive; can
> I install them on the same machine? Any suggestions on the deployment plan?
> My business requirement is that writes far outnumber reads (7:3). I have
> another concern: one field will be 8~15KB in size, and I am not sure
> whether that will cause any problem in hbase when it runs compactions and
> region splits.
> Oh, you already have heavyweight's input :).
>> Thanks JM.
>> Warm Regards,
>> On Sat, Jun 22, 2013 at 8:05 PM, Mohammad Tariq <[EMAIL PROTECTED]>
>> Hello there,
>>> IMHO, 5-8 servers are sufficient to start with. But it's
>>> all relative to the data you have and the intensity of your reads/writes.
>>> You should have different strategies though, based on whether the workload
>>> is read-heavy or write-heavy. You actually can't define 'big' in absolute
>>> terms. My cluster might be big for me, but for someone else it might still
>>> not be big, or for someone it might be very big. Long story short, it
>>> depends on your needs. If you are able to achieve your goal with 5-8 RSs,
>>> then having more machines will be a waste, I think.
>>> But you should always keep in mind that HBase is kinda greedy when it
>>> comes to memory. For a decent load 4G is sufficient, IMHO. But it again
>>> depends on the operations you are gonna perform. If you have large clusters
>>> where you are planning to run MR jobs frequently, you are better off with
>>> an additional 2G.
>>> Warm Regards,
>>> On Sat, Jun 22, 2013 at 7:51 PM, myhbase <[EMAIL PROTECTED]> wrote:
>>> Hello All,
>>>> I have learned hbase mostly from papers and books. According to my
>>>> understanding, HBase is the kind of architecture which is more applicable
>>>> to a big cluster. We should have many HDFS nodes, and many HBase (region
>>>> server) nodes. If we only have several servers (5-8), it seems hbase is
>>>> not a good choice; please correct me if I am wrong. In addition, at how
>>>> many nodes can we usually start to consider the hbase solution, and what
>>>> about the physical memory size and other hardware resources in each node?
>>>> Any reference documents or cases? Thanks.
Jean-Marc Spaggiari 2013-06-22, 16:09
Kevin Odell 2013-06-22, 16:15
Mohammad Tariq 2013-06-22, 17:05
iain wright 2013-06-22, 22:37
Mohammad Tariq 2013-06-22, 23:21
Kevin Odell 2013-06-23, 13:39