Re: HBase not scaling well
Hi Hari,

Could you do some real-time monitoring (htop, iptraf, iostat) and report the results? You could also add some timers to the map-reduce operations and measure average operation times to figure out what's taking so long.
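
For the timers, something along these lines might be a starting point. It is only a sketch: the class name OpTimer and the labels are made up for illustration, and it uses nothing but the JDK, so it would still need to be wired into your own map task by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Accumulates call counts and total elapsed time per named operation. */
final class OpTimer {
    // label -> {call count, total nanoseconds}
    private final Map<String, long[]> stats = new LinkedHashMap<>();

    /** Record one operation that began at startNanos (a System.nanoTime() reading). */
    void record(String label, long startNanos) {
        add(label, System.nanoTime() - startNanos);
    }

    /** Add an already-measured duration in nanoseconds. */
    void add(String label, long elapsedNanos) {
        long[] s = stats.computeIfAbsent(label, k -> new long[2]);
        s[0] += 1;
        s[1] += elapsedNanos;
    }

    /** Number of recorded calls for a label (0 if never seen). */
    long count(String label) {
        long[] s = stats.get(label);
        return s == null ? 0 : s[0];
    }

    /** Average time per call in milliseconds, or 0.0 if nothing was recorded. */
    double averageMillis(String label) {
        long[] s = stats.get(label);
        return (s == null || s[0] == 0) ? 0.0 : s[1] / 1e6 / s[0];
    }

    /** One line per label, suitable for printing when the task finishes. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, long[]> e : stats.entrySet()) {
            sb.append(String.format("%s: calls=%d avg=%.3f ms%n",
                    e.getKey(), e.getValue()[0], averageMillis(e.getKey())));
        }
        return sb.toString();
    }
}
```

The idea would be to take a System.nanoTime() reading before each step of the map function (parsing the input line, writing to HBase), call record() with a label afterwards, and print report() to stderr when the task finishes so the averages end up in the task logs. Comparing those averages between the single-node run and the cluster run should show which step got slower.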

On Oct 29, 2010, at 9:55 AM, Hari Shankar wrote:

> Hi,
>     We are currently doing a POC for HBase in our system. We have
> written a bulk upload job to upload our data from a text file into
> HBase. We are using a 3-node cluster: one master that also works as a
> slave (running namenode, jobtracker, HMaster, datanode,
> tasktracker, HQuorumpeer and HRegionServer) and 2 slaves (running
> datanode, tasktracker, HQuorumpeer and HRegionServer). The problem is
> that we are getting lower performance from the distributed cluster than
> we were getting from a single-node pseudo-distributed setup. The
> upload takes about 30 minutes on an individual machine, whereas
> it takes 2 hrs on the cluster. We have replication set to 3, so
> all parts should ideally be available on all nodes, so we doubt if the
> problem is network latency. scp of files between nodes gives a speed
> of about 12 MB/s, which I believe should be good enough for this to
> function. Please correct me if I am wrong here. The nodes are all 4
> core machines with 8 GB RAM.  We are spawning 4 simultaneous map tasks
> on each node, and the job does not have any reduce phase. Any help is
> greatly appreciated.
> Thanks,
> Hari Shankar