HBase >> mail # user >> Compaction problem


Re: Compaction problem
From my testing: I had 3 region servers and 1 master, all connected to a single router at 1000 Mbps.
The measured network speed was 107 - 120 MB/sec, while the theoretical max is 125 MB/sec (1000/8).
HDFS write speed (copyFromLocal) was 73 MB/sec.
HDFS read speed (copyToLocal) was 97 MB/sec.
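That theoretical max is just the link rate divided by 8 bits per byte; a quick sanity check:

```shell
# 1000 Mbit/sec link divided by 8 bits/byte gives the MB/sec ceiling
awk 'BEGIN { printf "%.0f MB/sec\n", 1000 / 8 }'
```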

Now, if you're writing to HBase constantly, you effectively have several HDFS clients running at the same time:
1. WAL writing
2. Flush
3. Compaction

So, if you can write to HDFS at 73 MB/sec, each of those three writers gets roughly a third, ~ 25 MB/sec. I didn't quite get that - I guess the HBase process itself adds some overhead - so I got around 18 MB/sec.
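That ~25 MB/sec figure is just the measured HDFS write bandwidth split across the three concurrent writers (a rough back-of-the-envelope; in practice the three streams aren't equally heavy):

```shell
# Rough per-writer share: total HDFS write bandwidth split three ways
awk 'BEGIN { printf "%.1f MB/sec per writer\n", 73 / 3 }'
```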

All this testing was done on vanilla HBase. Each machine had 4 cores at 3.1 GHz and 8 GB of RAM.
I used ethtool to verify that each machine was connected to the router at 1000 Mbit/sec.
I used netperf to measure the network speed between each pair of data nodes (you may find that one of the 3 links is faulty).

You can also use iozone to check whether the hard drives are the ones causing the bottleneck.
(I used the following command: iozone -R -r 16384k -s 64m -l 1 -u 1 -F ~/f1 -I)
For instance, my hard drives gave a mixed read-write rate of 90 - 120 MB/sec.
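To turn any of these timed copies into a MB/sec figure (a sketch - the elapsed time below is a hypothetical example, not a measurement from this thread):

```shell
# Convert a timed file copy into throughput.
# size_mb: file size in MB; elapsed: wall-clock seconds reported by `time`
size_mb=16384    # e.g. a 16 GB dd test file
elapsed=224      # hypothetical wall time in seconds
awk -v s="$size_mb" -v t="$elapsed" 'BEGIN { printf "%.1f MB/sec\n", s / t }'
```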

Thanks,
Asaf

On Mar 27, 2013, at 5:05 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:

> Hi Asaf,
>
> What kind of results should we expect from the test you are suggesting?
>
> I mean, how many MB/sec should we see on a healthy cluster?
>
> Thanks,
>
> JM
>
> 2013/3/26 Asaf Mesika <[EMAIL PROTECTED]>:
>> The 1st thing I would do to find the bottleneck is to benchmark HDFS performance on its own.
>> Create a 16GB file (using dd), which is 2x your memory, and run "time hadoop fs -copyFromLocal yourFile.txt /tmp/a.txt".
>> Tell us the speed of this file copy in MB/sec.
>>
>>
>> On Mar 22, 2013, at 4:44 PM, tarang dawer <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>> As per my use case, I have to write around 100 GB of data at an ingestion
>>> rate of around 200 Mbps. While writing, I am taking a performance hit from
>>> compaction, which adds to the delay.
>>> I am using an 8-core machine with 16 GB of RAM and a 2 TB 7200 RPM HDD.
>>> I got some ideas from the archives, tried pre-splitting the regions, and
>>> configured HBase with the following parameters (configured in haste,
>>> so please tell me if anything's out of order):
>>>
>>>
>>>       <property>
>>>               <name>hbase.hregion.memstore.block.multiplier</name>
>>>               <value>4</value>
>>>       </property>
>>>       <property>
>>>                <name>hbase.hregion.memstore.flush.size</name>
>>>                <value>1073741824</value>
>>>       </property>
>>>
>>>       <property>
>>>               <name>hbase.hregion.max.filesize</name>
>>>               <value>1073741824</value>
>>>       </property>
>>>       <property>
>>>               <name>hbase.hstore.compactionThreshold</name>
>>>               <value>5</value>
>>>       </property>
>>>       <property>
>>>             <name>hbase.hregion.majorcompaction</name>
>>>                 <value>0</value>
>>>       </property>
>>>       <property>
>>>               <name>hbase.hstore.blockingWaitTime</name>
>>>               <value>30000</value>
>>>       </property>
>>>        <property>
>>>                <name>hbase.hstore.blockingStoreFiles</name>
>>>                <value>200</value>
>>>        </property>
>>>
>>> <property>
>>>       <name>hbase.regionserver.lease.period</name>
>>>       <value>3000000</value>
>>> </property>
>>>
>>>
>>> but I'm still not able to achieve the target rate - I'm getting around 110 Mbps.
>>> Could you please suggest some optimizations?
>>>
>>> Thanks
>>> Tarang Dawer
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Mar 22, 2013 at 6:05 PM, Jean-Marc Spaggiari <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Tarang,
>>>>
>>>> I would recommend you take a look at the list archives first to see
>>>> all the discussions related to compaction. You will find many