HBase >> mail # user >> LZO vs GZIP vs NO COMPREESSION: why is GZIP the winner ???


Re: LZO vs GZIP vs NO COMPREESSION: why is GZIP the winner ???
Yes of course.

We use a 4-machine cluster (4 large instances on AWS), each with 8 GB
RAM and a dual-core CPU. One hosts the Hadoop and HBase namenode /
masters, and three host the datanodes / regionservers.

The table used for testing is created first; then I insert a set of
rows sequentially and count the number of rows inserted per second.

I insert rows in batches of 1000 (using HTable.put(List<Put>)).

When reading, I also read sequentially, using a scanner (scanner
caching is set to 1024 rows).
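
A minimal sketch of this kind of insert test, for readers who want to reproduce the measurement. It is not the code used for the numbers in this thread: the class BatchThroughput, its methods, and the sink parameter are hypothetical names, and the sink stands in for a real call such as HTable.put(List<Put>). Rows are synthetic payloads of 200 to 400 bytes, sent in batches of 1000, and throughput is reported in rows per second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Sketch of a batched-insert throughput test (hypothetical names throughout). */
final class BatchThroughput {

    /**
     * Sends totalRows synthetic rows to the sink in batches of batchSize
     * and returns the number of rows sent. The sink is an assumption: in a
     * real test it would wrap something like HTable.put(List<Put>).
     */
    static long insertInBatches(int totalRows, int batchSize,
                                Consumer<List<byte[]>> sink) {
        long sent = 0;
        List<byte[]> batch = new ArrayList<>(batchSize);
        for (int i = 0; i < totalRows; i++) {
            batch.add(makeRow(i));
            if (batch.size() == batchSize) {
                sink.accept(batch);
                sent += batch.size();
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the last, possibly partial, batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
            sent += batch.size();
        }
        return sent;
    }

    /** Builds a synthetic row payload of 200 to 400 bytes. */
    static byte[] makeRow(int i) {
        byte[] row = new byte[200 + (i % 201)];
        Arrays.fill(row, (byte) ('a' + i % 26));
        return row;
    }

    /** Converts a row count and an elapsed time in nanoseconds to rows per second. */
    static double rowsPerSecond(long rows, long elapsedNanos) {
        return rows / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // No-op sink: replace with a real client call to measure a cluster.
        long rows = insertInBatches(100_000, 1000, batch -> { });
        long elapsed = System.nanoTime() - start;
        System.out.printf("%d rows, %.0f rows/s%n", rows, rowsPerSecond(rows, elapsed));
    }
}
```

With the no-op sink this only measures the loop itself; the figures become meaningful once the sink performs the actual write against the table under test.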

Maybe our LZO installation is faulty?
On 23/02/10 22:15, Jean-Daniel Cryans wrote:
> Vincent,
>
> I don't expect that either, can you give us more info about your test
> environment?
>
> Thx,
>
> J-D
>
> On Tue, Feb 23, 2010 at 10:39 AM, Vincent Barat
> <[EMAIL PROTECTED]>  wrote:
>> Hello,
>>
>> I did some testing to figure out which compression algorithm I should use
>> for my HBase tables. I thought that LZO was the right candidate, but it
>> appears to be the worst one.
>>
>> I use one table with 2 families and 10 columns. Each row has a total of 200
>> to 400 bytes.
>>
>> Here are my results:
>>
>> GZIP:           2600 to 3200 inserts/s  12000 to 15000 reads/s
>> NO COMPRESSION: 2000 to 2600 inserts/s  4900 to 5020 reads/s
>> LZO:            1600 to 2100 inserts/s  4020 to 4600 reads/s
>>
>> Do you have an explanation for this? I thought that LZO was always
>> faster than GZIP at both compression and decompression.
>>
>>
>>
>