

David Koch 2012-11-03, 14:57
David Koch 2012-11-07, 13:09
Re: Scan performance on compressed column families
Dave,

  I would recommend trying it in your environment.  As with most benchmarks,
you can find other blog posts that argue compression improves performance:

http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of

  It will depend on your environment, file size, how well the data compresses,
how taxed your disks are, how wide your cells are, etc.
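
  To make the trade-off concrete, here is a rough back-of-the-envelope model in
Python. It is only a sketch: the function name, the throughput figures, and the
compression ratio are all made-up assumptions, and it pretends that disk reads
and decompression never overlap. It is no substitute for measuring real scans
on your own cluster.

```python
def scan_seconds(raw_bytes, disk_mb_s, ratio=1.0, decompress_mb_s=None):
    """Estimate the time to scan `raw_bytes` of logical data (hypothetical model).

    ratio           -- on-disk size / raw size (1.0 means uncompressed)
    decompress_mb_s -- decompressor output rate in MB/s (None means no compression)
    """
    raw_mb = raw_bytes / 1e6
    read_time = raw_mb * ratio / disk_mb_s          # fewer bytes to read when compressed
    cpu_time = 0.0 if decompress_mb_s is None else raw_mb / decompress_mb_s
    return read_time + cpu_time                     # assumes no I/O and CPU overlap


ten_gb = 10e9

# Busy spinning disks (assumed 100 MB/s): compression wins in this model.
slow_plain = scan_seconds(ten_gb, disk_mb_s=100)
slow_comp = scan_seconds(ten_gb, disk_mb_s=100, ratio=0.4, decompress_mb_s=500)

# Data mostly served from cache (assumed 1000 MB/s): compression loses.
fast_plain = scan_seconds(ten_gb, disk_mb_s=1000)
fast_comp = scan_seconds(ten_gb, disk_mb_s=1000, ratio=0.4, decompress_mb_s=500)

print(f"slow disk: plain {slow_plain:.0f}s, compressed {slow_comp:.0f}s")
print(f"fast read: plain {fast_plain:.0f}s, compressed {fast_comp:.0f}s")
```

  With these invented numbers the compressed scan is faster on the slow disk
and slower when reads are cheap, which is why both kinds of blog post exist.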

On Wed, Nov 7, 2012 at 8:09 AM, David Koch <[EMAIL PROTECTED]> wrote:

> *BUMP*
>
> Sorry,
>
> /David
>
> On Sat, Nov 3, 2012 at 3:57 PM, David Koch <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > Are scans faster when compression is activated? The HBase book by Lars
> > George seems to suggest so (p424, Section on "Compression" in chapter
> > "Performance Tuning").
> >
> > "... compression usually will yield overall better performance, because
> > the overhead of the CPU performing the compression and decompression is
> > less than what is required to read more data from disk."
> >
> > I searched around for a bit and found this:
> > http://gbif.blogspot.fr/2012/02/performance-evaluation-of-hbase.html.
> > The author conducted a series of scan performance tests on tables of up to
> > 200 million rows and found that compression actually slowed down read
> > performance slightly - albeit at lower CPU load.
> >
> > Thank you,
> >
> > /David
> >
>

--
Kevin O'Dell
Customer Operations Engineer, Cloudera
Oliver Meyn 2012-11-09, 19:46
David Koch 2012-11-11, 18:08