Re: Scan performance on compressed column families
David Koch 2012-11-11, 18:08
Thank you for the clarification. As Kevin also pointed out, I guess we will
just have to test compression in our environment.
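Before running real tests, the trade-off discussed in this thread can be sketched as a rough cost model. This is only a back-of-envelope illustration, not a measurement: the function name `scan_seconds`, the additive model (disk time plus decompression time, no overlap) and every throughput figure below are assumptions chosen for illustration, not numbers from the book or the blog post.

```python
def scan_seconds(raw_bytes, disk_mb_per_s, ratio=1.0, decompress_mb_per_s=None):
    """Estimate the time to scan `raw_bytes` of logical (uncompressed) data.

    ratio               -- compressed size / raw size (1.0 means no compression)
    decompress_mb_per_s -- rate at which the CPU produces uncompressed output;
                           None means the data is stored uncompressed

    Assumption: disk reads and decompression do not overlap, so the two
    costs are simply added. Real systems pipeline and cache, so treat the
    result as a rough guide only.
    """
    raw_mb = raw_bytes / 1e6
    read_time = raw_mb * ratio / disk_mb_per_s
    cpu_time = 0.0 if decompress_mb_per_s is None else raw_mb / decompress_mb_per_s
    return read_time + cpu_time


if __name__ == "__main__":
    one_gb = 1_000_000_000  # hypothetical amount of data scanned

    # Hypothetical hardware figures, for illustration only.
    uncompressed = scan_seconds(one_gb, disk_mb_per_s=100)
    fast_cpu = scan_seconds(one_gb, disk_mb_per_s=100, ratio=0.5,
                            decompress_mb_per_s=400)
    slow_cpu = scan_seconds(one_gb, disk_mb_per_s=100, ratio=0.5,
                            decompress_mb_per_s=150)

    print("uncompressed:", uncompressed)
    print("compressed, fast CPU:", fast_cpu)
    print("compressed, slow CPU:", slow_cpu)
```

Under these made-up numbers compression wins when the CPU decompresses quickly and loses when it does not, which matches both the "usually" in the book and the lower-spec'd hardware result. Only a test on our own data and cluster will tell which case applies.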
On Fri, Nov 9, 2012 at 8:46 PM, Oliver Meyn (GBIF) <[EMAIL PROTECTED]> wrote:
> Hi David,
> I wrote that blog post and I know that Lars George has much more
> experience than me with tuning HBase, especially in different environments,
> so weight our opinions accordingly. As he says, it will "usually" help,
> and the unusual cases of lower spec'd hardware (that I did those tests on)
> are where it might hurt scans, but obviously still helps with disk and
> network use. So take my post with a grain of salt, and as Kevin says, try
> it out on your data and your cluster and see what works best for you.
> On 2012-11-03, at 3:57 PM, David Koch wrote:
> > Hello,
> > Are scans faster when compression is activated? The HBase book by Lars
> > George seems to suggest so (p424, Section on "Compression" in chapter
> > "Performance Tuning").
> > "... compression usually will yield overall better performance, because the
> > overhead of the CPU performing the compression and decompression is less
> > than what is required to read more data from disk."
> > I searched around for a bit and found this:
> > http://gbif.blogspot.fr/2012/02/performance-evaluation-of-hbase.html.
> > The author conducted a series of scan performance tests on tables of up to
> > 200 million rows and found that compression actually slowed down read
> > performance slightly - albeit at lower CPU load.
> > Thank you,
> > /David
> Oliver Meyn
> Software Developer
> Global Biodiversity Information Facility (GBIF)
> +45 35 32 15 12