Re: PerformanceEvaluation results
Lars Francke 2012-02-07, 11:27
Hi Stack, Hi everyone,
>> I do feel the HBase project would benefit from some example metrics
>> for various operations and hardware or else it will remain a difficult
>> technology for some people to get into with confidence. We'll blog
>> our findings, and hopefully it might be of benefit to other
>> leprechauns. If we can prove the concept, we're more likely to be
>> able to get $ to grow.
> Agree (except for the bit where you look like a leprechaun). Would be
> cool if folks published what stats they see doing various operations
> in hbase on a specific hardware. Previous I'd have thought the
> deploys, configs., etc., too various but I suppose you have to start
I too agree.
From my experience there are a lot of small companies that can't
afford or don't need large clusters and don't have the knowledge and
resources to fully optimize a cluster. We're certainly one of those
organizations. It's already a challenge for us to follow the rapid
development in the projects we're using (Hadoop, HBase, Oozie, Hive,
etc.). We're still putting Hadoop and HBase to good use and it's
As all our work is Open Source we're in the very fortunate position of
being able to point to all our configs, workflows and metrics
(Ganglia now up and public) etc. and ask for recommendations based
on that, but a lot of other companies don't enjoy that privilege. We're
more than willing to provide information and even test out different
configurations on our (admittedly small and aging) cluster and we
would hope that this'll prove helpful for others as well.
It is worth noting that we do plan to buy new and better hardware, but
need to understand the technologies and capabilities to make some
informed choices before spending our total yearly hardware budget.
Therefore, understanding the behavior even on lesser quality hardware
is still important for us.
Thanks for all the past and (hopefully) future help and it's great to
finally be able to work with HBase again.
PS: Tim and I work at the same organization
 See also the cluster sizes on <http://wiki.apache.org/hadoop/PoweredBy>