Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # dev >> performance improvment on regionserver.MemStore/updateColumnValue


Copy link to this message
-
Re: performance improvment on regionserver.MemStore/updateColumnValue
I would compare YCSB between the patched and unpatched version for a more
realistic workload than the unit tests provide.

-Joey

On Wed, Jul 20, 2011 at 10:49 AM, N Keywal <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Some words on the context: We're thinking about using HBase for a product
> we're developping. For this reason, I am currently looking at HBase source
> code to understand  how to debug & modify HBase.  To start with something
> simple but useful, I am looking for performance improvement by profiling
> hbase during the execution of the unit tests. I expect that many of the
> hotspots found on the unit tests are also hotspots in real production. I
> plan to spend around 10 m.d on this until september.
>
>
> The method regionserver.MemStore/updateColumnValue seems quite used, and is
> ultimatly responsible of 30% of the time in the test subsets I am using.
>
>
> There is bit of it that can be optimized easily by changing the conditions
> order:
>
>        if (firstKv.matchingQualifier(kv)) {
>          if (kv.getType() == KeyValue.Type.Put.getCode()) {
>            now = Math.max(now, kv.getTimestamp());
>          }
>        }
>
>  becomes:
>                if (kv.getType() == KeyValue.Type.Put.getCode() &&
>                        kv.getTimestamp() > now &&
>                        firstKv.matchingQualifier(kv)) {
>                    now = kv.getTimestamp();
>                }
>
>  As comparing the qualifier is much more expensive, we put it at the end.
>  It improve the performances by 3% (i.e: total execution time lowered by
> 3%).
>
>
>  So first question: would you be interested by a patch for this kind of
> stuff?
>
>
>
>  Second question (more technical...): in this method
> (regionserver.MemStore/updateColumnValue), I see:
>
>            KeyValue firstKv = KeyValue.createFirstOnRow(
>                    row, family, qualifier);
>
>            [...]
>            while (it.hasNext()) {
>                KeyValue kv = it.next();
>
>                // if this isnt the row we are interested in, then bail:
>                if (!firstKv.matchingColumn(family, qualifier) ||
> !firstKv.matchingRow(kv)) {
>                    break; // rows dont match, bail.
>                }
>
>                [...]
>            }
>
>   For the test "firstKv.matchingColumn(family, qualifier)", I don't see:
>   1) Why it is tested in the loop, as firstKv is not modified, the result
> won't change.
>   2) How the result can be 'false', as firstKv is inialized with the family
> and the parameters.
>
>   Or is it shared for update a way or another?
>
>   If we can remove it, we gain another 2%...
>
>
>   N.
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434