HBase, mail # user - Optimizing bulk load performance


Re: Optimizing bulk load performance
Premal Shah 2013-10-26, 13:42
Hi Harry,
I'm currently working on a MapReduce job that also does an incremental bulk
load using HFileOutputFormat, and I see similar performance in the reduce
phase. I believe this is the reason: the KeyValues have to be sorted before
being written to the HFiles. The job is configured with a
TotalOrderPartitioner
(http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.html)
so that each reducer receives the keys for one region, and the reduce phase
then sorts your map output. Depending on how much data there is to sort and
how much memory is allocated, that sort can be a performance bottleneck. Also,
the number of reducers equals the number of regions, and that cannot be
overridden in the job config. I guess this is related to your issue.
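For reference, here is a minimal sketch of how such a bulk-load job is usually
wired up (the class, table, and path names such as MyMapper, "my_table" and
/tmp/hfiles are placeholders, not taken from your setup). The
HFileOutputFormat.configureIncrementalLoad() call is what installs the
TotalOrderPartitioner, plugs in a sorting reducer (KeyValueSortReducer when the
map output value is KeyValue), and sets the number of reduce tasks to the
number of regions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadJob {

  // Placeholder mapper: parse your input records and emit
  // row key (ImmutableBytesWritable) -> KeyValue pairs.
  static class MyMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hfile-bulk-load");
    job.setJarByClass(BulkLoadJob.class);
    job.setMapperClass(MyMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);

    // Sets TotalOrderPartitioner (one partition per region), the sorting
    // reducer, and numReduceTasks = number of regions of the target table.
    HTable table = new HTable(conf, "my_table");
    HFileOutputFormat.configureIncrementalLoad(job, table);

    Path out = new Path("/tmp/hfiles");
    FileOutputFormat.setOutputPath(job, out);
    if (!job.waitForCompletion(true)) {
      System.exit(1);
    }

    // Move the generated HFiles into the table's regions.
    new LoadIncrementalHFiles(conf).doBulkLoad(out, table);
  }
}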

Hope this helps.

On Wed, Oct 23, 2013 at 7:57 AM, Harry Waye <[EMAIL PROTECTED]> wrote:

> I'm trying to load data into HBase using HFileOutputFormat and incremental
> bulk load, but I'm getting rather lackluster performance: 10h for ~0.5TB of
> data, ~50000 blocks.  This is being loaded into a table that has 2
> families, 9 columns, 2500 regions and is ~10TB in size.  Keys are md5
> hashes and regions are pretty evenly spread.  The majority of the time appears
> to be spent in the reduce phase, with the map phase completing very
> quickly.  The network doesn't appear to be saturated, but the load is
> consistently at 6, which is the number of reduce tasks per node.
>
> 12 hosts (6 cores, 2 disks as RAID0, 1GbE, no one else on the rack).
>
> MR conf: 6 mappers, 6 reducers per node.
>
> I spoke to someone on IRC and they recommended reducing job output
> replication to 1 and reducing the number of mappers, which I reduced to 2.
> Reducing replication appeared not to make any difference, and reducing
> reducers appeared just to slow the job down.  I'm going to have a look at
> running the benchmarks mentioned on Michael Noll's blog and see what that
> turns up.  I guess some questions I have are:
>
> How does the global number/size of blocks affect perf.?  (I have a lot of
> 10MB files, which are the input files.)
>
> How does the job local number/size of input blocks affect perf.?
>
> What is actually happening in the reduce phase that requires so much CPU?
>  I assume the actual construction of HFiles isn't intensive.
>
> Ultimately, how can I improve performance?
> Thanks
>

--
Regards,
Premal Shah.