Re: Bulkload into empty table with configureIncrementalLoad()
Jean-Daniel Cryans 2013-09-19, 16:55
You need to create the table with pre-splits, see
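
A minimal sketch of what computing the pre-split keys could look like, assuming the row keys are spread roughly uniformly over their first byte (for example, hashed keys). SplitKeys and uniformSplits are hypothetical names used for illustration only; they are not part of HBase. With a different key distribution, the split points would need to be chosen from the actual data.

```java
// Hypothetical helper (not part of HBase): computes split keys for a
// pre-split table so that a new, empty table starts with several regions.
final class SplitKeys {
    private SplitKeys() {}

    /**
     * Returns numRegions - 1 single-byte split keys that divide the unsigned
     * byte range 0x00..0xFF into numRegions roughly equal slices.
     * Assumption: row keys are uniformly distributed in their first byte.
     */
    static byte[][] uniformSplits(int numRegions) {
        if (numRegions < 1 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in 1..256");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // HBase compares row keys as unsigned bytes, so the cast to a
            // signed Java byte does not change the ordering of the keys.
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }
}

// With HBase on the classpath, the keys would be passed at table creation,
// for example (tableDescriptor is an HTableDescriptor you build yourself):
//   new HBaseAdmin(conf).createTable(tableDescriptor, SplitKeys.uniformSplits(5));
```

With 5 regions this yields 4 split keys, so the table starts with one region per region server, and configureIncrementalLoad() should then set up one reducer per region instead of a single one.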
On Thu, Sep 19, 2013 at 9:52 AM, Dolan Antenucci <[EMAIL PROTECTED]> wrote:
> I have about 1 billion values I am trying to load into a new HBase table
> (with just one column and column family), but am running into some issues.
> Currently I am trying to use MapReduce to import these by first converting
> them to HFiles and then using LoadIncrementalHFiles.doBulkLoad(). I also
> use HFileOutputFormat.configureIncrementalLoad() as part of my MR job. My
> code is essentially the same as this example:
> The problem I'm running into is that only 1 reducer is created
> by configureIncrementalLoad(), and there is not enough space on this node
> to handle all this data. configureIncrementalLoad() should start one
> reducer for every region the table has, so apparently the table only has 1
> region -- maybe because it is empty and brand new (my understanding of how
> regions work is not crystal clear)? The cluster has 5 region servers, so
> I'd at least like that many reducers to handle this loading.
> On a side note, I also tried the command line tool, completebulkload, but
> am running into other issues with this (timeouts, possible heap issues) --
> probably due to only one server being assigned the task of inserting all
> the records (i.e. I look at the region servers' logs, and only one of the
> servers has log entries; the rest are idle).
> Any help is appreciated
> -Dolan Antenucci