Amit Sela 2012-11-01, 17:03
RE: Bulk Loading - LoadIncrementalHFiles
Anoop Sam John 2012-11-02, 03:55
Hi
Yes, the table can be presplit when doing the bulk load. The job will have the same number of reducers as the table has regions, one per region. Each HFile that a reducer generates is capped at the configured maximum HFile size. During the bulk load itself, HFiles are also split if needed (to match any new region boundaries created by splits that happened in the meantime).

Yes, if the table is not presplit, the load will lead to region splits later... The better way would be to presplit, I would say.

-Anoop-
________________________________________
From: Amit Sela [[EMAIL PROTECTED]]
Sent: Thursday, November 01, 2012 10:33 PM
To: [EMAIL PROTECTED]
Subject: Bulk Loading - LoadIncrementalHFiles

Hi everyone,

I'm using MR to bulk load into HBase by using HFileOutputFormat.configureIncrementalLoad, and after the job is complete I use LoadIncrementalHFiles.doBulkLoad.

From what I see, the MR job outputs a file for each CF written, and to my understanding these files are loaded as store files into a region.

What I don't understand is *how many regions will open*, and *how is that determined*?
If I have 3 CFs and a lot of data to load, does that mean 3 large store files will load into 1 region (or more?) and this region will split on major compaction?

Can I pre-create regions and tell the bulk load to split the data between them during the load?

In general, if someone could elaborate on LoadIncrementalHFiles, it would save me a lot of time diving into it.

Another question is about overwriting values: is it possible to load an updated value, or more generally to update columns and values for an existing key?
I'd think that there's no problem, but when I try to run the same bulk load twice (MR and then load) with the same data, the second time fails.
Right after
mapreduce.LoadIncrementalHFiles: Trying to load hfile=........
I get:
ERROR mapreduce.LoadIncrementalHFiles: Unexpected execution exception during splitting...

Thanks!
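
To make the presplit advice in the reply concrete, here is a minimal sketch in plain Java (no HBase dependency) of how split keys divide the row-key space and how a row then maps to exactly one region, which is also why the job ends up with one reducer per region. The class name `PresplitSketch` and its methods are hypothetical, and the split points assume row keys are spread evenly over the first byte; real key distributions usually need split points chosen from the data. In an actual job, keys like these would typically be passed to `HBaseAdmin.createTable(desc, splitKeys)` before calling `HFileOutputFormat.configureIncrementalLoad`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not part of HBase.
final class PresplitSketch {

    // Build (regions - 1) single-byte split keys, evenly spaced over 0..255.
    // Assumption: row keys are uniformly distributed over their first byte.
    static byte[][] splitKeys(int regions) {
        if (regions < 1 || regions > 256) {
            throw new IllegalArgumentException("regions must be in 1..256");
        }
        byte[][] keys = new byte[regions - 1][];
        for (int i = 1; i < regions; i++) {
            keys[i - 1] = new byte[] { (byte) (i * 256 / regions) };
        }
        return keys;
    }

    // Unsigned lexicographic comparison, the ordering HBase uses for row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Index of the region holding `row`: the number of split keys <= row.
    // A split key is the inclusive start key of the region to its right.
    static int regionFor(byte[] row, byte[][] splits) {
        int idx = 0;
        while (idx < splits.length && compare(row, splits[idx]) >= 0) {
            idx++;
        }
        return idx;
    }

    public static void main(String[] args) {
        byte[][] splits = splitKeys(4);
        for (String row : new String[] { "0001", "Alice", "zebra" }) {
            byte[] key = row.getBytes(StandardCharsets.UTF_8);
            System.out.println(row + " -> region " + regionFor(key, splits));
        }
    }
}
```

With a table presplit this way, each reducer writes HFiles that already fall inside a single region's key range, so the load step should not need to split them unless regions have split in the meantime.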