HBase >> mail # dev >> Merge and HMerge


Merge and HMerge
Hi,

Both of these seem to be in a bit of a weird state. HMerge is package-private, so nothing outside its package can call the merge() functions, and the only caller is the unit test. It would be good to have this as a command on the CLI and in the shell (in the shell maybe with a confirmation prompt?), but AFAIK it is not available.

HMerge can merge regions of tables that are disabled. It also merges all regions that qualify, i.e. where the merged region is less than or equal to half the configured max file size.
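
To make the size rule concrete, here is a minimal sketch in Ruby. It models only the criterion as stated above, not HMerge's actual code; the helper names, the greedy left-to-right walk over adjacent regions, and the 256 MB max file size are assumptions for illustration.

```ruby
# Hypothetical helper: true when the merged region would be no larger
# than half the configured max file size (the rule described above).
def qualifies_for_merge?(size_a, size_b, max_file_size)
  (size_a + size_b) <= max_file_size / 2
end

# Assumed strategy: walk adjacent regions left to right and fold a
# neighbour into the previous region while the rule still holds.
def plan_merges(region_sizes, max_file_size)
  merged = []
  region_sizes.each do |size|
    if !merged.empty? && qualifies_for_merge?(merged.last, size, max_file_size)
      merged[-1] += size
    else
      merged << size
    end
  end
  merged
end

mb = 1024 * 1024
max_file_size = 256 * mb                      # assumed value
sizes = [10, 20, 200, 30].map { |s| s * mb }  # made-up region sizes
p plan_merges(sizes, max_file_size).map { |s| s / mb }
```

With these made-up sizes, only the first two regions are merged (10 MB + 20 MB is under the 128 MB threshold), while the 200 MB region blocks any merge with either neighbour.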

Merge, on the other hand, does have a main(), so it can be invoked:

$ hbase org.apache.hadoop.hbase.util.Merge
Usage: bin/hbase merge <table-name> <region-1> <region-2>

Note how the usage text implies that you can invoke it as a tool via bin/hbase merge, but that is not correct. Also, it only merges the two given regions, and the cluster must be shut down (only the HBase daemons). So that is a step back.

What is worse is that I cannot get it to work. I tried in the shell:

hbase(main):001:0> create 'testtable', 'colfam1',  {SPLITS => ['row-10','row-20','row-30','row-40','row-50']}
0 row(s) in 0.2640 seconds

hbase(main):002:0> for i in '0'..'9' do for j in '0'..'9' do put 'testtable', "row-#{i}#{j}", "colfam1:#{j}", "#{j}" end end
0 row(s) in 1.0450 seconds

hbase(main):003:0> flush 'testtable'
0 row(s) in 0.2000 seconds

hbase(main):004:0> scan '.META.', { COLUMNS => ['info:regioninfo']}
ROW                                  COLUMN+CELL
 testtable,,1309614509037.612d1e0112 column=info:regioninfo, timestamp=130...
 406e6c2bb482eeaec57322.             STARTKEY => '', ENDKEY => 'row-10'
 testtable,row-10,1309614509040.2fba column=info:regioninfo, timestamp=130...
 fcc9bc6afac94c465ce5dcabc5d1.       STARTKEY => 'row-10', ENDKEY => 'row-20'
 testtable,row-20,1309614509041.e7c1 column=info:regioninfo, timestamp=130...
 6267eb30e147e5d988c63d40f982.       STARTKEY => 'row-20', ENDKEY => 'row-30'
 testtable,row-30,1309614509041.a9cd column=info:regioninfo, timestamp=130...
 e1cbc7d1a21b1aca2ac7fda30ad8.       STARTKEY => 'row-30', ENDKEY => 'row-40'
 testtable,row-40,1309614509041.d458 column=info:regioninfo, timestamp=130...
 236feae097efcf33477e7acc51d4.       STARTKEY => 'row-40', ENDKEY => 'row-50'
 testtable,row-50,1309614509041.74a5 column=info:regioninfo, timestamp=130...
 7dc7e3e9602d9229b15d4c0357d1.       STARTKEY => 'row-50', ENDKEY => ''
6 row(s) in 0.0440 seconds

hbase(main):005:0> exit
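
As a sanity check on the layout created above, the following sketch computes which region each of the 100 rows should land in. It assumes only that a region holds keys from its start key (inclusive) up to its end key (exclusive), compared lexicographically; region_start_for is a hypothetical helper, not an HBase API.

```ruby
SPLITS = ['row-10', 'row-20', 'row-30', 'row-40', 'row-50']

# Return the start key of the region that would hold `row`:
# the last split point <= row, or '' for the first region.
def region_start_for(row, splits)
  splits.select { |s| s <= row }.last || ''
end

# Same keys as the put loop in the shell session.
rows = ('0'..'9').flat_map { |i| ('0'..'9').map { |j| "row-#{i}#{j}" } }

counts = Hash.new(0)
rows.each { |r| counts[region_start_for(r, SPLITS)] += 1 }
counts.each { |start, n| puts "#{start.inspect}: #{n} rows" }
```

Under that assumption the first five regions hold 10 rows each and the last region (start key 'row-50') holds the remaining 50.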

$ ./bin/stop-hbase.sh

$ hbase org.apache.hadoop.hbase.util.Merge testtable \
 testtable,row-20,1309614509041.e7c16267eb30e147e5d988c63d40f982. \
 testtable,row-30,1309614509041.a9cde1cbc7d1a21b1aca2ac7fda30ad8.
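
The region names passed here have to be pieced together by hand, because the shell wraps each .META. row key across two lines. A small sketch of joining the fragments and checking the result follows. The pattern (table, start key, numeric region id, a 32-character hex encoded name, trailing dot) is inferred from the names in the scan output above, and both helper names are hypothetical.

```ruby
# Assumed layout: <table>,<start key>,<region id>.<32 hex chars>.
REGION_NAME = /\A([^,]+),(.*),(\d+)\.([0-9a-f]{32})\.\z/

# Glue together the row-key fragments the shell wrapped onto separate lines.
def join_wrapped(*fragments)
  fragments.map(&:strip).join
end

# Split a full region name into its parts; nil if it does not match.
def parse_region_name(name)
  m = REGION_NAME.match(name)
  return nil unless m
  { table: m[1], start_key: m[2], region_id: m[3].to_i, encoded_name: m[4] }
end

name = join_wrapped(" testtable,row-20,1309614509041.e7c1",
                    " 6267eb30e147e5d988c63d40f982.")
puts name
p parse_region_name(name)
```

A name that lost characters during copy and paste (a shortened hash, or a missing trailing dot) makes parse_region_name return nil, which is a cheap check to run before stopping the cluster.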

But I consistently get errors:

11/07/02 07:20:49 INFO util.Merge: Merging regions testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0. and testtable,row-30,1309613053987.3664920956c30ac5ff2a7726e4e6 in table testtable
11/07/02 07:20:49 INFO wal.HLog: HLog configuration: blocksize=32 MB, rollsize=30.4 MB, enabled=true, optionallogflushinternal=1000ms
11/07/02 07:20:49 INFO wal.HLog: New hlog /Volumes/Macintosh-HD/Users/larsgeorge/.logs_1309616449171/hlog.1309616449181
11/07/02 07:20:49 INFO wal.HLog: getNumCurrentReplicas--HDFS-826 not available; hdfs_out=org.apache.hadoop.fs.FSDataOutputStream@25961581, exception=org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.getNumCurrentReplicas()
11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor config now ...
11/07/02 07:20:49 INFO regionserver.HRegion: Onlined -ROOT-,,0.70236052; next sequenceid=1
info: null
region1: [B@48fd918a
region2: [B@7f5e2075
11/07/02 07:20:49 FATAL util.Merge: Merge failed
java.io.IOException: Could not find meta region for testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.
        at org.apache.hadoop.hbase.util.Merge.mergeTwoRegions(Merge.java:211)
        at org.apache.hadoop.hbase.util.Merge.run(Merge.java:111)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor config now ...
11/07/02 07:20:49 INFO regionserver.HRegion: Onlined .META.,,1.1028785192; next sequenceid=1
11/07/02 07:20:49 INFO regionserver.HRegion: Closed -ROOT-,,0.70236052
11/07/02 07:20:49 INFO wal.HLog: main.logSyncer exiting
11/07/02 07:20:49 ERROR util.Merge: exiting due to error
java.lang.NullPointerException
        at org.apache.hadoop.hbase.util.Merge$1.processRow(Merge.java:119)
        at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:229)
        at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:258)
        at org.apache.hadoop.hbase.util.Merge.run(Merge.java:116)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)

After that, most of the time I have trashed .META., with this error:

2011-07-02 06:42:10,763 WARN org.apache.hadoop.hbase.master.HMaster: Failed getting all descriptors
java.io.FileNotFoundException: No status for hdfs://localhost:8020/hbase/.corrupt
        at org.apache.hadoop.hbase.util.FSUtils.getTableInfoModtime(FSUtils.java:888)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:122)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:149)
        at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1429)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)

Lars