It comes from the hbase source, but is modified to actually work (the class provided in hbase is private and does not work out of the box). There is a readme at the bottom of the gist with my process. One important note, though: I did this with a deep understanding of how it all works (after hours of reading hbase code and running tests on a test cluster), and even then I felt nervous doing it in prod. That is why I went the snapshot/compact route.
I would definitely test it on a test cluster and get some familiarity before getting close to a production table. That said, I ran this on 8-10 production tables a few months ago, with reductions of 10-20x in some cases.

On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <[EMAIL PROTECTED]> wrote:
I have a question here. Is the merge_region command in 0.98, which can be run through the HBase shell, not reliable, even if we simply want to merge 2 regions at a time? I thought it was the older Merge tool that was not safe.
Thanks,
Shahab

On Thu, Aug 28, 2014 at 2:26 PM, Bryan Beaudreault <[EMAIL PROTECTED]
Hey Ted! How many regions (per region server) do you have on average? If it's not too bad you might just be able to increase hbase.hregion.max.filesize to 10 or 20g and bounce all the region servers. Then as you write more data you will fill up the existing regions.
"Too bad" is fuzzy. If you approach hundreds of regions per region server you likely have a problem, depending on your read/write patterns.
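To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is not from the thread: the table size (2 TiB), server count (10), and both function names are hypothetical, and it assumes every region is filled right up to the limit, so the counts are lower bounds rather than what a real cluster would show. Raising the limit only stops further splitting; it does not merge existing regions, which is why the second function estimates how much new data the existing regions could absorb instead.

```python
import math

GIB = 1024 ** 3  # bytes in one GiB


def min_region_count(table_bytes, max_filesize_bytes):
    """Lower bound on region count, assuming every region is full (hypothetical helper)."""
    return max(1, math.ceil(table_bytes / max_filesize_bytes))


def headroom_bytes(region_count, avg_region_bytes, new_max_filesize_bytes):
    """Approximate bytes the existing regions can absorb before reaching the new limit."""
    per_region = max(0, new_max_filesize_bytes - avg_region_bytes)
    return region_count * per_region


# Assumed example numbers, not taken from the thread.
table = 2048 * GIB
servers = 10

for limit_gib in (1, 10, 20):
    regions = min_region_count(table, limit_gib * GIB)
    print(f"{limit_gib}g limit: at least {regions} regions, "
          f"about {regions / servers:.1f} per server")

# Existing 1 GiB regions left in place, limit raised to 10 GiB.
print(headroom_bytes(2048, 1 * GIB, 10 * GIB) // GIB, "GiB of headroom")
```

With these assumed numbers, the 1g limit lands in the "hundreds of regions per region server" range mentioned above, while 10g or 20g would bring it well below that once regions are merged or have filled up.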
________________________________
From: Ted Tuttle <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Cc: Development <[EMAIL PROTECTED]>
Sent: Thursday, August 28, 2014 11:19 AM
Subject: state-of-the-art method for merging regions on v0.94
We recently realized our region size is 1G and need to increase it to get our region count under control. I've done some research on merging regions and have come away confused.
Bryan, we should pull in your code if that works better.

________________________________
From: Andrew Purtell <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Cc: Development <[EMAIL PROTECTED]>
Sent: Thursday, August 28, 2014 12:12 PM
Subject: Re: state-of-the-art method for merging regions on v0.94
If the 0.94 merge code doesn't work out of the box we should fix that.
On Thu, Aug 28, 2014 at 11:26 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:

Best regards,
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Oh, I just checked and HBASE-1212 seems to have been backported into 0.94 too... Missed that. So offline merges should be fine in that branch too...

2014-08-28 15:12 GMT-04:00 Andrew Purtell <[EMAIL PROTECTED]>:
Lars, so it worked for me, and I'm more than happy for anyone to use/adapt it as necessary for hbase proper. But I'm not sure it's anywhere near production ready, and I don't have the time to work on it more right now. Perhaps someone with more knowledge of region internals could vet it and add relevant tests. We could enter a JIRA, and if I find time in the future I can take a look.
And yes, @JM, my gist was specific to an online migration (cluster is active, but table is disabled). Offline did not meet our requirements at the time, so I never tried it.
On Thu, Aug 28, 2014 at 5:04 PM, lars hofhansl <[EMAIL PROTECTED]> wrote: