I’m running HBase 0.98. I’m trying to snapshot a table, but it’s timing out after 60 seconds. I increased the value of hbase.snapshot.master.timeoutMillis and restarted HBase, but the timeout still happens after 60 seconds. Any suggestions?
There are 174 regions, not well balanced. One RegionServer has 69 regions. That RegionServer generates a series of log entries (modified and shown below), one for each region, at roughly 1 to 2 second intervals. The timeout period expires when it reaches region 36.
2014-07-21 07:49:44,503 regionserver.HRegion: Creating references for hfiles
2014-07-21 07:49:44,503 regionserver.HRegion: Adding snapshot references for [hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2] hfiles
2014-07-21 07:49:44,503 regionserver.HRegion: Creating reference for file (1/1) : hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2
2014-07-21 07:49:45,136 snapshot.FlushSnapshotSubprocedure: ... Flush Snapshotting region hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6. completed.
2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Closing region operation on hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
2014-07-21 07:49:45,137 DEBUG [rs(xxx.digitalenvoy.net,60020,1405943192177)-snapshot-pool3-thread-1] snapshot.FlushSnapshotSubprocedure: Starting region operation on hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
2014-07-21 07:49:45,137 DEBUG [member: 'xxx.digitalenvoy.net,60020,1405943192177' subprocedure-pool1-thread-2] snapshot.RegionServerSnapshotManager: Completed 1/174 local region snapshots.
2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Flush Snapshotting region hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729. started...
2014-07-21 07:49:45,137 regionserver.HRegion: Storing region-info for snapshot.
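For a rough sense of how large the timeout needs to be, the figures in the question (69 regions on the busiest RegionServer, timeout hit at region 36 after 60 seconds) can be extrapolated. The sketch below assumes the regions on that server are snapshotted one after another at a roughly constant rate; the function names and the safety factor of 2 are illustrative choices, not anything defined by HBase.

```python
import math


def estimate_snapshot_seconds(regions_done, elapsed_seconds, total_regions):
    """Linearly extrapolate the time needed to snapshot every region.

    Assumes a constant per-region cost, which is only an approximation.
    """
    return elapsed_seconds * total_regions / regions_done


def suggested_timeout_millis(estimated_seconds, safety_factor=2.0):
    """Pad the estimate (safety_factor is an arbitrary assumption) and
    convert it to milliseconds, the unit the timeout properties use."""
    return int(math.ceil(estimated_seconds * safety_factor)) * 1000


# Numbers taken from the question: region 36 of 69 reached at the 60 s timeout.
estimate = estimate_snapshot_seconds(36, 60.0, 69)
print(f"estimated time for 69 regions: {estimate:.0f} s")
print(f"suggested timeout: {suggested_timeout_millis(estimate)} ms")
```

Under these assumptions the busiest RegionServer needs roughly two minutes, so the default 60 second timeout cannot succeed until the regions are rebalanced or the timeout is raised.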
On Jul 21, 2014, at 9:21 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
The snapshot timeout properties are confusingly named and I dug through the code to understand them some time ago. Use these:
<property>
  <name>hbase.snapshot.master.timeoutMillis</name>
  <!-- Change from default of 60s to 600s to allow for slow flushing of tables -->
  <value>600000</value>
  <description>
    This is the time the HBase master waits for the snapshot operation to complete. Do not confuse this with hbase.snapshot.master.timeout.millis, which, although it sounds similar, serves a very different purpose. Note: This property has a completely different meaning before HBase version 0.94.11 and should not be enabled on a cluster using snapshots and running a version before 0.94.11.
  </description>
</property>
<property>
  <name>hbase.snapshot.master.timeout.millis</name>
  <!-- Change from default of 60s to 600s to allow for slow flushing of tables -->
  <value>600000</value>
  <description>
    This is the timeout the master tells the client to wait when it takes the snapshot. The client actually waits longer than this due to exponential backoff. See HBaseAdmin.snapshot for the exact algorithm.
  </description>
</property>
<property>
  <name>hbase.snapshot.region.timeout</name>
  <!-- Change from default of 60s to 600s to allow for slow flushing of tables -->
  <value>600000</value>
  <description>
    This is the time the regionserver waits to complete all of its activities for a snapshot operation.
  </description>
</property>

On Mon, Jul 21, 2014 at 7:02 AM, Matteo Bertozzi <[EMAIL PROTECTED]> wrote:

*Ishan Chhabra* | Rocket Scientist | RocketFuel Inc.
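Since the original question changed only one of the three properties listed above, a quick check of the configuration file can show which ones are still missing or too low. This is a minimal sketch using only the Python standard library; the helper names and the inline `SAMPLE` document are hypothetical, and it only inspects the XML text, so it says nothing about whether the running master and RegionServers have actually loaded the values.

```python
import xml.etree.ElementTree as ET

# The three property names listed in the reply above.
SNAPSHOT_TIMEOUT_KEYS = (
    "hbase.snapshot.master.timeoutMillis",
    "hbase.snapshot.master.timeout.millis",
    "hbase.snapshot.region.timeout",
)


def read_properties(xml_text):
    """Return {name: value} for every <property> in an hbase-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def missing_or_low(props, minimum_millis):
    """List (key, value) pairs that are unset (value None) or below the minimum."""
    problems = []
    for key in SNAPSHOT_TIMEOUT_KEYS:
        if key not in props:
            problems.append((key, None))
        elif int(props[key]) < minimum_millis:
            problems.append((key, int(props[key])))
    return problems


# Hypothetical config resembling the question: only one property was raised.
SAMPLE = """<configuration>
  <property>
    <name>hbase.snapshot.master.timeoutMillis</name>
    <value>600000</value>
  </property>
</configuration>"""

problems = missing_or_low(read_properties(SAMPLE), 600000)
for key, value in problems:
    print(key, "is", "not set" if value is None else f"only {value} ms")
```

To check a real file, read its contents and pass the text to `read_properties` in place of `SAMPLE`.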
You can leave your config value there. Remember to record the change somewhere for future reference, since you may change other cost parameters later.
The side effects of this change partially depend on how you want your cluster balanced. I suggest you go over the CostFunction implementations in StochasticLoadBalancer so that you know which factors (and their weights) the load balancer considers.
Cheers

On Tue, Jul 22, 2014 at 8:43 AM, Brian Jeltema <[EMAIL PROTECTED]> wrote: