HBase Issues (perhaps related to 127.0.0.1)
Ratner, Alan S 2012-11-21, 20:01
I'd appreciate any suggestions as to how to get HBase up and running. Right now it dies after a few seconds on all servers. I am using Hadoop 1.0.4, ZooKeeper 3.4.4 and HBase 0.94.2 on Ubuntu.
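Since the subject line suspects the 127.0.0.1 entry, one quick check is to list which names a hosts file maps to a loopback address. The following is only a minimal sketch in Python, not part of the setup described in this post: `loopback_names` is a hypothetical helper, and the sample hosts content (including the 192.168.1.101 address) is made up for illustration.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return the hostnames that hosts-file text maps to a loopback address."""
    names = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if addr.is_loopback:
            names.extend(fields[1:])
    return names


# Hypothetical hosts content in which hadoop1 also sits on the loopback line.
sample = """\
127.0.0.1      localhost hadoop1
192.168.1.101  hadoop1   # made-up address
::1            ip6-localhost
"""

print(loopback_names(sample))
```

To run it against a real machine, pass in the file contents, for example `loopback_names(open("/etc/hosts").read())`. A server name such as hadoop1 showing up in the result would be consistent with that name resolving to 127.0.0.1, though it does not by itself prove that this is what breaks HBase.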
History: Yesterday I managed to get HBase 0.94.2 working, but only after removing the 127.0.0.1 line from my /etc/hosts file (and synchronizing my clocks). All was fine until this morning, when I realized I could not initiate remote log-ins to my servers (using VNC or NX) until I restored the 127.0.0.1 line in /etc/hosts. With that restored I am back to a non-working HBase.

With HBase managing ZK I see the following in the HBase Master and ZK logs, respectively:

2012-11-21 13:40:22,236 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase

2012-11-21 13:40:22,122 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running

At roughly the same time (clocks not perfectly synchronized) I see this in a RegionServer log:

2012-11-21 13:40:57,727 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
...
2012-11-21 13:40:57,848 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master

Logs and configuration follow.

Then I tried managing ZK myself, and HBase then fails for seemingly different reasons:

2012-11-21 14:46:37,320 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/backup-masters/hadoop1,60000,1353527196915 already deleted, and this is not a retry
2012-11-21 14:46:47,483 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call to hadoop1/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused

Both HMaster error logs (with HBase managing ZK and with me managing ZK) mention the 127.0.0.1 IP address instead of referring to the server by its name (hadoop1), by its true IP address, or simply as localhost.

So, start-hbase.sh works OK (HBase managing ZK):

ngc@hadoop1:~/hbase-0.94.2$ bin/start-hbase.sh
hadoop1: starting zookeeper, logging to /tmp/hbase-ngc/logs/hbase-ngc-zookeeper-hadoop1.out
hadoop2: starting zookeeper, logging to /tmp/hbase-ngc/logs/hbase-ngc-zookeeper-hadoop2.out
hadoop3: starting zookeeper, logging to /tmp/hbase-ngc/logs/hbase-ngc-zookeeper-hadoop3.out
starting master, logging to /tmp/hbase-ngc/logs/hbase-ngc-master-hadoop1.out
hadoop2: starting regionserver, logging to /tmp/hbase-ngc/logs/hbase-ngc-regionserver-hadoop2.out
hadoop6: starting regionserver, logging to /tmp/hbase-ngc/logs/hbase-ngc-regionserver-hadoop6.out
hadoop3: starting regionserver, logging to /tmp/hbase-ngc/logs/hbase-ngc-regionserver-hadoop3.out
hadoop5: starting regionserver, logging to /tmp/hbase-ngc/logs/hbase-ngc-regionserver-hadoop5.out
hadoop4: starting regionserver, logging to /tmp/hbase-ngc/logs/hbase-ngc-regionserver-hadoop4.out

I have in hbase-site.xml:

<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master</name>
  <value>hadoop1:60000</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://hadoop1:9000/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/tmp/zookeeper_data</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hadoop1,hadoop2,hadoop3</value>
</property>

I have in hbase-env.sh:

export JAVA_HOME=/home/ngc/jdk1.6.0_25/
export HBASE_CLASSPATH=/home/zookeeper-3.4.4/conf:/home/zookeeper-3.4.4
export HBASE_HEAPSIZE=2000
export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
export HBASE_LOG_DIR=/tmp/hbase-ngc/logs
export HBASE_MANAGES_ZK=true

Wed Nov 21 13:40:20 EST 2012 Starting master on hadoop1
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 386178
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 386178
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
2012-11-21 13:40:21,410 INFO org.apache.hadoop.hbase.util.VersionInfo: HBase 0.94.2
2012-11-21 13:40:21,410 INFO org.apache.hadoop.hbase.util.VersionInfo: Subversion https://svn.apache.org/repos/asf/hbase/branches/0.94 -r 1395367
2012-11-21 13:40:21,410 INFO org.apache.hadoop.hbase.util.VersionInfo: Compiled by jenkins on Sun Oct 7 19:11:01 UTC 2012
2012-11-21 13:40:21,558 DEBUG org.apache.hadoop.hbase.master.HMaster: Set serverside HConnection retries=100
2012-11-21 13:40:21,823 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-21 13:40:21,826 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-21 13:40:21,829 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-21 13:40:21,833 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-21 13:40:21,836 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-21 13:40:21,839 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-21 13:40:21,842 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
2012-11-