HBase, mail # user - number of region servers is wrong


Re: number of region servers is wrong
Samir Ahmic 2012-02-23, 08:51
Hi Lu,

I remember having a similar issue with the wrong number of region servers
reported to the master. In my case the cause was reverse name resolution, so
I think you should check your DNS settings and /etc/hosts.
Try ping -c 2 $HOSTNAME on the region server that is reported twice
(10.27.17.251), then correct the file $HBASE_HOME/conf/regionservers so that
it uses the HOSTNAME reported by the ping -c 2 $HOSTNAME command.
You should also check:
http://hbase.apache.org/book/os.html
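
If ping alone is not conclusive, the forward and reverse lookups can be
compared directly. Below is a minimal sketch in Python, not anything shipped
with HBase: the function name check_name_resolution and the fields of the
returned dict are made up for illustration. It resolves a hostname to an IP,
resolves that IP back to a name, and reports whether the two agree. It also
flags loopback addresses, on the assumption that a 127.x.x.x mapping in
/etc/hosts is one of the usual suspects.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP -> name and report whether the round trip agrees.

    `forward` and `reverse` default to the system resolver; they are
    parameters only so the check can be exercised without a network.
    """
    result = {"hostname": hostname, "ip": None, "reverse_name": None,
              "loopback": False, "consistent": False, "error": None}
    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = f"forward lookup failed: {exc}"
        return result
    result["ip"] = ip
    # Assumption: a hostname that maps to 127.x.x.x is worth flagging.
    result["loopback"] = ip.startswith("127.")
    try:
        reverse_name, aliases, _addresses = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        result["error"] = f"reverse lookup failed: {exc}"
        return result
    result["reverse_name"] = reverse_name
    # The round trip is consistent if the original name is the primary
    # reverse name or one of its aliases (case-insensitive).
    known_names = {reverse_name.lower()} | {a.lower() for a in aliases}
    result["consistent"] = hostname.lower() in known_names
    return result
```

Running check_name_resolution(socket.gethostname()) on each region server
should give "consistent": True and "loopback": False. If it does not, the name
in conf/regionservers and the name the server resolves for itself probably
differ.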

On Thu, Feb 23, 2012 at 8:49 AM, Lu, Wei <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I ran into a weird problem when using HBase. There are 3 machines: 1
> master and 2 region servers (wlu-rs1/10.27.17.251 and wlu-rs2/10.27.16.11).
> But when I use "status 'detailed'" to see the region servers' status, it
> shows three servers, and one server appears twice (exactly the same).
> 3 live servers
> 10.27.17.251:60020 1329975187706
> 10.27.16.11:60020 1329975209046
> 10.27.17.251:60020 1329975187706
>
> When balancing begins, region server 10.27.17.251 seems to move data from
> and to itself, and a FATAL error occurs.
>
> Log info of HMaster:
>
> 2012-02-23 00:01:00,629 INFO org.apache.hadoop.hbase.master.HMaster:
> balance
> hri=usertable,user172022781,1329972455493.943849e136aa6f7a343d47fed57da429.,
> src=wlu-rs1,60020,1329968056162, dest=10.27.17.251,60020,1329968056162
> 2012-02-23 00:01:00,629 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of
> region
> usertable,user172022781,1329972455493.943849e136aa6f7a343d47fed57da429.
> (offlining)
> 2012-02-23 00:01:09,712 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned
> node: /hbase/unassigned/ad483f3806a03756f3f47cd8bd220d09
> (region=usertable,user819517397,1329972500402.ad483f3806a03756f3f47cd8bd220d09.,
> server=wlu-rs1,60020,1329968056162, state=RS_ZK_REGION_CLOSING)
> 2012-02-23 00:01:09,712 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> transition=RS_ZK_REGION_CLOSING, server=wlu-rs1,60020,1329968056162,
> region=ad483f3806a03756f3f47cd8bd220d09
> 2012-02-23 00:01:12,678 FATAL org.apache.hadoop.hbase.master.HMaster:
> Remote unexpected exception
> java.io.IOException: Call to /10.27.17.251:60020 failed on local
> exception: java.io.EOFException
>                at
> org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:806)
>                at
> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
>                at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>                at $Proxy6.closeRegion(Unknown Source)
>                at
> org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:601)
>                at
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1123)
>                at
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1070)
>                at
> org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1930)
>                at
> org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:694)
>                at
> org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:585)
>                at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> Caused by: java.io.EOFException
>                at java.io.DataInputStream.readInt(Unknown Source)
>                at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:539)
>                at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:477)
> 2012-02-23 00:01:12,680 INFO org.apache.hadoop.hbase.master.HMaster:
> Aborting
> 2012-02-23 00:01:12,680 INFO org.apache.hadoop.hbase.master.HMaster:
> balance
>
>
> I use HBase 0.90.3 and Hadoop 0.20.2. Can anyone please help me figure this
> out?
>
>
>
> Regards,
> Wei
>
>
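
One detail in the quoted HMaster log points the same way: the balance line has
src=wlu-rs1,60020,1329968056162 and dest=10.27.17.251,60020,1329968056162,
that is, the same port and start code under two different host strings. As a
rough sketch (find_aliased_servers is a hypothetical helper, not an HBase
API), server names of the form host,port,startcode can be grouped by port and
start code to spot one process registered under two names. Two unrelated
servers could in principle share both values, so treat a match as a hint
rather than proof.

```python
from collections import defaultdict


def find_aliased_servers(server_names):
    """Group 'host,port,startcode' strings that share port and start code.

    Returns a dict mapping (port, startcode) to the sorted list of host
    strings, keeping only groups with more than one distinct host.
    """
    groups = defaultdict(set)
    for name in server_names:
        # Split from the right so the last two fields are port and startcode.
        host, port, startcode = name.rsplit(",", 2)
        groups[(port, startcode)].add(host)
    return {key: sorted(hosts)
            for key, hosts in groups.items()
            if len(hosts) > 1}


# The first two names come from the quoted log; the third is a made-up
# entry for the second region server, included only for contrast.
names = [
    "wlu-rs1,60020,1329968056162",
    "10.27.17.251,60020,1329968056162",
    "wlu-rs2,60020,1329968060000",
]
aliased = find_aliased_servers(names)
```

Here aliased contains a single group pairing wlu-rs1 with 10.27.17.251, which
fits the master treating one region server as two.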