HBase, mail # user - Regions not getting reassigned if RS is brought down


Re: Regions not getting reassigned if RS is brought down
Shrijeet Paliwal 2011-07-15, 20:02
So the problem is that if you use any interface other than
'default' (literally that keyword), DNS.java's getDefaultHost returns a
string with a trailing period. To me it seems the javadoc of reverseDns
in DNS.java (see below) conflicts with what that function actually
does: it returns a PTR record while claiming to return a hostname. The PTR
record always has a period at the end, see:
http://irbs.net/bog-4.9.5/bog47.html

  /**
   * Returns the hostname associated with the specified IP address by the
   * provided nameserver.
   *
   * @param hostIp
   *            The address to reverse lookup
   * @param ns
   *            The host name of a reachable DNS server
   * @return The host name associated with the provided IP
   * @throws NamingException
   *             If a NamingException is encountered
   */
  public static String reverseDns(InetAddress hostIp, String ns)
    throws NamingException {
    //
    // Builds the reverse IP lookup form
    // This is formed by reversing the IP numbers and appending in-addr.arpa
    //
    String[] parts = hostIp.getHostAddress().split("\\.");
    String reverseIP = parts[3] + "." + parts[2] + "." + parts[1] + "."
      + parts[0] + ".in-addr.arpa";

    System.out.println("reverse ip is :" + reverseIP);

    DirContext ictx = new InitialDirContext();
    Attributes attribute =
      ictx.getAttributes("dns://"               // Use "dns:///" if the default
                         + ((ns == null) ? "" : ns) +
                         // nameserver is to be used
                         "/" + reverseIP, new String[] { "PTR" });
    ictx.close();

    return attribute.get("PTR").get().toString();
  }
Related issue (I haven't gone through it completely, but at a glance it
looks related):
https://issues.apache.org/jira/browse/HBASE-2599 . Thanks Karthick for
pointing this out.

A quick fix is to recognize that the default host has a trailing period and
drop it where we call it here:
 String machineName = DNS.getDefaultHost(conf.get(
        "hbase.regionserver.dns.interface", "default"), conf.get(
        "hbase.regionserver.dns.nameserver", "default"));
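
A minimal sketch of that quick fix, assuming a hypothetical helper named
stripTrailingDot (it is not an existing HBase or Hadoop method) that would be
applied to the value returned by DNS.getDefaultHost before it is used as
machineName:

```java
// Hypothetical helper class; the name HostNames is an assumption for
// illustration and does not exist in HBase or Hadoop.
final class HostNames {
    private HostNames() {}

    /**
     * Drops a single trailing '.' from a host name, if present.
     * A name taken from a PTR record is fully qualified and ends with
     * a period; other code paths use the name without it.
     */
    static String stripTrailingDot(String host) {
        if (host != null && host.endsWith(".")) {
            return host.substring(0, host.length() - 1);
        }
        return host;
    }

    public static void main(String[] args) {
        // Name as returned for a non-default interface (trailing period).
        System.out.println(stripTrailingDot("server-2.domain.net."));
        // Name that is already in the expected form is left untouched.
        System.out.println(stripTrailingDot("server-2.domain.net"));
    }
}
```

With something like this in place, the name the regionserver registers in
ZooKeeper and the name the master caches would be built from the same string.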

I will open an issue shortly.  Thoughts?

-Shrijeet
On Fri, Jul 15, 2011 at 10:25 AM, Stack <[EMAIL PROTECTED]> wrote:

> Thanks for digging in Shrijeet.  We don't do this name matching well
> in 0.90.x.  Sorry for pain caused.  On your observation below about
> RegionServerTracker, if you figure an improvement, that'd be great.
>
> Thanks,
> St.Ack
>
> On Thu, Jul 14, 2011 at 9:07 PM, Shrijeet Paliwal
> <[EMAIL PROTECTED]> wrote:
> > I have narrowed it down to the following:
> >
> >  // Server to handle client requests
> >    String machineName = DNS.getDefaultHost(conf.get(
> >        "hbase.regionserver.dns.interface", "default"), conf.get(
> >        "hbase.regionserver.dns.nameserver", "default"));
> >
> > I am not using the default interface for the RS. I have changed it to
> > 'eth1'. The machineName is getting set as 'server-2.rfiserve.net.'
> > Notice the extra period at the end.
> >
> > Because of the above, there is an inconsistency between the way ZooKeeper
> > recorded the regionserver address and the way ServerManager had it in its
> > cached list of online servers.
> > You will notice the extra dot in the ZooKeeper entry but not in the
> > ServerManager list.
> >
> > [zk: localhost:2181(CONNECTED) 3] ls /hbase/rs
> > [server-2.domain.net.,60020,1310684522383,server-1.domain.net.,60020,1310680203359]
> >
> >
> > In ServerManager we do the following:
> >
> > void recordNewServer(HServerInfo info, boolean useInfoLoad,
> >      HRegionInterface hri) {
> >    HServerLoad load = useInfoLoad? info.getLoad(): new HServerLoad();
> >    String serverName = info.getServerName();
> >    LOG.info("Registering server=" + serverName + ", regionCount=" +
> >      load.getLoad() + ", userLoad=" + useInfoLoad);
> >    info.setLoad(load);
> >    // TODO: Why did we update the RS location ourself?  Shouldn't RS do