

Jeff Whiting 2013-03-29, 18:41
Ted Yu 2013-03-29, 19:00
Re: HBaseClient isn't reusing connections but creating a new one each time
Nice one..  Good find.
On Sat, Mar 30, 2013 at 12:30 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Can you tell us which version of HBase you are using?
>
> Gary did some cleanup in:
>
> r1439723 | garyh | 2013-01-28 16:50:02 -0800 (Mon, 28 Jan 2013) | 1 line
>
> HBASE-7626 Backport client connection cleanup from HBASE-7460
>
> This is the current code in getConnection() in 0.94 branch:
>     ConnectionId remoteId = new ConnectionId(addr, protocol, ticket,
>         rpcTimeout);
>     synchronized (connections) {
>       connection = connections.get(remoteId);
>       if (connection == null) {
>         connection = createConnection(remoteId);
>         connections.put(remoteId, connection);
>       }
>     }
>     connection.addCall(call);
>
>
> On Fri, Mar 29, 2013 at 11:41 AM, Jeff Whiting <[EMAIL PROTECTED]>
> wrote:
>
> > After noticing a lot of threads, I turned on debug logging for the HBase
> > client and saw this message many times, with the count climbing constantly:
> > HBaseClient:531 - IPC Client (687163870) connection to
> > /10.1.37.21:60020 from jeff: starting, having connections 1364
> >
> > At that point in my code it was up to 1364 different connections (and
> > threads).  Those connections eventually drop off once the idle time is
> > reached: conf.getInt("hbase.ipc.client.connection.maxidletime", 10000).
> > But during periods of activity the number of threads can get very high.
> >
> > Additionally I was able to confirm the large number of threads by doing:
> >
> > jstack <pid> | grep IPC
> >
> >
> > So I started digging around in the code...
> >
> > In HBaseClient.getConnection it attempts to reuse previous connections:
> >
> >     ConnectionId remoteId = new ConnectionId(addr, protocol, ticket,
> >         rpcTimeout);
> >     do {
> >       synchronized (connections) {
> >         connection = connections.get(remoteId);
> >         if (connection == null) {
> >           LOG.error("poolsize: "+getPoolSize(conf));
> >           connection = new Connection(remoteId);
> >           connections.put(remoteId, connection);
> >         }
> >       }
> >     } while (!connection.addCall(call));
> >
> >
> > It does this by using the connection id as the key to the pool. All of
> > this seems good, except ConnectionId never hashes to the same value, so it
> > cannot reuse any connection.
> >
> > From my understanding of the code, here is why.
> >
> > In HBaseClient.ConnectionId:
> >
> >     @Override
> >     public boolean equals(Object obj) {
> >      if (obj instanceof ConnectionId) {
> >        ConnectionId id = (ConnectionId) obj;
> >        return address.equals(id.address) && protocol == id.protocol &&
> >               ((ticket != null && ticket.equals(id.ticket)) ||
> >                (ticket == id.ticket)) && rpcTimeout == id.rpcTimeout;
> >      }
> >      return false;
> >     }
> >
> >     @Override  // simply use the default Object#hashcode() ?
> >     public int hashCode() {
> >       return (address.hashCode() + PRIME * (
> >                   PRIME * System.identityHashCode(protocol) ^
> >              (ticket == null ? 0 : ticket.hashCode()) )) ^ rpcTimeout;
> >     }
> >
> > It uses the protocol and the ticket in both functions.  However, going
> > back through all of the layers, I think I found the problem.
> >
> > Problem:
> >
> > HBaseRPC.java:
> >
> >   public static VersionedProtocol getProxy(
> >       Class<? extends VersionedProtocol> protocol,
> >       long clientVersion, InetSocketAddress addr, Configuration conf,
> >       SocketFactory factory, int rpcTimeout) throws IOException {
> >     return getProxy(protocol, clientVersion, addr,
> >         User.getCurrent(), conf, factory, rpcTimeout);
> >   }
> >
> > User.getCurrent() always returns a new User object.  That User instance is
> > eventually passed down to ConnectionId.  However, the User object doesn't
> > implement hashCode() or equals(), so one ConnectionId won't ever match
> > another ConnectionId.
> >
> >
> > There are several possible solutions.
> > 1. implement hashCode and equals for the User.
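
The diagnosis quoted above can be reproduced in isolation. The following is a minimal sketch, not HBase code: `IdentityUser`, `ValueUser`, `Key`, and `poolSizeAfter` are hypothetical names invented for illustration, and `Key` is a heavily simplified stand-in for ConnectionId (address and ticket only). It assumes, as the message reports, that each call hands the pool a fresh ticket object, and shows how a ticket type without equals()/hashCode() defeats reuse while a value-based one restores it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConnectionKeyDemo {

    /** Hypothetical stand-in for a user type that does NOT override equals()/hashCode(). */
    static final class IdentityUser {
        final String name;
        IdentityUser(String name) { this.name = name; }
    }

    /** Same data, but equality and hash are derived from the user name. */
    static final class ValueUser {
        final String name;
        ValueUser(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof ValueUser && name.equals(((ValueUser) o).name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    /** Simplified pool key (assumption: only address and ticket matter here). */
    static final class Key {
        final String address;
        final Object ticket;
        Key(String address, Object ticket) {
            this.address = address;
            this.ticket = ticket;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return address.equals(k.address) && Objects.equals(ticket, k.ticket);
        }
        @Override public int hashCode() { return Objects.hash(address, ticket); }
    }

    /** Performs one get-or-create lookup per ticket and returns the final pool size. */
    static int poolSizeAfter(Object... tickets) {
        Map<Key, String> pool = new HashMap<>();
        for (Object ticket : tickets) {
            pool.computeIfAbsent(new Key("10.1.37.21:60020", ticket), k -> "connection");
        }
        return pool.size();
    }

    public static void main(String[] args) {
        int identity = poolSizeAfter(
            new IdentityUser("jeff"), new IdentityUser("jeff"), new IdentityUser("jeff"));
        int value = poolSizeAfter(
            new ValueUser("jeff"), new ValueUser("jeff"), new ValueUser("jeff"));
        System.out.println("identity-based tickets: " + identity + " pool entries");
        System.out.println("value-based tickets: " + value + " pool entries");
    }
}
```

With identity-based tickets every lookup misses and the pool grows by one entry per call; with value-based tickets the three lookups share a single entry, which is the effect the first proposed solution is aiming for.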
Jeff Whiting 2013-03-29, 19:44
Ted Yu 2013-03-29, 19:55
Jeff Whiting 2013-03-29, 20:05
Gary Helmling 2013-03-29, 18:59