HBase >> mail # user >> HBaseClient isn't reusing connections but creating a new one each time


Jeff Whiting 2013-03-29, 18:41
Ted Yu 2013-03-29, 19:00
ramkrishna vasudevan 2013-03-29, 19:40
Jeff Whiting 2013-03-29, 19:44
Re: HBaseClient isn't reusing connections but creating a new one each time
Jeff:
Thanks for reporting the bug.

A patch is available in HBASE-8222. It should go into 0.94.7.

If you cannot wait, feel free to apply it yourself.
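
For anyone following the thread, here is a minimal sketch of the failure mode described in the quoted report below: a map-based connection cache never gets a hit if any component of its key compares by object identity and is rebuilt on every call. This is only an illustration under stated assumptions. The class names (ConnectionKeyDemo, IdentityTicket, ValueTicket, Key) are hypothetical and are not HBase API, and the sketch does not claim to show the actual cause fixed in the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConnectionKeyDemo {

    // Hypothetical stand-in for a per-user credential object. It does NOT
    // override equals()/hashCode(), so two instances for the same user name
    // are equal only if they are the same object.
    static class IdentityTicket {
        final String user;
        IdentityTicket(String user) { this.user = user; }
    }

    // Same data, but with value-based equals()/hashCode().
    static final class ValueTicket {
        final String user;
        ValueTicket(String user) { this.user = user; }
        @Override public boolean equals(Object o) {
            return o instanceof ValueTicket && user.equals(((ValueTicket) o).user);
        }
        @Override public int hashCode() { return user.hashCode(); }
    }

    // Simplified cache key (assumption: only address and ticket matter here).
    static final class Key {
        final String address;
        final Object ticket;
        Key(String address, Object ticket) { this.address = address; this.ticket = ticket; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return address.equals(k.address) && Objects.equals(ticket, k.ticket);
        }
        @Override public int hashCode() { return Objects.hash(address, ticket); }
    }

    // Simulates `calls` lookups, building a fresh ticket for each call, and
    // returns how many cache entries ("connections") ended up being created.
    static int connectionsCreated(boolean valueEquality, int calls) {
        Map<Key, Object> connections = new HashMap<>();
        for (int i = 0; i < calls; i++) {
            Object ticket = valueEquality ? new ValueTicket("jeff") : new IdentityTicket("jeff");
            Key key = new Key("10.1.37.21:60020", ticket);
            if (connections.get(key) == null) {
                connections.put(key, new Object()); // stands in for a new connection
            }
        }
        return connections.size();
    }

    public static void main(String[] args) {
        System.out.println("identity-based ticket: " + connectionsCreated(false, 1000));
        System.out.println("value-based ticket:    " + connectionsCreated(true, 1000));
    }
}
```

With the identity-based ticket every lookup misses and the cache grows by one entry per call; with the value-based ticket all calls share a single entry.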
On Fri, Mar 29, 2013 at 12:44 PM, Jeff Whiting <[EMAIL PROTECTED]> wrote:

> I am using cdh4.1.3 which roughly maps to 0.92.1 with patches.
>
> ~Jeff
>
>
> On Fri, Mar 29, 2013 at 1:40 PM, ramkrishna vasudevan <[EMAIL PROTECTED]> wrote:
>
> > Nice one..  Good find.
> >
> >
> > On Sat, Mar 30, 2013 at 12:30 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> > > Can you tell us the version of HBase you are using?
> > >
> > > Gary did some cleanup in:
> > >
> > > r1439723 | garyh | 2013-01-28 16:50:02 -0800 (Mon, 28 Jan 2013) | 1 line
> > >
> > > HBASE-7626 Backport client connection cleanup from HBASE-7460
> > >
> > > This is the current code in getConnection() in 0.94 branch:
> > >     ConnectionId remoteId = new ConnectionId(addr, protocol, ticket, rpcTimeout);
> > >     synchronized (connections) {
> > >       connection = connections.get(remoteId);
> > >       if (connection == null) {
> > >         connection = createConnection(remoteId);
> > >         connections.put(remoteId, connection);
> > >       }
> > >     }
> > >     connection.addCall(call);
> > >
> > >
> > > On Fri, Mar 29, 2013 at 11:41 AM, Jeff Whiting <[EMAIL PROTECTED]> wrote:
> > >
> > > > After noticing a lot of threads, I turned on debug logging for the
> > > > HBase client and saw this message many times, with the count
> > > > constantly increasing:
> > > > HBaseClient:531 - IPC Client (687163870) connection to
> > > > /10.1.37.21:60020 from jeff: starting, having connections 1364
> > > >
> > > > At that point in my code it was up to 1364 different connections (and
> > > > threads).  Those connections eventually drop off once the idle time,
> > > > conf.getInt("hbase.ipc.client.connection.maxidletime", 10000), is
> > > > reached.  But during periods of activity the number of threads can get
> > > > very high.
> > > >
> > > > Additionally I was able to confirm the large number of threads by doing:
> > > >
> > > > jstack <pid> | grep IPC
> > > >
> > > >
> > > > So I started digging around in the code...
> > > >
> > > > In HBaseClient.getConnection it attempts to reuse previous connections:
> > > >
> > > >     ConnectionId remoteId = new ConnectionId(addr, protocol, ticket, rpcTimeout);
> > > >     do {
> > > >       synchronized (connections) {
> > > >         connection = connections.get(remoteId);
> > > >         if (connection == null) {
> > > >           LOG.error("poolsize: "+getPoolSize(conf));
> > > >           connection = new Connection(remoteId);
> > > >           connections.put(remoteId, connection);
> > > >         }
> > > >       }
> > > >     } while (!connection.addCall(call));
> > > >
> > > >
> > > > It does this by using the connection id as the key to the pool. All of
> > > > this seems good, except that ConnectionId never hashes to the same
> > > > value, so it cannot reuse any connection.
> > > >
> > > > From my understanding of the code here is why.
> > > >
> > > > In HBaseClient.ConnectionId
> > > >
> > > >     @Override
> > > >     public boolean equals(Object obj) {
> > > >      if (obj instanceof ConnectionId) {
> > > >        ConnectionId id = (ConnectionId) obj;
> > > >        return address.equals(id.address) && protocol == id.protocol &&
> > > >               ((ticket != null && ticket.equals(id.ticket)) ||
> > > >                (ticket == id.ticket)) && rpcTimeout == id.rpcTimeout;
> > > >      }
> > > >      return false;
> > > >     }
> > > >
> > > >     @Override  // simply use the default Object#hashcode() ?
> > > >     public int hashCode() {
> > > >       return (address.hashCode() + PRIME * (
> > > >                   PRIME * System.identityHashCode(protocol) ^
> > > >              (ticket == null ? 0 : ticket.hashCode()) )) ^ rpcTimeout;
> > > >     }
> > > >
> > > > It uses the protocol and the ticket in both functions.  However
Jeff Whiting 2013-03-29, 20:05
Gary Helmling 2013-03-29, 18:59