Re: Mapred job failing with LeaseException
Suraj Varma 2012-07-11, 23:22
You get LeaseExceptions when the time between two scanner.next()
calls exceeds your hbase.regionserver.lease.period setting, which
defaults to 60s. Whether it is your "client" or your "map task",
anything that opens a Scan against HBase must keep invoking
scanner.next() within this lease period; otherwise the region server
considers the client dead and expires the lease. When this "dead"
client comes back and calls scanner.next(), it gets a LeaseException.
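
To make the mechanism concrete, here is a minimal sketch of a lease table in plain Java. It is not HBase's actual Leases class; LeaseSketch, LeaseTable and LeaseExpiredException are hypothetical names, and the clock is passed in by hand so the example stays deterministic. It only illustrates the rule described above: a call that arrives after the lease period has elapsed finds its lease gone.

```java
import java.util.HashMap;
import java.util.Map;

public class LeaseSketch {
    /** Hypothetical stand-in for the exception a late caller receives. */
    static class LeaseExpiredException extends RuntimeException {
        LeaseExpiredException(String msg) { super(msg); }
    }

    /** Toy lease table: lease id -> time (ms) of the last renewal. */
    static class LeaseTable {
        private final long periodMs;
        private final Map<String, Long> lastRenewed = new HashMap<>();

        LeaseTable(long periodMs) { this.periodMs = periodMs; }

        void open(String id, long nowMs) { lastRenewed.put(id, nowMs); }

        /** Drops every lease that has not been renewed within the period. */
        void expireStale(long nowMs) {
            lastRenewed.values().removeIf(t -> nowMs - t > periodMs);
        }

        /** Stands in for the server side of scanner.next(): renew or fail. */
        void next(String id, long nowMs) {
            expireStale(nowMs);
            if (!lastRenewed.containsKey(id)) {
                throw new LeaseExpiredException("lease '" + id + "' does not exist");
            }
            lastRenewed.put(id, nowMs);
        }
    }

    public static void main(String[] args) {
        LeaseTable leases = new LeaseTable(60_000);   // 60s lease period
        leases.open("scanner-1", 0);
        leases.next("scanner-1", 30_000);             // 30s gap: renewed
        leases.next("scanner-1", 85_000);             // 55s gap: renewed
        try {
            leases.next("scanner-1", 150_000);        // 65s gap: lease is gone
        } catch (LeaseExpiredException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The first two calls succeed because each arrives within 60s of the previous one; the third arrives 65s later and fails.
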
There are several threads on this, so search for "hbase scanner
leaseexception" and such. See:
Is the processing you do between two scanner.next() calls taking
more than 60s at some point?
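
As a rough way to reason about that question: with Scan.setCaching(n) the client fetches n rows per RPC and serves the following next() calls from its local buffer, so the lease is only renewed about once every n rows. The sketch below just does that arithmetic. The class and method names and the 100 ms per-row cost are hypothetical, and it ignores time spent on the server side (for example, filters skipping long runs of rows).

```java
public class ScanGapEstimate {
    /** Rough time between two next() RPCs: rows per RPC times per-row processing time. */
    static long gapBetweenRpcsMs(int caching, long perRowMs) {
        return (long) caching * perRowMs;
    }

    /** True if the estimated gap is longer than the lease period. */
    static boolean exceedsLease(int caching, long perRowMs, long leasePeriodMs) {
        return gapBetweenRpcsMs(caching, perRowMs) > leasePeriodMs;
    }

    public static void main(String[] args) {
        long leasePeriodMs = 60_000;  // the 60s default mentioned above
        long perRowMs = 100;          // hypothetical per-row processing cost
        int[] cachingValues = {1, 100, 1000};
        for (int caching : cachingValues) {
            System.out.println("caching=" + caching
                    + " gap=" + gapBetweenRpcsMs(caching, perRowMs) + "ms"
                    + " exceedsLease=" + exceedsLease(caching, perRowMs, leasePeriodMs));
        }
    }
}
```

Under these assumed numbers, only the caching=1000 case (a 100s gap) goes past the 60s lease, which is why a larger caching value can make lease expiry more likely when per-row processing is slow.
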
On Wed, Jul 11, 2012 at 1:23 AM, 최우용 <[EMAIL PROTECTED]> wrote:
> I'm running a cluster of a few hundred servers with Cloudera's CDH3u4
> and having trouble with what I think is a simple map job that uses
> an HBase table as its input.
> My mapper code is org.apache.hadoop.hbase.mapreduce.Export with a few
> SingleColumnValueFilters (i.e. a FilterList) added to the Scan object.
> The job seems to progress without any trouble at first, but after
> about 5~7 minutes, when a little over 50% of the map tasks have completed,
> I suddenly see a lot of LeaseExceptions and the job ultimately fails.
> Here's the stack trace I see on my failed tasks:
> org.apache.hadoop.hbase.regionserver.LeaseException: lease '7595201038414594449' does not exist
>     at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at
> I had a similar problem when I was scanning a particular
> region using ResultScanner in a single-threaded manner with the same
> filters mentioned above,
> but I assumed it wouldn't be a problem in mapred since it's more
> resilient to single task errors.
> I tried row caching with Scan.setCaching() and lowered the
> mapred.tasktracker.map.tasks.maximum property in hopes of reducing the
> total load on the region servers, but nothing worked.
> Could this be a filter performance problem preventing region servers
> from responding before lease expiration?
> Or maybe a long sequence of rows doesn't match my filter list and the
> lease expires before the scan finally hits one that does?
> I'm kind of new to Hadoop map-reduce and HBase, so any pointers would
> be very much appreciated.