HBase >> mail # user >> Scan exception when running MR

Re: Scan exception when running MR
Cool… But my MapReduce job doesn't even start…
It fails while creating the record reader.
Per the stack trace, the record reader setup fails in TableInputFormat.setConf while reading the serialized Scan back from the configuration, and throws a
java.io.IOException: version not supported
at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:558)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:255)
at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:105)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Basically, deserialization of the Scan object from the job configuration is failing for some reason.
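For context on why this particular IOException appears: Scan.readFields first reads a version byte and rejects any version it does not recognize. Here is a minimal self-contained sketch of that versioned-serialization pattern (plain Java, no HBase dependencies; `VersionedScan` is a hypothetical stand-in for illustration, not the real class):

```java
import java.io.*;

public class VersionedScan {
    static final byte CURRENT_VERSION = 1;

    // Serialize: write a version byte first, then the payload.
    static byte[] write(String payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeByte(CURRENT_VERSION);
        out.writeUTF(payload);
        return bos.toByteArray();
    }

    // Deserialize: reject any version newer than this reader understands.
    // This is the shape of the "version not supported" failure above.
    static String read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        byte version = in.readByte();
        if (version > CURRENT_VERSION) {
            throw new IOException("version not supported");
        }
        return in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        byte[] ok = write("scan-spec");
        System.out.println(read(ok)); // round-trips fine

        byte[] bad = ok.clone();
        bad[0] = 2; // pretend a newer writer produced it
        try {
            read(bad);
        } catch (IOException e) {
            System.out.println(e.getMessage()); // "version not supported"
        }
    }
}
```

A mismatch like this usually suggests the HBase jars on the task nodes differ from the client jars that serialized the Scan into the configuration, so comparing classpath versions across the cluster is a reasonable first check.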


On 22-Oct-2012, at 9:52 PM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:

> I'm not on 0.94.1, but I've found a lot of situations that can cause
> scanner timeouts and other scanner exceptions from M/R.  The primary ones
> probably still apply in later versions:
>   - Caching or batching set too high.  If caching is set to, e.g., 1000,
>   and hbase.rpc.timeout is set to 30 seconds, your mapper needs to
>   process 1000 records in under 30 seconds (minus the overhead of
>   actually returning that many records).  Otherwise the mapper's next
>   call to next() will throw a timeout.
>   - Similar to the above, this can happen if the logic in your mapper is
>   simply too heavy and takes too long.  Keep in mind that
>   hbase.rpc.timeout is measured against the time between calls to next().
>   - hbase.rpc.timeout > hbase.regionserver.lease.period.  If this is the
>   case, the RS will time out first and kill the scan.  Your mapper's next
>   call to next() will then throw a scan/lease exception because the scan
>   no longer exists.
>   - The filters on the mapper scan are causing too many rows to be
>   skipped, such that not enough rows can be collected to return within the
>   timeout.
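To make the first point above concrete, the per-record processing budget is roughly hbase.rpc.timeout divided by the caching value. A small self-contained calculation (plain Java; the numbers are just the examples from the list above):

```java
public class ScanBudget {
    // Rough per-record time budget for a mapper: between two next() calls,
    // the RPC timeout must cover processing of every cached record.
    static long budgetMillisPerRecord(long rpcTimeoutMillis, int caching) {
        return rpcTimeoutMillis / caching;
    }

    public static void main(String[] args) {
        // hbase.rpc.timeout = 30s, scan caching = 1000  ->  30 ms per record
        System.out.println(budgetMillisPerRecord(30_000, 1000));
        // Dropping caching to 100 relaxes the budget to 300 ms per record.
        System.out.println(budgetMillisPerRecord(30_000, 100));
    }
}
```

So if each record takes longer than the budget to process, lowering scan caching (or raising hbase.rpc.timeout) is the usual remedy.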
> Hope this helps and/or is still accurate for your version.
> On Mon, Oct 22, 2012 at 11:51 AM, J Mohamed Zahoor <[EMAIL PROTECTED]> wrote:
>> I am using 0.94.1
>> ./zahoor
>> On 22-Oct-2012, at 9:17 PM, J Mohamed Zahoor <[EMAIL PROTECTED]> wrote:
>>> Hi
>>> I am facing a scanner exception like this when I run an MR job.
>>> Both the input and output are HBase tables (different tables)…
>>> It happens sporadically on some mappers while all the others run fine;
>>> even the failed mapper passes on the next attempt.
>>> Any clue on what might be wrong?
>>> java.lang.NullPointerException
>>>      at org.apache.hadoop.hbase.client.Scan.<init>(Scan.java:147)
>>>      at
>> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:123)
>>>      at
>> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:489)
>>>      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
>>>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>      at java.security.AccessController.doPrivileged(Native Method)
>>>      at javax.security.auth.Subject.doAs(Subject.java:415)
>>>      at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>>>      at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> ./zahoor