HBase, mail # user - silently aborted scans when using hbase.client.scanner.max.result.size


Re: silently aborted scans when using hbase.client.scanner.max.result.size
Jean-Daniel Cryans 2012-07-25, 19:07
That looks nasty.

Could it be that your client doesn't know about the max result size?
Looking at ClientScanner.next() we iterate while this is true:

} while (remainingResultSize > 0 && countdown > 0 &&
nextScanner(countdown, values == null));

Let's say the region server returns fewer rows than requested, like 1240,
but the caching is set to 1241. The remaining size would still be
higher than zero and so would the countdown (its value would be 1). So
the client is going to call nextScanner(), which moves on to the next
region. If you have just one region the scan would stop there.

But that would be the case if you have 1 region and did not set the
config on the client-side.
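
To make the reasoning concrete, here is a minimal sketch of that loop condition for the single-region case. It is a simplified model, not the HBase source: the class and method names are hypothetical, sizes are counted in rows rather than bytes, every row is assumed to be the same size, and the figure of 1240 rows fitting under the server-side limit is an assumption taken from the numbers reported below.

```java
// Simplified model of the do/while condition quoted above, for one region.
// Hypothetical names; this is not the actual ClientScanner implementation.
class ScanLoopSketch {

    /**
     * Returns how many rows the client sees when scanning a single region.
     *
     * totalRows          rows between start and stop key in the region
     * caching            value of hbase.client.scanner.caching
     * serverRowsPerCall  rows that fit under the server-side max result size (assumed)
     * clientSizeBudget   client-side max result size, in rows; Long.MAX_VALUE if unset
     */
    static int simulate(int totalRows, int caching, int serverRowsPerCall, long clientSizeBudget) {
        int serverPos = 0; // next row the region server would return
        int seen = 0;      // rows delivered to the client so far
        while (true) {
            // Both counters are reset each time the client refills its cache.
            long remainingResultSize = clientSizeBudget;
            int countdown = caching;

            // One RPC: the server stops at the caching limit, at its own size
            // limit, or at the end of the region, whichever comes first.
            int n = Math.min(Math.min(caching, serverRowsPerCall), totalRows - serverPos);
            serverPos += n;
            seen += n;
            remainingResultSize -= n;
            countdown -= n;

            // Same condition as the quoted loop. When it holds, the client asks
            // for the next region's scanner; with one region, the scan is over.
            if (remainingResultSize > 0 && countdown > 0) {
                return seen;
            }
            // Otherwise the client comes back later and continues in this region.
        }
    }

    public static void main(String[] args) {
        int[] cachings = {1, 10000, 1240, 1241};
        for (int caching : cachings) {
            // Client unaware of the size limit: budget is effectively unbounded.
            int rows = simulate(1506, caching, 1240, Long.MAX_VALUE);
            System.out.println("caching=" + caching + " inputrows=" + rows);
        }
    }
}
```

Under these assumptions the model gives 1506, 1240, 1506 and 1240 rows for caching values of 1, 10000, 1240 and 1241, matching the reported runs. Passing 1240 as the last argument (a client that knows the limit) gives 1506 for all four.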

J-D

On Wed, Jul 25, 2012 at 5:04 AM, Ferdy Galema <[EMAIL PROTECTED]> wrote:
> I was experiencing aborted scans under certain conditions. In these cases I
> was simply missing so many rows that only a fraction was inputted, without
> warning. After lots of testing I was able to pinpoint and reproduce the
> error when scanning over a single region, single column family, single
> store file. So really just a single (major_compacted) storefile. I scan
> over this region using a single Scan in a local jobtracker context. (So not
> mapreduce, although this has exactly the same behaviour). Finally, I
> noticed the number of input rows is dependent on the
> hbase.client.scanner.caching property. See the following example runs, which
> scan over this region with a specific start and stop key:
>
> -Dhbase.client.scanner.caching=1
> inputrows=1506
>
> -Dhbase.client.scanner.caching=10000
> inputrows=1240
>
> -Dhbase.client.scanner.caching=1240
> inputrows=1506
>
> -Dhbase.client.scanner.caching=1241
> inputrows=1240
>
> This is weird, huh? So setting the caching to 1241 in this case aborts the
> scan silently. Removing the stoprow yields the same amount. Setting the
> caching to 1 with no stoprow yields all rows (several hundred
> thousand).
>
> Neither the client nor the regionserver log any warning whatsoever. I had
> the hbase.client.scanner.max.result.size set to 90100100. After removing
> this property it all works like a charm!!! All rows are properly inputted,
> regardless of hbase.client.scanner.caching. As an extra verification I
> checked the regionserver for warnings that I would expect without this
> property and this seems to be the case:
> 2012-07-25 11:46:52,889 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020, responseTooLarge for: next(-1937592840574159040, 10000) from x.x.x.x:39398: Size: 338.1m
> 2012-07-25 11:47:14,359 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(-1937592840574159040, 10000) from x.x.x.x:39407: Size: 186.6m
>
> So, anyone know what this could be? I am willing to debug this behaviour at
> the regionserver level, but before I do I want to make sure I am not
> running into something that has already been solved. This is
> on hbase-0.90.6-cdh3u4, using snappy.