Vidhyashankar Venkatarama... 2011-04-13, 07:40
Ted Yu 2011-04-13, 08:44
Vidhyashankar Venkatarama... 2011-04-13, 14:44
Gary Helmling 2011-04-13, 16:15
Jean-Daniel Cryans 2011-04-13, 16:42
Gary Helmling 2011-04-13, 17:05
Jean-Daniel Cryans 2011-04-13, 17:18
Vidhyashankar Venkatarama... 2011-04-13, 17:03
Gary Helmling 2011-04-13, 17:14
Himanshu Vashishtha 2011-04-13, 15:43
Vidhyashankar Venkatarama... 2011-04-13, 17:47
Vidhya, so yes, in the case of huge files with valid rows the time range
approach will not be effective, nor will it help when a scanner hangs in its
next calls, whether because of a GC pause or some exhaustive computation. I
voted for this answer after reading your initial mail (but it got posted after
a delay of 3 hrs, don't know why), and a lot of other facts were revealed
during that time frame :)), like JIRA 2077.
Good learning for me though :)
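
To make the limitation concrete, here is a minimal sketch of time-range based
store file selection. It is a simplified model, not HBase code: the class
names, the per-file min/max timestamps and the half-open [min, max) scan range
are assumptions for illustration. On the client side the range would normally
be set with Scan.setTimeRange(min, max).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model (not HBase code) of skipping store files by time range.
final class TimeRangeSketch {

    // Hypothetical stand-in for a store file that tracks the smallest and
    // largest cell timestamp it contains.
    static final class StoreFile {
        final String name;
        final long minTs;
        final long maxTs;

        StoreFile(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }

        // True if [minTs, maxTs] overlaps the half-open range [scanMin, scanMax).
        boolean mayContain(long scanMin, long scanMax) {
            return maxTs >= scanMin && minTs < scanMax;
        }
    }

    // Returns the names of the files a scan over [scanMin, scanMax) must read.
    static List<String> selectFiles(List<StoreFile> files, long scanMin, long scanMax) {
        List<String> selected = new ArrayList<>();
        for (StoreFile f : files) {
            if (f.mayContain(scanMin, scanMax)) {
                selected.add(f.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<StoreFile> files = Arrays.asList(
                new StoreFile("recent", 900L, 1000L),
                new StoreFile("old", 0L, 100L),    // only old cells: skipped
                new StoreFile("wide", 0L, 1000L)); // mostly old cells, still read
        System.out.println(selectFiles(files, 950L, 1001L));
    }
}
```

The "old" file is skipped entirely, but the "wide" file is still opened and
scanned cell by cell even if nearly all of its cells fall outside the range,
which is the huge-file case above.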
On Wed, Apr 13, 2011 at 11:47 AM, Vidhyashankar Venkataraman <
[EMAIL PROTECTED]> wrote:
> Thanks, this will resolve the particular case we ran into. But what if
> the files are huge and have a wide range of timestamps and only some of the
> records in the file are valid? And for the other application that we have,
> scans with filters that return a sparse set, the solution may not help.
> Further, it won't solve the underlying problem. When a scanner is busy,
> but doesn't have any rows to return "yet", neither the client nor the region
> server should mistake it for an unresponsive scanner.
> On 4/13/11 8:43 AM, "Himanshu Vashishtha" <[EMAIL PROTECTED]> wrote:
> Did you try setting the scanner time range? It takes min and max timestamps;
> when the scanner is instantiated at the RS, time-based filtering is done to
> include only selected store files. Have a look at
> Sortedset<byte). I think it should improve the response time.
> On Wed, Apr 13, 2011 at 8:44 AM, Vidhyashankar Venkataraman <
> [EMAIL PROTECTED]> wrote:
> > Hi
> > We had enabled scanner caching but I don't think it is the same issue
> > because scanner.next in this case is blocking: the scanner is busy in the
> > region server but hasn't returned anything yet since a row to be returned
> > hasn't been found yet (all rows have expired but are still there since
> > they haven't been compacted yet).
> > Vidhya
> > On 4/13/11 1:44 AM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
> > Have you read the following thread?
> > "ScannerTimeoutException when a scan enables caching, no exception when
> > doesn't"
> > Did you enable caching? If not, it is a different issue.
> > On Wed, Apr 13, 2011 at 12:40 AM, Vidhyashankar Venkataraman <
> > [EMAIL PROTECTED]> wrote:
> > > (This could be a known issue. Please let me know if it is).
> > >
> > > We had a set of uncompacted store files in a region. One of the column
> > > families had a store file of 5 Gigs. The other column families were pretty
> > > small (a few megabytes at most).
> > >
> > > It so turned out that all these files had rows whose TTL had expired. Now
> > > when this region was scanned (which should yield a null result), we
> > > got Scanner timeouts and UnknownScannerExceptions.
> > >
> > > And when we tried scanning the region without the large column family, the
> > > scanner returned safely with no result.
> > >
> > > So, I major compacted it and the scan started working correctly.
> > >
> > > So it looks like timeouts happen if the scanner does not return any output
> > > for a specified time.
> > > Which isn't exactly the correct thing to do, because it could be
> > > that the scanner was indeed busy but it just so happened that there are no
> > > rows yet to return to the client.
> > >
> > > We can try increasing the scanner timeout, but this doesn't resolve the
> > > underlying problem. Is this a known issue?
> > >
> > > Thank you
> > > Vidhya
> > >