Full table scan from random starting point?
Robert Dyer 2014-01-31, 22:17
Let's say I have one client on each of my regionservers. Each client needs
to do a full scan on the same table. The order in which the rows are
scanned by clients does not matter.
Is it possible to have each client start at a random point in the table
(or, better, at the first row located on its local RS), so that if I start
all of them at once they don't all peg the same RS for reads?
Example (to keep it simple, assume 3 RS):
RS1: rows 1-2
RS2: rows 3-4
RS3: rows 5-6
client1 (on RS1) reads rows: 1, 2, 3, 4, 5, 6
client2 (on RS2) reads rows: 3, 4, 5, 6, 1, 2
client3 (on RS3) reads rows: 5, 6, 1, 2, 3, 4
Obviously they may progress at different rates and still wind up hitting
the same RSs, but at least we can start out a bit more distributed.
Is this easily possible, without first obtaining a list of all rows and
manually batching them up?
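
To make the ordering in the example concrete, here is a minimal sketch of the
rotation I have in mind. It is plain Python with no HBase calls; it assumes
the sorted region start keys have already been fetched from the cluster
somehow and that each client knows which region is local to it. The name
rotated_scan_ranges and the local_index parameter are hypothetical, made up
for illustration only. Each returned pair would become the start row and stop
row of one scan, run in order.

```python
from typing import List, Optional, Tuple

# A scan range is (start_row, stop_row). A stop_row of None means
# "scan to the end of the table".
ScanRange = Tuple[bytes, Optional[bytes]]


def rotated_scan_ranges(region_start_keys: List[bytes],
                        local_index: int) -> List[ScanRange]:
    """Return one scan range per region, beginning with the local region
    and wrapping around to the start of the table.

    region_start_keys: start key of every region (hypothetical input; the
        first region's start key is the empty byte string).
    local_index: position of the client's local region in sorted key order.
    """
    keys = sorted(region_start_keys)
    n = len(keys)
    if n == 0:
        return []
    if not 0 <= local_index < n:
        raise ValueError("local_index is out of range")

    ranges: List[ScanRange] = []
    for offset in range(n):
        i = (local_index + offset) % n
        # The stop row of a region is the start key of the next region;
        # the last region has no stop row.
        stop = keys[i + 1] if i + 1 < n else None
        ranges.append((keys[i], stop))
    return ranges


# The three-RS example above: regions start at "", "3" and "5".
start_keys = [b"", b"3", b"5"]
client2_order = rotated_scan_ranges(start_keys, 1)
# client2_order == [(b"3", b"5"), (b"5", None), (b"", b"3")]
```

This only needs the region boundaries, not a list of all rows, which is why I
am hoping something like it is already possible.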