I have a use case in which I'm investigating setting a locality group on
every column family in a table that has very "dense" rows (many columns
appear within the same tablet).
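For reference, this is roughly the configuration I'm describing, via
TableOperations (the table and family names here are just placeholders):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.hadoop.io.Text;

    void configureOneGroupPerFamily(Connector conn, String table, String... families)
        throws Exception {
      // One locality group per column family.
      Map<String, Set<Text>> groups = new HashMap<>();
      for (String cf : families) {
        groups.put("lg_" + cf, Collections.singleton(new Text(cf)));
      }
      conn.tableOperations().setLocalityGroups(table, groups);
      // Existing RFiles keep their old layout until rewritten, so compact
      // to apply the new grouping to data already in the table.
      conn.tableOperations().compact(table, null, null, true, true);
    }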
When scanning over a single column, I see a slowdown, as one might expect
(the scan has to filter out all the columns I don't care about). Setting
each column into its own locality group helps speed things up again for
that single column.
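Concretely, the single-column scan I'm timing looks like this (again,
names are placeholders):

    import java.util.Map;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;
    import org.apache.hadoop.io.Text;

    void scanSingleFamily(Connector conn, String table, String family)
        throws Exception {
      // Without locality groups, every key/value in the row gets read and
      // filtered server-side; with one group per family, only that group's
      // section of each RFile should need to be opened.
      Scanner scanner = conn.createScanner(table, Authorizations.EMPTY);
      scanner.fetchColumnFamily(new Text(family));
      for (Map.Entry<Key, Value> entry : scanner) {
        // consume results
      }
    }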
I'm curious whether anyone has any insight into when/if I'm going to start
paying a penalty for having many locality groups. Glancing back over
RFile.Reader, it looks like I have to read each LocalityGroupMetadata and
its multi-level index (which shouldn't be bad, if I remember Keith's talks
correctly), and then I should get log(n) reads across the locality groups
I need to open.
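To make sure I'm reasoning about it correctly, here's my back-of-the-envelope
model of the open cost (purely my mental model, not actual RFile code):

    // Hypothetical cost model: every group's metadata is read at open time,
    // but only the groups the scan actually fetches pay the index seeks.
    long estimatedOpenCost(int totalGroups, int groupsFetched, long keysPerGroup) {
      long metadataReads = totalGroups;  // linear in the number of groups
      long indexSeeks = groupsFetched
          * (long) Math.ceil(Math.log(keysPerGroup) / Math.log(2));  // ~log(n) each
      return metadataReads + indexSeeks;
    }

If that's right, the linear metadata term is the only piece that grows with
the total number of groups, and it only bites at file-open time.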
Is the same true for writing data to a table with many locality groups?
Nothing terrible pops out at me from looking at the code.
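For completeness, the write path I'm talking about is just a plain
BatchWriter; as far as I can tell the client side is identical regardless
of grouping, and the partitioning into groups happens server-side when
files get written:

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    void writeDenseRow(Connector conn, String table) throws Exception {
      // Mutations don't reference locality groups at all; the grouping is
      // applied when the data lands in RFiles.
      BatchWriter bw = conn.createBatchWriter(table, new BatchWriterConfig());
      Mutation m = new Mutation(new Text("row1"));
      m.put(new Text("cf1"), new Text("cq1"), new Value("v1".getBytes()));
      m.put(new Text("cf2"), new Text("cq1"), new Value("v2".getBytes()));
      bw.addMutation(m);
      bw.close();
    }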
I was planning to write some tests to try to simulate this, but figured I
would poll the community as well to see if anyone has experimented in this
area already.
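This is the shape of the test I had in mind, in case anyone wants to
compare notes (entirely hypothetical harness; assumes tables lg_test_N
were pre-loaded with identical data under N locality groups):

    import java.util.Map;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;
    import org.apache.hadoop.io.Text;

    void timeScans(Connector conn) throws Exception {
      // Time the same single-family scan against tables that differ only
      // in how many locality groups they were configured with.
      for (int numGroups : new int[] {1, 10, 100, 1000}) {
        String table = "lg_test_" + numGroups;
        long start = System.nanoTime();
        Scanner s = conn.createScanner(table, Authorizations.EMPTY);
        s.fetchColumnFamily(new Text("cf0"));
        long count = 0;
        for (Map.Entry<Key, Value> e : s) {
          count++;
        }
        System.out.printf("%d groups: %d entries in %.1f ms%n",
            numGroups, count, (System.nanoTime() - start) / 1e6);
      }
    }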