On Wed, Oct 9, 2013 at 4:21 PM, Eric Newton <[EMAIL PROTECTED]> wrote:
> They do different things.
> Deleting with mutations marks each entry with its own delete marker.
> Using the iterator marks a whole row with a single mutation.
> If you have a million entries in your row, the iterator is faster for
> the delete, but requires a seek to the start of the row for every
> read, so reads are slower.
> If your row has one entry, they are the same thing.
> Somewhere under N keys... the mutation path will be quite fast, and
> still preserve your reading speed. I'll just pull a number out of
> thin air... let's say a few thousand.
The iterator may still be useful even if rows have few columns, because a
row can be deleted without reading it. With m.putDelete() you may need to
read the row and insert a delete for each column value. If you know which
columns to delete, you can avoid the read.
Suppose I have 10M rows to delete, each row having 10 unpredictable columns.
With the iterator I can batch-write 10M row-deletion mutations. Without
the iterator I do 10M seeks and 100M reads, and then write 100M deletes.
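
To make the arithmetic explicit, here is a rough cost model for the two
approaches. It is only a sketch: the class and method names are made up for
illustration, it does not call Accumulo at all, and it assumes every row has
the same number of columns and that none of them are known ahead of time.

```java
// Back-of-the-envelope operation counts for the two delete strategies.
// Hypothetical names; this is plain arithmetic, not Accumulo API code.
final class DeleteCost {
    final long seeks;   // row lookups needed to discover the columns
    final long reads;   // entries read back from the table
    final long writes;  // delete entries (or row-level mutations) written

    DeleteCost(long seeks, long reads, long writes) {
        this.seeks = seeks;
        this.reads = reads;
        this.writes = writes;
    }

    // Row-deleting iterator: one row-level delete mutation per row, no reads.
    static DeleteCost withIterator(long rows) {
        return new DeleteCost(0L, 0L, rows);
    }

    // putDelete with unknown columns: seek to each row, read every column,
    // then write one delete per column that was read.
    static DeleteCost withPutDelete(long rows, long columnsPerRow) {
        long entries = Math.multiplyExact(rows, columnsPerRow);
        return new DeleteCost(rows, entries, entries);
    }

    @Override
    public String toString() {
        return "seeks=" + seeks + " reads=" + reads + " writes=" + writes;
    }

    public static void main(String[] args) {
        long rows = 10_000_000L;   // 10M rows, as in the example above
        long columnsPerRow = 10L;  // 10 unpredictable columns per row
        System.out.println("iterator:  " + withIterator(rows));
        System.out.println("putDelete: " + withPutDelete(rows, columnsPerRow));
    }
}
```

This counts only the delete-side work. It ignores the extra seek the iterator
adds to every later read, which is the trade-off Eric describes above.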
> On Wed, Oct 9, 2013 at 4:01 PM, David Medinets <[EMAIL PROTECTED]> wrote:
> > Is there any reason to favor one approach over the other?