Accumulo user mailing list: m.putDelete versus RowDeletingIterator?


David Medinets 2013-10-09, 20:01
Eric Newton 2013-10-09, 20:21
Re: m.putDelete versus RowDeletingIterator?
On Wed, Oct 9, 2013 at 4:21 PM, Eric Newton <[EMAIL PROTECTED]> wrote:

> They do different things.
>
> Deleting mutations marks each entry with a delete marker.  Using the
> iterator marks a whole row with a single mutation.
>
> If you have a million entries in your row, the iterator is faster for
> the delete, but requires a seek to the start of the row for every
> read, so reads are slower.
>
> If your row has one entry, they are the same thing.
>
> Somewhere under N keys... the mutation path will be quite fast, and
> still preserve your reading speed.  I'll just pull a number out of
> thin air... let's say a few thousand.
>

The iterator may still be useful even if rows have few columns, because a
row can be deleted without reading it.  With m.putDelete() you may need to
read the row and insert a delete for each column value.  If you know which
columns to delete, then you can avoid the read.
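
To make the read-before-delete point concrete, here is a toy in-memory
model in Java.  It is not the Accumulo API: the class DeletePathsSketch,
its methods, and the "DEL_ROW" marker value are all hypothetical names
chosen for illustration, and a per-column delete is modeled as a plain
removal rather than a real delete marker.  It only counts how many reads
and writes each path needs when the columns are not known up front.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy in-memory model of the two delete paths. This is NOT the Accumulo API. */
public class DeletePathsSketch {
    /** Hypothetical marker value; a row carrying it is hidden at read time. */
    public static final String ROW_MARKER = "DEL_ROW";

    /** Sorted store: key = row + '\0' + column, so a row's entries are contiguous. */
    public final TreeMap<String, String> table = new TreeMap<>();
    public long reads = 0;
    public long writes = 0;

    public void put(String row, String column, String value) {
        table.put(row + '\0' + column, value);
        writes++;
    }

    /** Scans one row and returns its column names, counting one read per entry. */
    public List<String> scanColumns(String row) {
        List<String> columns = new ArrayList<>();
        for (String key : table.subMap(row + '\0', row + '\1').keySet()) {
            reads++;
            columns.add(key.substring(row.length() + 1));
        }
        return columns;
    }

    /** Per-column path: the row must be read first to learn which columns exist. */
    public void deleteByColumns(String row) {
        for (String column : scanColumns(row)) {
            table.remove(row + '\0' + column); // stands in for one delete entry
            writes++;
        }
    }

    /** Row-marker path: one blind write at the start of the row, no read. */
    public void deleteByRowMarker(String row) {
        table.put(row + '\0', ROW_MARKER);
        writes++;
    }

    /** Read path: check the start of the row for a marker before returning columns. */
    public List<String> visibleColumns(String row) {
        if (ROW_MARKER.equals(table.get(row + '\0'))) {
            return new ArrayList<>();
        }
        return scanColumns(row);
    }

    private static DeletePathsSketch sample() {
        DeletePathsSketch t = new DeletePathsSketch();
        t.put("row1", "colA", "v");
        t.put("row1", "colB", "v");
        t.put("row1", "colC", "v");
        t.reads = 0;
        t.writes = 0;
        return t;
    }

    public static void main(String[] args) {
        DeletePathsSketch perColumn = sample();
        perColumn.deleteByColumns("row1");
        System.out.println("per-column: reads=" + perColumn.reads + " writes=" + perColumn.writes);

        DeletePathsSketch rowMarker = sample();
        rowMarker.deleteByRowMarker("row1");
        System.out.println("row marker: reads=" + rowMarker.reads + " writes=" + rowMarker.writes);
    }
}
```

The marker check in visibleColumns() is the model's version of the extra
look at the start of the row that Eric mentions as the read-side cost.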

Suppose I have 10M rows to delete, each row having 10 unpredictable columns.
With the iterator I can batch write 10M row deletion mutations.  Without
the iterator I do 10M seeks and 100M reads, and write 100M deletes.
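
The arithmetic behind those figures, as a small Java sketch.  The class
and method names are hypothetical, and it assumes one seek per row, one
read per column, and one delete per column on the mutation path, which is
how the numbers above were counted.

```java
/** Back-of-the-envelope operation counts for the two delete paths. */
public class DeleteCostEstimate {
    /** Iterator path: one row-deletion mutation per row, no reads. */
    public static long iteratorWrites(long rows) {
        return rows;
    }

    /** Mutation path: one seek per row to find its columns. */
    public static long mutationSeeks(long rows) {
        return rows;
    }

    /** Mutation path: every column of every row is read once. */
    public static long mutationReads(long rows, long columnsPerRow) {
        return Math.multiplyExact(rows, columnsPerRow);
    }

    /** Mutation path: one delete is written for each column read. */
    public static long mutationDeletes(long rows, long columnsPerRow) {
        return Math.multiplyExact(rows, columnsPerRow);
    }

    public static void main(String[] args) {
        long rows = 10_000_000L;
        long columnsPerRow = 10L;
        System.out.println("iterator writes:  " + iteratorWrites(rows));
        System.out.println("mutation seeks:   " + mutationSeeks(rows));
        System.out.println("mutation reads:   " + mutationReads(rows, columnsPerRow));
        System.out.println("mutation deletes: " + mutationDeletes(rows, columnsPerRow));
    }
}
```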
>
> -Eric
>
>
>
> On Wed, Oct 9, 2013 at 4:01 PM, David Medinets <[EMAIL PROTECTED]>
> wrote:
> > Is there any reason to favor one approach over the other?
>