Re: deletion technique question
Keith Turner 2013-05-13, 16:23
On Mon, May 13, 2013 at 11:24 AM, Marc Reichman <[EMAIL PROTECTED]> wrote:
> The 1.5 solution looks nice.
> I'm aware of the potential data loss angle, and the sort ordering is also an
> interesting angle; thank you.
> In my particular case where I may not necessarily be aware of all
> permutations of column visibility of a given key but want to replace them
> all with a particular new visibility with the same data, how would I go
> about that? Is there a way to use a BatchScanner (step 1 of the
> BatchDeleter approach) to pull down all the permutations, then issue
> putDeletes for them and put what I want?
No. It's like you said: you will only see entries based on the auths you
give the scanner. There is no way to turn off colvis checking in a scan.
Using the transforming iterator from ACCUMULO-956 at compaction time is
a nice option, because all data, whatever its colvis, passes through
iterators at compaction time.
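
To make the scan limitation concrete, here is a toy model in plain Java. It does not use the Accumulo API: the `Entry` class, the `scan` method, and the single-label visibility check are all assumptions made for illustration (real colvis expressions can combine labels with `&` and `|`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names come from the Accumulo API.
class VisibilityScanSketch {

    // One stored entry: a key, a single visibility label, and a value.
    static final class Entry {
        final String key;
        final String visibility;
        final String value;

        Entry(String key, String visibility, String value) {
            this.key = key;
            this.visibility = visibility;
            this.value = value;
        }
    }

    // Returns only the entries whose label is among the scanner's auths.
    // There is deliberately no switch to bypass the check.
    static List<Entry> scan(List<Entry> table, Set<String> auths) {
        List<Entry> visible = new ArrayList<>();
        for (Entry e : table) {
            if (auths.contains(e.visibility)) {
                visible.add(e);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Same key stored under three different visibilities.
        List<Entry> table = Arrays.asList(
            new Entry("row1 fam:qual", "public", "v"),
            new Entry("row1 fam:qual", "secret", "v"),
            new Entry("row1 fam:qual", "admin", "v"));

        Set<String> auths = new HashSet<>(Arrays.asList("public", "secret"));
        List<Entry> seen = scan(table, auths);

        System.out.println("stored=" + table.size() + " seen=" + seen.size());
    }
}
```

In this model the entry labelled `admin` never reaches the client, so a delete pass driven by the scan results would leave it behind.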
> In my case I'm pulling one copy of the data down first to verify I have it
> at the user's current scan auth, then using the #1 approach to clear it out
> and then put it in again as the vis I need.
This is a good way to do it. You could possibly clone the table instead of
pulling a copy down.
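
One caveat for any read-then-rewrite pass is the sort-order point from the earlier message quoted below: a rewritten key that sorts after the original can land in front of the scanner's pointer and be read again. Here is a toy sketch of that effect in plain Java. It is not the Accumulo API: the string keys, the `[vis]` suffix, and the `scanAndRewrite` method are assumptions for illustration, and there is no buffering, so the re-read happens every time rather than occasionally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: a TreeMap stands in for a sorted table, and each key is a
// string of the form "row fam:qual [vis]" so the visibility takes part in sorting.
class ColvisSortOrderSketch {

    // Walks the table in sorted order. Every entry carrying oldVis is deleted
    // and re-inserted under newVis. Returns the keys in the order they were read.
    static List<String> scanAndRewrite(TreeMap<String, String> table,
                                       String oldVis, String newVis) {
        List<String> read = new ArrayList<>();
        String oldSuffix = "[" + oldVis + "]";

        // The "scanner pointer": advanced with higherKey so it sees live inserts.
        String pointer = table.isEmpty() ? null : table.firstKey();
        while (pointer != null) {
            read.add(pointer);
            if (pointer.endsWith(oldSuffix)) {
                String value = table.remove(pointer);
                String base = pointer.substring(0, pointer.length() - oldSuffix.length());
                table.put(base + "[" + newVis + "]", value);
            }
            pointer = table.higherKey(pointer);
        }
        return read;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("row1 fam:q1 [a]", "v1");
        table.put("row2 fam:q1 [a]", "v2");

        // "[b]" sorts after "[a]", so each rewritten key lands ahead of the pointer.
        List<String> read = scanAndRewrite(table, "a", "b");
        for (String key : read) {
            System.out.println("read " + key);
        }
    }
}
```

Rewriting in the other direction (a new colvis that sorts before the old one) places the new keys behind the pointer in this model, so they are not re-read. Either way, the rewrite logic should tolerate seeing an already-transformed key.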
> On Mon, May 13, 2013 at 10:05 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
>> On Fri, May 10, 2013 at 12:39 PM, Marc Reichman <[EMAIL PROTECTED]> wrote:
>>> I have a table with rows which have 3 column values in one column
>>> family, and a column visibility.
>>> There are situations where I will want to replace the row content with a
>>> new column visibility; I understand that the visibility attributes are
>>> immutable, so I will have to delete and re-put.
>>> Am I better off doing:
>>> 1. BatchDeleter with authorizations to allow access, set range to the
>>> key in question, call delete, and then put in mutations with the new
>>> visibility
>>> 2. Create mutations with a putDelete followed by a put with the new
>>> visibility for each value
>>> 3. Something else entirely?
>> In 1.5, you can use ACCUMULO-956
>>> For option #2, can I simply do a putDelete on the column
>>> family/qualifier? Or do I need to "know" the old authorizations to put in a
>>> visibility expression with the putDelete?
>>> For all of these, can a client get up-to-the-minute results immediately
>>> after? Or does some kind of compaction need to occur first?
>> If you send a mutation with a delete and put, the client will be able to
>> see it after the BatchWriter flushes or closes. No compaction needed.
>> I am a little fuzzy on #1. Will you delete everything in one pass (using
>> BatchDeleter), and then do another pass writing data w/ updated colvis? If
>> so, this would seem to imply that you are pulling the data from another
>> source (other than the table stuff was deleted from)?
>> Make sure the method you choose is not susceptible to data loss in the
>> event that the client dies. For example, suppose a client was reading a table
>> and then writing delete and update mutations for each key/val read. If
>> the client died and some deletes were written, but not the corresponding
>> updates, then that data would not be seen to be transformed on the second
>> When you change the colvis, you change the sort order. Suppose you read a
>> key K and change it to K', where K' sorts after K. If you insert K', it's
>> possible that you may read it, since it is being inserted in front of the
>> scanner's pointer. Because of buffering in the batch writer and scanner, this
>> would not always occur, but it would occur occasionally. Something to be aware