Accumulo, mail # user - Is anyone using serialized iterators to provide provenance data?


David Medinets 2013-05-15, 18:51
Christopher 2013-05-15, 20:25
David Medinets 2013-05-16, 00:44
Josh Elser 2013-05-16, 00:58
David Medinets 2013-05-16, 22:06
Re: Is anyone using serialized iterators to provide provenance data?
Christopher 2013-05-16, 01:15
Seems to me this is nothing more than "clone and also add these
per-table iterators on all scopes". Might be a neat little utility to
wrap those features into a single step from the user's perspective.
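
A rough sketch of what such a utility might compute, assuming the
provenance record is just the iterator stack expanded into the usual
per-table properties (table.iterator.<scope>.<name> = priority,class,
plus .opt.<key> entries) for the scan, minc, and majc scopes. The class
IteratorProvenance, the Step type, and com.example.GenomeSubsetFilter
are hypothetical names made up for illustration, not part of Accumulo.
The resulting map is the kind of thing that could be handed to the
clone call as the properties to set, and also written to a catalog as
the description of F.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of Accumulo. */
final class IteratorProvenance {

    /** The three per-table iterator scopes: scan, minor and major compaction. */
    static final List<String> SCOPES = Arrays.asList("scan", "minc", "majc");

    /** One entry in the iterator stack (one piece of the function F). */
    static final class Step {
        final int priority;
        final String name;
        final String className;
        final Map<String, String> options;

        Step(int priority, String name, String className, Map<String, String> options) {
            this.priority = priority;
            this.name = name;
            this.className = className;
            this.options = new TreeMap<>(options); // sorted copy for stable output
        }
    }

    /** Expands an iterator stack into per-table properties for every scope. */
    static Map<String, String> toTableProperties(List<Step> stack) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String scope : SCOPES) {
            for (Step s : stack) {
                String key = "table.iterator." + scope + "." + s.name;
                props.put(key, s.priority + "," + s.className);
                for (Map.Entry<String, String> opt : s.options.entrySet()) {
                    props.put(key + ".opt." + opt.getKey(), opt.getValue());
                }
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> opts = new TreeMap<>();
        opts.put("negate", "true"); // illustrative option only
        List<Step> stack = Arrays.asList(
            new Step(30, "subset", "com.example.GenomeSubsetFilter", opts));
        // Print the serialized form, one property per line.
        toTableProperties(stack).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Keeping the serialized form identical to the table properties means the
same text serves as both the configuration applied to the clone and the
record of how table B was derived from table A.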

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii
On Wed, May 15, 2013 at 8:58 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
> Oh, I see what you mean. Table B was created from table A with a function F
> (where F is some collection of iterators like you said).
>
> It could be a neat application of the clone command. Storing that
> information on table B is some exercise in where to put that immutable
> information (that's me ignoring that problem :P).
>
> You say git: do you actually intend to have a cheap replay ability? Or
> merely be able to view the history and be able to work through the
> transformations again?
>
> Seems reasonable for a 1.6 wish to me.
>
>
> On 05/15/2013 08:44 PM, David Medinets wrote:
>>
>> I don't see those as covering the same ground. Let's say I have an
>> Accumulo table for a given human's genome. As a scientist, I want to apply a
>> set of filters to create a subset of the genome. This provides a transform
>> from data-set A to data-set B. Since iterators were used for the transform,
>> we could serialize the set of iterators used by the transformation. Both
>> data-sets are immutable. Think git for data-sets.
>>
>>
>> On Wed, May 15, 2013 at 4:25 PM, Christopher <[EMAIL PROTECTED]
>> <mailto:[EMAIL PROTECTED]>> wrote:
>>
>>     I think this might relate to ACCUMULO-1397, in the form of providing a
>>     mechanism to specify iterator profiles, or ACCUMULO-415.
>>
>>     --
>>     Christopher L Tubbs II
>>     http://gravatar.com/ctubbsii
>>
>>
>>     On Wed, May 15, 2013 at 2:51 PM, David Medinets
>>     <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote:
>>     > If you apply a set of iterators to one table to produce another, it seems
>>     > possible to serialize the iterator stack alongside the new table in some
>>     > catalog to provide provenance. The assumption is that the tables are
>>     > immutable, I think. Is anyone doing this or has anyone thought about doing
>>     > so? Just curious and wanted to ask before I forgot about the idea.
>>
>>
>
Keith Turner 2013-05-16, 20:56