Accumulo >> mail # dev >> Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation


Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation
The timestamp is part of the key. If two keys differ by timestamp, they are different keys. The versioning iterator filters out certain _keys_ and their values.
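
To make that concrete, here is a rough sketch in plain Python, not the Accumulo API. The Key tuple is a simplified stand-in (a real Accumulo key also carries column family, qualifier and visibility), and sort_entries and keep_latest_versions are hypothetical names. It assumes entries sort with the newest timestamp first within a cell, and that the versioning filter keeps the most recent max_versions keys per cell.

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Key(NamedTuple):
    """Simplified stand-in for a key: the timestamp is part of its identity."""
    row: str
    column: str
    timestamp: int


Entry = Tuple[Key, str]


def sort_entries(entries: Iterable[Entry]) -> List[Entry]:
    """Order by row and column, newest timestamp first within each cell."""
    return sorted(
        entries,
        key=lambda kv: (kv[0].row, kv[0].column, -kv[0].timestamp),
    )


def keep_latest_versions(sorted_entries: Iterable[Entry],
                         max_versions: int = 1) -> Iterator[Entry]:
    """Drop all but the newest max_versions keys (and their values) per cell."""
    previous_cell = None
    seen = 0
    for key, value in sorted_entries:
        cell = (key.row, key.column)
        if cell != previous_cell:
            # First key of a new cell: restart the version count.
            previous_cell = cell
            seen = 0
        seen += 1
        if seen <= max_versions:
            yield key, value


entries = [
    (Key("r1", "c", 1), "old"),
    (Key("r1", "c", 2), "new"),
    (Key("r2", "c", 5), "x"),
]
latest = list(keep_latest_versions(sort_entries(entries)))
```

The two r1 entries are different keys because their timestamps differ. The filter removes the older key along with its value; nothing is overwritten in place.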

If Accumulo allows two identical keys to be inserted, that behavior should change. In my opinion, it should arbitrarily throw away all but one key-value pair, so as to behave like a proper map.
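
A rough sketch of what I mean, in plain Python rather than Accumulo code. Keys are modeled as (row, column, timestamp) tuples and dedupe_identical_keys is a hypothetical name. Which pair survives is arbitrary as far as I'm concerned; this version happens to keep the smallest value, along the lines of Eric's sort-on-value suggestion quoted below, so the result is at least repeatable.

```python
from itertools import groupby
from typing import Iterable, List, Tuple

KeyTuple = Tuple[str, str, int]   # (row, column, timestamp), a simplified key
Entry = Tuple[KeyTuple, str]


def dedupe_identical_keys(entries: Iterable[Entry]) -> List[Entry]:
    """Keep exactly one key-value pair for each distinct key."""
    # Sorting (key, value) tuples orders by key, then by value, so the
    # survivor for each key is the pair with the smallest value.
    ordered = sorted(entries)
    return [next(group) for _, group in groupby(ordered, key=lambda kv: kv[0])]


entries = [
    (("r", "c", 1), "b"),
    (("r", "c", 1), "a"),   # identical key, different value
    (("r", "c", 2), "z"),   # different timestamp, so a different key
]
deduped = dedupe_identical_keys(entries)
```

Only the two pairs with the identical key collapse into one; the pair with timestamp 2 is a separate key and is untouched.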

On Dec 22, 2011, at 3:55 PM, Keith Turner (Commented) (JIRA) wrote:

>
>    [ https://issues.apache.org/jira/browse/ACCUMULO-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175052#comment-13175052 ]
>
> Keith Turner commented on ACCUMULO-227:
> ---------------------------------------
>
> Aaron,
>
> By default Accumulo is a map (when configured w/ the versioning iterator).  To get the map behavior you mentioned w/ aggregation, I think you could put the versioning iterator below the aggregating iterator.  Then aggregation would never see two identical keys.
>
> Without the versioning iterator, if two identical key values exist in two map files then the user will see both.  This has nothing to do w/ the in memory map.  This change just makes the behavior when the Versioning iterator is removed consistent.
>
> There is one oddity when there are two identical keys: nondeterministic behavior.  If two files have the same key value and you have the versioning iterator configured, then you may see different values for the same key at different times.  Eric suggested sorting on the value to make this deterministic.
>
>> Improve in memory map counts to provide cell level uniqueness for repeated columns in  mutation
>> -----------------------------------------------------------------------------------------------
>>
>>                Key: ACCUMULO-227
>>                URL: https://issues.apache.org/jira/browse/ACCUMULO-227
>>            Project: Accumulo
>>         Issue Type: Improvement
>>         Components: tserver
>>           Reporter: John Vines
>>           Assignee: John Vines
>>            Fix For: 1.5.0
>>
>>
>> Currently for isolation we only isolate mutations. This doesn't allow mutations with identical cells within them. We should increase the mutation counts to account for each individual cell instead of each mutation.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>