Accumulo, mail # dev - Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation


Aaron Cordova 2011-12-22, 21:25
and by Fucsh I mean Fuchs of course ..

On Dec 22, 2011, at 4:22 PM, Aaron Cordova wrote:

> by "byte pairs" I mean byte arrays .. of course ...
>
> On Dec 22, 2011, at 4:20 PM, Aaron Cordova wrote:
>
>> _You_ can think of it that way, cause you're Adam Fucsh, distributed database expert extraordinaire, but that's not how the BigTable data model was described by the original authors - "BigTable is a sparse, sorted, distributed, multidimensional map", and most users do understand Accumulo to be a map of keys to values where the keys are made up of a row, colfam, colqual, colvis, and timestamp and the values are arbitrary byte pairs.
>>
>> To start explaining to people that Accumulo is a multi-map, or to actually make it into a multi-map (i.e. allowing identical keys, where a key includes the timestamp), would be a mistake, in my opinion.
>>
>>
>> On Dec 22, 2011, at 4:09 PM, Adam Fuchs wrote:
>>
>>> Sorry, I thought we were talking about users' perceptions of semantics.
>>> Bigtable also supports holding multiple versions of key/value pairs, so it
>>> can be thought of as having an underlying multi-map as well.
>>>
>>> Adam
>>>
>>>
>>> On Thu, Dec 22, 2011 at 4:04 PM, Aaron Cordova <[EMAIL PROTECTED]> wrote:
>>>
>>>>
>>>> On Dec 22, 2011, at 4:00 PM, Adam Fuchs wrote:
>>>>
>>>>> Timestamp doesn't usually make
>>>>> it into the uniqueness concept, from a user's perspective, even though that
>>>>> affects the sort order of Keys. In fact, most users let Accumulo set the
>>>>> timestamp for them. I think your definition of uniqueness takes timestamp
>>>>> into account, and from that perspective what we're doing is sort of like
>>>>> providing a finer grained timestamp instead of using one timestamp for an
>>>>> entire Mutation (or for all Mutations that show up within a millisecond).
>>>>
>>>> Timestamps do define separate keys. This is not just my definition - this
>>>> is in the BigTable design as well as HBase's, and likely every other
>>>> BigTable clone.
>>>>
>>>>
>>>>
>>
>
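
The map versus multi-map distinction argued over in this thread can be illustrated with a small sketch. It is not Accumulo code: the `Key` tuple, `sort_order`, `as_map`, and `as_multimap` are hypothetical names made up for illustration, and the last-write-wins behaviour of the map view is an assumption of the sketch, not a statement about how Accumulo resolves identical keys. It only shows that entries differing in timestamp are separate keys, while entries identical down to the timestamp collapse in a map and survive in a multi-map.

```python
from typing import NamedTuple


class Key(NamedTuple):
    """Hypothetical stand-in for a BigTable-style key (not the Accumulo class)."""
    row: bytes
    colfam: bytes
    colqual: bytes
    colvis: bytes
    timestamp: int


def sort_order(key: Key):
    # Ascending on row/colfam/colqual/colvis, newest timestamp first.
    return (key.row, key.colfam, key.colqual, key.colvis, -key.timestamp)


def as_map(entries):
    """Map semantics: one value per full key, timestamp included.

    Assumption for this sketch: a later write to an identical key
    replaces the earlier one.
    """
    table = {}
    for key, value in entries:
        table[key] = value
    return sorted(table.items(), key=lambda kv: sort_order(kv[0]))


def as_multimap(entries):
    """Multi-map semantics: identical keys are all retained."""
    # sorted() is stable, so identical keys keep their insertion order.
    return sorted(entries, key=lambda kv: sort_order(kv[0]))


entries = [
    (Key(b"r1", b"cf", b"cq", b"", 5), b"a"),
    (Key(b"r1", b"cf", b"cq", b"", 5), b"b"),  # identical key, timestamp included
    (Key(b"r1", b"cf", b"cq", b"", 6), b"c"),  # same columns, different timestamp
]

map_view = as_map(entries)            # two entries: timestamps 6 and 5
multimap_view = as_multimap(entries)  # three entries: both timestamp-5 cells kept
```

In the map view the two timestamp-5 writes collapse into one cell, while the timestamp-6 write remains a distinct key. The finer-grained timestamp idea mentioned in the thread would, under this model, give each repeated column in a mutation its own key instead of introducing duplicates.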