Accumulo >> mail # dev >> Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation


Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation
by "byte pairs" I mean byte arrays .. of course ...

On Dec 22, 2011, at 4:20 PM, Aaron Cordova wrote:

> _You_ can think of it that way, 'cause you're Adam Fuchs, distributed database expert extraordinaire, but that's not how the BigTable data model was described by the original authors: "BigTable is a sparse, sorted, distributed, multidimensional map". Most users do understand Accumulo to be a map of keys to values, where the keys are made up of a row, colfam, colqual, colvis, and timestamp, and the values are arbitrary byte pairs.
>
> To start explaining to people that Accumulo is a multi-map, or to actually make it into a multi-map (i.e. allowing identical keys, where a key includes the timestamp), would be a mistake, in my opinion.
>
>
> On Dec 22, 2011, at 4:09 PM, Adam Fuchs wrote:
>
>> Sorry, I thought we were talking about users' perceptions of semantics.
>> Bigtable also supports holding multiple versions of key/value pairs, so it
>> can be thought of as having an underlying multi-map as well.
>>
>> Adam
>>
>>
>> On Thu, Dec 22, 2011 at 4:04 PM, Aaron Cordova <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> On Dec 22, 2011, at 4:00 PM, Adam Fuchs wrote:
>>>
>>>> Timestamp doesn't usually make
>>>> it into the uniqueness concept, from a user's perspective, even though that
>>>> affects the sort order of Keys. In fact, most users let Accumulo set the
>>>> timestamp for them. I think your definition of uniqueness takes timestamp
>>>> into account, and from that perspective what we're doing is sort of like
>>>> providing a finer grained timestamp instead of using one timestamp for an
>>>> entire Mutation (or for all Mutations that show up within a millisecond).
>>>
>>> Timestamps do define separate keys. This is not just my definition - this
>>> is in the BigTable design as well as HBase's, and likely every other
>>> BigTable clone.
>>>
>>>
>>>
>
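To make the map versus multi-map distinction in this thread concrete, here is a minimal sketch. It is a toy model, not Accumulo code: the `Key` tuple, `write_map`, and `write_multimap` are hypothetical names, and the only thing taken from the thread is the list of key fields (row, colfam, colqual, colvis, timestamp). It assumes that in a plain map an identical key replaces the earlier value, while a multi-map keeps both.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a key; field names follow the thread.
Key = namedtuple("Key", ["row", "colfam", "colqual", "colvis", "timestamp"])


def write_map(table, key, value):
    """Map semantics: one value per full key (timestamp included)."""
    table[key] = value


def write_multimap(table, key, value):
    """Multi-map semantics: identical keys accumulate values."""
    table[key].append(value)


k1 = Key("row1", "cf", "cq", "", 100)
k2 = Key("row1", "cf", "cq", "", 101)  # same cell, later timestamp

as_map = {}
write_map(as_map, k1, b"a")
write_map(as_map, k2, b"b")  # different timestamp: a separate key, both kept
write_map(as_map, k1, b"c")  # identical key: replaces b"a"

as_multimap = defaultdict(list)
write_multimap(as_multimap, k1, b"a")
write_multimap(as_multimap, k1, b"c")  # identical key: both values kept
```

In this sketch, multiple versions of a cell fit in the plain map because the timestamp is part of the key. Only writes that agree on every field, timestamp included, need multi-map behavior to coexist.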