Re: Omid: Transactional Support for HBase

On Nov 8, 2011, at 10:48 , Daniel Gómez Ferro wrote:

> Hi Jignesh
>
> On Nov 7, 2011, at 21:44 , Jignesh Patel wrote:
>
>> It looks like this transaction is limited to one row. Is that correct?
>>
>
> No, it's not. Transactions can span multiple rows.
>
>> Another thing: I don't have ZooKeeper installed, as I am running in
>> pseudo-distributed mode. The documentation doesn't say anything about
>> integrating in pseudo-distributed mode.
>>
>
> Currently Omid requires both ZooKeeper and BookKeeper to operate, but we provide some scripts to launch them locally if you just want to try it. I've just pushed a change so you don't need to install anything manually: just download or check out Omid, run 'mvn package', and follow the instructions to run the benchmark locally.

Please remember that the repository we are using now is https://github.com/yahoo/omid/

>
> If people still find it cumbersome or difficult to run ZK/BK, we could provide an option to disable the replication to the WAL.
>
> Daniel
>
>> -Jignesh
>>
>> 2011/11/7 Daniel Gómez Ferro <[EMAIL PROTECTED]>:
>>>
>>> On Nov 6, 2011, at 21:53 , lars hofhansl wrote:
>>>
>>>> Another question: I assume this will not work out of the box with deletes?
>>>
>>> Hi,
>>>
>>> Our current approach does support deletes (i.e., user-requested deletes). Right now we use empty values as delete marks: when the user calls TransactionalTable.delete(), we insert empty values at the specified timestamp. At filtering time, we keep track of these delete marks, and we can discard the ones that are uncommitted or fall outside our time range of interest. When a transaction aborts, the cleanup procedure deletes only the specific values inserted by that transaction (as opposed to all versions). This way we don't insert delete tombstones that mask previous values.
>>>
>>> The drawbacks of this approach are that (i) we give a special meaning to the empty values, and (ii) to delete the whole column family (in contrast with a column) we have to perform a get beforehand to obtain the column qualifiers.
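
The delete-mark idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Omid's actual code: the `Cell` class, its method names, and the `commits` map (start timestamp to commit timestamp) are all hypothetical, and a single in-memory dictionary stands in for the versions HBase would store for one column.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption: the empty value is reserved as the delete mark.
DELETE_MARK = b""


@dataclass
class Cell:
    """Toy model of one column: start timestamp of the writer -> value."""
    versions: Dict[int, bytes] = field(default_factory=dict)

    def put(self, start_ts: int, value: bytes) -> None:
        self.versions[start_ts] = value

    def delete(self, start_ts: int) -> None:
        # A user delete is stored as an empty value at the writer's timestamp.
        self.versions[start_ts] = DELETE_MARK

    def cleanup_aborted(self, start_ts: int) -> None:
        # Abort cleanup removes only the version written by that transaction.
        self.versions.pop(start_ts, None)

    def read(self, snapshot_ts: int, commits: Dict[int, int]) -> Optional[bytes]:
        """Return the value visible at snapshot_ts, or None if deleted or absent.

        commits maps a transaction's start timestamp to its commit timestamp;
        transactions missing from the map are treated as uncommitted.
        """
        visible = [
            (commits[ts], value)
            for ts, value in self.versions.items()
            if ts in commits and commits[ts] <= snapshot_ts
        ]
        if not visible:
            return None
        # The version with the latest commit timestamp wins.
        _, newest = max(visible, key=lambda pair: pair[0])
        return None if newest == DELETE_MARK else newest
```

In this model an uncommitted delete mark is simply skipped by `read`, an older snapshot still sees the value written before the delete committed, and cleaning up an aborted delete leaves the earlier value untouched because no tombstone was ever written.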
>>>
>>>>
>>>> Deletes always cover all key values in the past (from their timestamps on backwards), so once a delete marker is placed there is no way to get back any of the puts it affects.
>>>>
>>>> HBase trunk has HBASE-4536 to allow time-range scans to work with deleted rows (but needs to be enabled for a column family - I still think it should be the default, but anyway).
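
The masking behaviour described above, and what a time-range read gains when deleted cells are kept, can be sketched with a toy model. This is not the HBase API: `TombstoneCell` and its `keep_deleted` flag are hypothetical stand-ins for a column and for the per-column-family option mentioned in connection with HBASE-4536, and timestamps are assumed to be non-negative integers.

```python
from typing import Dict, List, Optional


class TombstoneCell:
    """Toy model of one column with tombstone-style deletes."""

    def __init__(self, keep_deleted: bool = False) -> None:
        # keep_deleted is a hypothetical flag, not a real HBase setting name.
        self.keep_deleted = keep_deleted
        self.puts: Dict[int, bytes] = {}
        self.markers: List[int] = []

    def put(self, ts: int, value: bytes) -> None:
        self.puts[ts] = value

    def delete(self, ts: int) -> None:
        # A delete marker covers every put at or before its timestamp.
        self.markers.append(ts)

    def read(self, max_ts: int) -> Optional[bytes]:
        """Return the newest unmasked value with timestamp <= max_ts."""
        if self.keep_deleted:
            # Only markers inside the requested time range apply.
            markers = [m for m in self.markers if m <= max_ts]
        else:
            # Every marker applies, even one newer than the requested range.
            markers = self.markers
        cutoff = max(markers, default=-1)
        candidates = [ts for ts in self.puts if cutoff < ts <= max_ts]
        if not candidates:
            return None
        return self.puts[max(candidates)]
```

With `keep_deleted=False`, a put at timestamp 10 followed by a delete at 20 is invisible even to a read bounded at 15; with `keep_deleted=True`, the same bounded read returns the old value while a read bounded at 25 still sees the delete.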
>>>>
>>>
>>> I think this feature would be very useful, and it enables a cleaner implementation. It would be great if the flag were enabled by default, since we want users to change their setup as little as possible, but it's not a big deal.
>>>
>>>> -- Lars
>>>>
>>>> ________________________________
>>>> From: Flavio Junqueira <[EMAIL PROTECTED]>
>>>> To: Daniel Gómez Ferro <[EMAIL PROTECTED]>
>>>> Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Maysam Yabandeh <[EMAIL PROTECTED]>; Benjamin Reed <[EMAIL PROTECTED]>; Ivan Kelly <[EMAIL PROTECTED]>
>>>> Sent: Sunday, November 6, 2011 7:14 AM
>>>> Subject: Re: Omid: Transactional Support for HBase
>>>>
>>>>
>>>> A quick note on Omid for the ones following on github: the repository we will be working with is the fork under the Yahoo! account:
>>>>
>>>>
>>>> https://github.com/yahoo/omid/
>>>>
>>>> -Flavio
>>>>
>>>>
>>>> On Nov 5, 2011, at 9:36 PM, Daniel Gómez Ferro wrote:
>>>>
>>>>
>>>>>
>>>>> On Nov 5, 2011, at 05:37 , lars hofhansl wrote:
>>>>>
>>>>>> Cool stuff Daniel,
>>>>>>
>>>>>
>>>>> Hi Lars,
>>>>>
>>>>> Thanks for the good points.
>>>>>
>>>>>
>>>>>
>>>>>> I was looking through the code a bit. It seems like you make a best effort to push as much of
>>>>>> the filtering of KVs of uncommitted transactions as possible to HBase, and then do some filtering
>>>>>> on the client. Not a bad approach. (I hope I didn't misunderstand the approach; I only looked
>>>>>> through the code for half an hour or so.)