HBase, mail # user - delete rows without writing HLog may be appear in the future?


Re: delete rows without writing HLog may be appear in the future?
Michael Segel 2012-11-21, 16:18
Ok,

First, I am a firm believer in not bypassing the WAL, period.

I'm not sure why you would be seeing the data after a delete. If what you said is true, then either the delete got lost or the delete happened before the insert (which doesn't make sense because the delete should have thrown an exception...)

I am also confused by what you mean when you say the delete has to be low latency.
What's the timing difference between writing a delete to the WAL and bypassing the WAL?

I am also concerned by your statement that you do the delete, it looks like the data was deleted, and then a week or two later it's back.
That doesn't make sense: the data written to the WAL would have long since been flushed, and if you delete after that, the delete flag should still be there.
Can you check the timestamp of the cell?

Something isn't right.
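
One scenario that could produce this, sketched as a toy model rather than real HBase code: a region server fails after the un-logged delete but before the memstore holding the delete flag has been flushed. Recovery can only replay what is in the WAL, so the flag is lost while the earlier, already-flushed put survives. The class ToyRegion and all of its methods are made up for illustration, and whether a failure actually happened in that week is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of one region: a WAL, an in-memory store, and flushed data. Not HBase code. */
class ToyRegion {
    // Each edit is {row, type}, where type is "PUT" or "DELETE".
    private final List<String[]> wal = new ArrayList<>();
    private final List<String[]> memstore = new ArrayList<>();
    private final List<String[]> flushed = new ArrayList<>();

    void put(String row, boolean writeToWal) { apply(row, "PUT", writeToWal); }

    void delete(String row, boolean writeToWal) { apply(row, "DELETE", writeToWal); }

    private void apply(String row, String type, boolean writeToWal) {
        String[] edit = {row, type};
        if (writeToWal) {
            wal.add(edit);      // survives a crash
        }
        memstore.add(edit);     // lost on a crash unless flushed first
    }

    /** Flush: in-memory edits become durable, so their WAL entries can be dropped. */
    void flush() {
        flushed.addAll(memstore);
        memstore.clear();
        wal.clear();
    }

    /** Crash and recover: the memstore is lost and rebuilt from the WAL only. */
    void crashAndRecover() {
        memstore.clear();
        memstore.addAll(wal);
    }

    /** Rows visible to a scan: apply edits oldest first, so later edits win. */
    Set<String> scan() {
        List<String[]> all = new ArrayList<>(flushed);
        all.addAll(memstore);
        Set<String> visible = new TreeSet<>();
        for (String[] edit : all) {
            if (edit[1].equals("PUT")) {
                visible.add(edit[0]);
            } else {
                visible.remove(edit[0]);
            }
        }
        return visible;
    }
}
```

In this model, put("row1", true), flush(), delete("row1", false) leaves scan() empty, but after crashAndRecover() the row is visible again. The same sequence with delete("row1", true) keeps the row deleted across the crash.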

On Nov 21, 2012, at 9:37 AM, Bing Jiang <[EMAIL PROTECTED]> wrote:

> In our apps, deletes are frequent and happen to every record, so if we
> write the HLog, performance and response time suffer. In fact, we can
> tolerate some failed deletes, but recently I have found that more records
> deleted some time ago (for example, one week ago) have reappeared. That
> makes me wonder what we should do next: delete with the HLog written, or
> put without the HLog...
> On Nov 21, 2012 11:19 PM, "Kevin O'dell" <[EMAIL PROTECTED]> wrote:
>
>> Bing,
>>
>>  I am curious to hear more about Mike's question.  Why are you not using
>> the WAL for your deletes?
>>
>> On Wed, Nov 21, 2012 at 10:17 AM, Bing Jiang <[EMAIL PROTECTED]
>>> wrote:
>>
>>> Yes, HBase ran a compaction between the batch put and the deletes. Any ideas?
>>>
>>> On Nov 21, 2012 11:10 PM, "Michael Segel" <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>> Some time later?
>>>>
>>>> Time of course is relative, so I have to ask what occurred between the
>>>> write and the delete?
>>>> How much time? Did you have any compactions in between the write and
>>>> the delete?
>>>>
>>>> Why are you not consistent in your use of the WAL?
>>>>
>>>>
>>>> On Nov 21, 2012, at 6:37 AM, Bing Jiang <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>>> Hi all,
>>>>> I want to describe a phenomenon we are seeing on our HBase cluster.
>>>>> I use puts(List<Put>) to insert many records with HLog writing
>>>>> enabled, and some time later I delete all of these records with HLog
>>>>> writing disabled.
>>>>> One week later, when I scan the table, I find that some of the records
>>>>> I deleted have reappeared.
>>>>> It is an interesting case. In my opinion, if we delete data without
>>>>> writing the HLog, then when a regionserver fails, the log will be
>>>>> replayed on another regionserver.
>>>>> Can anyone tell me: if I persist in deleting records without writing
>>>>> the HLog, is there a way to prevent these records from reappearing
>>>>> some time later?
>>>>>
>>>>> Cheers!
>>>>> --
>>>>> Bing Jiang
>>>>> weibo: http://weibo.com/jiangbinglover
>>>>> BLOG: http://blog.sina.com.cn/jiangbinglover
>>>>> BLOG: http://www.binospace.com
>>>>> National Research Center for Intelligent Computing Systems
>>>>> Institute of Computing technology
>>>>> Graduate University of Chinese Academy of Science
>>>>
>>>
>>
>>
>>
>> --
>> Kevin O'Dell
>> Customer Operations Engineer, Cloudera
>>