HBase >> mail # user >> delete rows without writing HLog may be appear in the future?


Re: delete rows without writing HLog may be appear in the future?
Hi Bing,

I think you are referring to a memstore flush.
The HLog represents the set of changes that are in the memstore (in RAM) but not yet in an HFile on disk.
I am pretty sure there is no flaw in the flush/compaction logic when it comes to deletes.
If you do not write the deletes to the WAL and the RS crashes, it is expected that deletes that were not yet flushed to disk are lost.
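
To make the failure mode above concrete, here is a minimal toy model in Java. It is not HBase code: the class WalSketch and all of its methods are hypothetical names made up for illustration, and it assumes a single region server whose memstore is rebuilt only by replaying the WAL after a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one region server (hypothetical, not HBase code). */
class WalSketch {
    /** An edit is a put (value != null) or a delete marker (value == null). */
    static final class Edit {
        final String row;
        final String value;
        Edit(String row, String value) { this.row = row; this.value = value; }
    }

    private final List<Edit> wal = new ArrayList<>();    // on disk, survives a crash
    private final List<Edit> hfiles = new ArrayList<>(); // on disk, survives a crash
    private List<Edit> memstore = new ArrayList<>();     // in RAM, lost on a crash

    void put(String row, String value, boolean writeToWal) {
        apply(new Edit(row, value), writeToWal);
    }

    void delete(String row, boolean writeToWal) {
        apply(new Edit(row, null), writeToWal);
    }

    private void apply(Edit e, boolean writeToWal) {
        if (writeToWal) {
            wal.add(e);
        }
        memstore.add(e);
    }

    /** Memstore flush: edits reach an HFile, so their WAL entries are no longer needed. */
    void flush() {
        hfiles.addAll(memstore);
        memstore = new ArrayList<>();
        wal.clear();
    }

    /** Crash and recovery: RAM is lost, then the memstore is rebuilt from the WAL. */
    void crashAndRecover() {
        memstore = new ArrayList<>(wal);
    }

    /** Read view: apply HFile edits, then memstore edits, in write order. */
    Map<String, String> scan() {
        Map<String, String> view = new TreeMap<>();
        List<Edit> all = new ArrayList<>(hfiles);
        all.addAll(memstore);
        for (Edit e : all) {
            if (e.value == null) {
                view.remove(e.row);
            } else {
                view.put(e.row, e.value);
            }
        }
        return view;
    }

    public static void main(String[] args) {
        WalSketch rs = new WalSketch();
        rs.put("row1", "v1", true);   // put with WAL
        rs.flush();                   // put is now in an HFile
        rs.delete("row1", false);     // delete marker lives only in the memstore
        System.out.println("after delete: " + rs.scan());
        rs.crashAndRecover();         // marker is gone, nothing to replay
        System.out.println("after crash:  " + rs.scan());
    }
}
```

In this model the row is hidden right after the delete and visible again after the crash. Passing true for writeToWal on the delete lets the marker be replayed, so the row stays deleted.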

(And there's also HBASE-6059, which in some cases resurfaces deleted data even when it was flushed to the WAL.)
-- Lars
________________________________
 From: Bing Jiang <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Wednesday, November 21, 2012 8:36 PM
Subject: Re: delete rows without writing HLog may be appear in the future?
 
I think that when a compaction is triggered and the records have already been
flushed to HDFS, it is pointless to retain the HLog from before that timestamp.
In other words, suppose some rows are deleted and then a compaction runs;
afterwards the rows no longer exist. The HLog written before
the compaction timestamp is then no longer useful, and we can drop those unused WALs.
This is just my own view; please correct me if I am wrong.
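
The bookkeeping this idea needs can be sketched in Java as below. This is only an illustration under stated assumptions: the class WalRetentionSketch, its methods, and the file names are hypothetical and not HBase code, there is a single region, and every edit carries an increasing sequence id. Following the reply above, the sketch retires WAL files on a memstore flush rather than on a compaction.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy bookkeeping for retiring WAL files after a flush (hypothetical, not HBase code). */
class WalRetentionSketch {
    static final class WalFile {
        final String name;
        final long maxSeqId; // highest sequence id of any edit in this file
        WalFile(String name, long maxSeqId) {
            this.name = name;
            this.maxSeqId = maxSeqId;
        }
    }

    // Closed WAL files, oldest first.
    private final Deque<WalFile> closedFiles = new ArrayDeque<>();

    void addClosedFile(String name, long maxSeqId) {
        closedFiles.addLast(new WalFile(name, maxSeqId));
    }

    /**
     * Called after a flush has persisted every edit up to flushedSeqId.
     * Drops the files that contain nothing newer and returns how many were dropped.
     */
    int onFlush(long flushedSeqId) {
        int dropped = 0;
        while (!closedFiles.isEmpty()
                && closedFiles.peekFirst().maxSeqId <= flushedSeqId) {
            closedFiles.removeFirst();
            dropped++;
        }
        return dropped;
    }

    int retained() {
        return closedFiles.size();
    }

    public static void main(String[] args) {
        WalRetentionSketch wals = new WalRetentionSketch();
        wals.addClosedFile("wal.1", 100);
        wals.addClosedFile("wal.2", 200);
        wals.addClosedFile("wal.3", 300);
        int dropped = wals.onFlush(250);
        System.out.println("dropped " + dropped + ", retained " + wals.retained());
    }
}
```

A file is kept as long as it holds at least one edit newer than the last flush, because that edit exists only in the memstore and would be needed for replay after a crash.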
---
Bing
2012/11/22 ramkrishna vasudevan <[EMAIL PROTECTED]>

> Sorry Bing, I am not quite clear on what you are suggesting here:
> 'One idea occurs to me: why not check or restore the WAL when a compaction
> executes? If it does, HBase can drop some unused HLogs.'
>
> Could you be more clear?  Are you trying to read the WAL while compaction
> is going on?
>
> Regards
> Ram
>
> On Thu, Nov 22, 2012 at 9:23 AM, Bing Jiang <[EMAIL PROTECTED]> wrote:
>
> > In our HBase cluster, I tested deleting records with and without the HLog.
> > My test is attached.
> > The test results show why I decided to delete rows
> > without the HLog.
> >
> >
> >
> > 2012/11/22 Bing Jiang <[EMAIL PROTECTED]>
> >
> >> Thanks for all your suggestions and discussion.
> >> One idea occurs to me: why not check or restore the WAL when a compaction
> >> executes? If it does, HBase can drop some unused HLogs; I think that will be
> >> effective for this issue.
> >> Please correct me if I am wrong.
> >>
> >> ---Bing
> >>
> >> 2012/11/22 lars hofhansl <[EMAIL PROTECTED]>
> >>
> >>> I have it on my list of things to do to allow deferred WAL flush as a
> >>> per-operation option (right now it's a CF option).
> >>> You really do not want to do anything with the WAL off. If you use
> >>> deferred flush there is still a chance that this might happen (the RS could
> >>> die in the few seconds after a Delete before it is flushed to the WAL), but
> >>> it should be a rare occurrence.
> >>>
> >>>
> >>> -- Lars
> >>>
> >>>
> >>>
> >>> ________________________________
> >>>  From: Bing Jiang <[EMAIL PROTECTED]>
> >>> To: [EMAIL PROTECTED]
> >>> Sent: Wednesday, November 21, 2012 7:20 AM
> >>> Subject: Re: delete rows without writing HLog may be appear in the
> >>> future?
> >>>
> >>> We need to ensure that puts are safe, while deletes must be quick and
> >>> low-latency.
> >>> On Nov 21, 2012 11:10 PM, "Michael Segel" <[EMAIL PROTECTED]>
> >>> wrote:
> >>>
> >>> > Some time later?
> >>> >
> >>> > Time of course is relative, so I have to ask what occurred between the
> >>> > write and the delete?
> >>> > How much time? Did you have any compactions in between the write and the
> >>> > delete?
> >>> >
> >>> > Why are you not consistent in your use of the WAL ?
> >>> >
> >>> >
> >>> > On Nov 21, 2012, at 6:37 AM, Bing Jiang <[EMAIL PROTECTED]> wrote:
> >>> >
> >>> > > Hi all,
> >>> > > I want to describe a phenomenon that happened on our HBase cluster.
> >>> > > I use puts(List<Put>) to insert many records with HLog writing enabled,
> >>> > > and some time later I delete all of these records with HLog writing
> >>> > > disabled.
> >>> > > One week later, when I scanned the table, I found that some records I had
> >>> > > deleted had reappeared.
> >>> > > It is an interesting case. In my opinion, if we delete data without

Bing Jiang
Tel:(86)134-2619-1361
weibo: http://weibo.com/jiangbinglover
BLOG: http://blog.sina.com.cn/jiangbinglover
National Research Center for Intelligent Computing Systems
Institute of Computing technology
Graduate University of Chinese Academy of Science