Re: What happens in hlog if data is deleted because of ttl?
Another interesting point is that the TTL data will not exist in the
HFile. I ran the following test:

hbase(main):003:0> create 'test',{TTL=>'200',NAME=>'course'}
0 row(s) in 1.1420 seconds

hbase(main):005:0> put 'test','tom','course:english',90
0 row(s) in 0.0320 seconds

hbase(main):006:0> flush 'test'
0 row(s) in 0.1680 seconds

hbase(main):007:0> scan 'test'
ROW                   COLUMN+CELL
 tom                  column=course:english, timestamp=1345623867082, value=90
1 row(s) in 0.0350 seconds

./hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f
Scanning -> /hbase/test/abe4d5adaa650cdd46d26dca0bf85b72/course/8c77fb321f934592869f9852f777b22e
12/08/22 10:27:39 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
Scanned kv count -> 1

So I guess the TTL data is only managed in the memstore. But the
question is: if the memstore does not have enough room to accept new
incoming TTL data, what will happen? Can anybody explain?


On Wed, Aug 22, 2012 at 10:19 AM, yonghu <[EMAIL PROTECTED]> wrote:
> I fully understand normal deletion. But, in my view, TTL deletion is
> different from normal deletion. The insertion of TTL data is recorded
> in the hlog, but the TTL deletion is not. So, if a failure occurs,
> should the TTL data be reinserted into the table, or can we discard
> that TTL data? Moreover, TTL deletion is not executed at data
> compaction time. A scanner needs to periodically scan each store file
> to execute the deletion.
> regards!
> Yong
> On Tue, Aug 21, 2012 at 5:29 PM, jmozah <[EMAIL PROTECTED]> wrote:
>> This helped me http://hadoop-hbase.blogspot.in/2011/12/deletion-in-hbase.html
>> ./Zahoor
>> HBase Musings
>> On 14-Aug-2012, at 6:54 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>>> Hi Yonghu,
>>> A timestamp is stored along with each insert. The ttl is maintained at
>>> the region-store level. Hence, when the log replays, all entries with
>>> expired TTLs are automatically omitted.
>>> Also, TTL deletions happen during compactions, and hence do not
>>> carry/need Delete events. When scanning a store file, TTL-expired
>>> entries are automatically skipped.
>>> On Tue, Aug 14, 2012 at 3:34 PM, yonghu <[EMAIL PROTECTED]> wrote:
>>>> My HBase version is 0.92. I tried the following:
>>>> 1. Created a table 'test' with column family 'course', in which TTL=5.
>>>> 2. Inserted one row into the table. 5 seconds later, the row was deleted.
>>>> Later, when I checked the log info of the 'test' table, I found only the
>>>> insert information but no delete information.
>>>> Can anyone tell me which information is written into the hlog when data is
>>>> deleted by TTL, or whether in this situation no information is written into
>>>> the hlog? If there is no deletion information in the log, how can
>>>> we guarantee that the data recovered from the log is correct?
>>>> Thanks!
>>>> Yong
>>> --
>>> Harsh J