HBase >> mail # user >> the occasion of the major compact?


Re: the occasion of the major compact?
Yes.

I have already used the approach that Nicolas suggested.

As for the approach Lars suggested, exporting the contents of the table, I
am not sure it is a good idea. Since I can't control the compactions,
the completeness of the data cannot be guaranteed: if compactions happen
between two export operations, the deleted data will be lost. BTW, if I
understand Lars's description correctly, the deleted data might also be
removed during a minor compaction!

Thanks

Yong
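
The concern in the message above can be illustrated with a toy model. This is a minimal sketch in plain Python, not HBase code: `Cell`, `raw_scan`, `normal_scan` and `major_compact` are hypothetical names, and the model deliberately ignores version counts and TTL. It assumes a delete marker masks every put in the same row with an equal or older timestamp, and that a major compaction drops both the marker and the masked puts unless something like KEEP_DELETED_CELLS is enabled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cell:
    row: str
    ts: int
    value: str = ""
    is_delete: bool = False  # True for a delete marker (tombstone)


def raw_scan(cells: List[Cell]) -> List[Cell]:
    """Return everything still stored, delete markers included."""
    return sorted(cells, key=lambda c: (c.row, -c.ts, not c.is_delete))


def normal_scan(cells: List[Cell]) -> List[Cell]:
    """Hide delete markers and every put they mask (put.ts <= marker.ts)."""
    newest_delete = {}
    for c in cells:
        if c.is_delete:
            newest_delete[c.row] = max(c.ts, newest_delete.get(c.row, c.ts))
    return [c for c in raw_scan(cells)
            if not c.is_delete and c.ts > newest_delete.get(c.row, -1)]


def major_compact(cells: List[Cell], keep_deleted_cells: bool = False) -> List[Cell]:
    """Toy major compaction: physically drop markers and masked puts,
    unless keep_deleted_cells is set (version/TTL expiry is not modelled)."""
    if keep_deleted_cells:
        return list(cells)
    return normal_scan(cells)


if __name__ == "__main__":
    store = [Cell("r1", 1, "a"), Cell("r1", 2, is_delete=True), Cell("r2", 1, "b")]
    print("raw scan before compaction:", raw_scan(store))
    print("raw scan after compaction: ", raw_scan(major_compact(store)))
```

In this model, a raw scan taken before the compaction still sees the deleted put for `r1`, while one taken afterwards does not, which is why an export that runs after an uncontrolled compaction can miss deleted data.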
On Thu, Jan 26, 2012 at 11:52 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> If you are planning to use trunk (what will be 0.94) you can also enable KEEP_DELETED_CELLS for your column families.
> That will keep deleted cells around (until they get removed because of # of versions, or TTL).
>
> Also note that version # and TTL checks are also performed during minor compactions and even during memstore flushes, and hence cells might be removed on those occasions as well.
>
> If you have time and space, you can also back up your tables into text files (using export) and crunch them there (I added support for HBASE-4536 in export as well).
>
> -- Lars
>
>
> ----- Original Message -----
> From: yonghu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Cc:
> Sent: Thursday, January 26, 2012 1:22 PM
> Subject: Re: the occasion of the major compact?
>
> Yes. I read this blog post:
> http://hadoop-hbase.blogspot.com/2011/12/raw-scans.html. I thought that
> if I could disable major compaction, I could use the approach
> described in the blog. Otherwise, the major compaction will remove the
> deleted data.
>
> Thanks!
>
> Yong
>
> On Thu, Jan 26, 2012 at 10:11 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> Unless you have HBASE-4536 (only in trunk, though) or are parsing the HFiles yourself you have no way of actually getting to the deleted data.
>>
>> -- Lars
>>
>>
>>
>> ----- Original Message -----
>> From: yonghu <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Cc:
>> Sent: Thursday, January 26, 2012 1:00 PM
>> Subject: Re: the occasion of the major compact?
>>
>> Nicolas,
>>
>> In my use case, I want to extract the deleted data. If I disable
>> major compaction, I can prevent HBase from actually deleting the
>> data. After extracting the deleted data, I can issue a major
>> compaction myself.
>>
>> Regards
>>
>> Yong
>>
>> On Thu, Jan 26, 2012 at 8:02 PM, Nicolas Spiegelberg
>> <[EMAIL PROTECTED]> wrote:
>>> Yong,
>>>
>>> Can you please explain why you want to disable major compactions?  What
>>> are the problems that you're currently seeing, or what are you worried will
>>> happen if a major compaction is allowed to occur?  Right now, there is
>>> only an extremely small subset of cases where you must explicitly disable
>>> compactions.  The use cases I know of are very complicated and require
>>> building StoreFile analysis tools underneath HBase, so I'm pretty sure
>>> you don't need this.
>>>
>>> Please also read my follow-up commentary explaining major compaction
>>> logic:
>>> http://search-hadoop.com/m/JR9sK1xnbj21
>>> http://search-hadoop.com/m/X7W7q1xnbj21
>>>
>>>
>>> The vast majority of users need features completely unrelated to
>>> compactions.  The compaction algorithm is an easy target to worry about.
>>>
>>>
>>> On 1/26/12 7:06 AM, "yonghu" <[EMAIL PROTECTED]> wrote:
>>>
>>>>Hello Mikael,
>>>>
>>>>I think disabling major compaction in the timed and client-issued
>>>>cases is not a problem. The problem is the size-based case. As far as I
>>>>understand, the mailing list only discusses this for minor compaction,
>>>>not major compaction. So I would like to know if
>>>>someone can tell me how to disable size-based major
>>>>compaction.
>>>>
>>>>Thanks
>>>>
>>>>Yong
>>>>I saw a description indicating that the size of a store file can also
>>>>trigger a major compaction.
>>>>
>>>>On Thu, Jan 26, 2012 at 3:54 PM, Mikael Sitruk <[EMAIL PROTECTED]>