HBase >> mail # user >> Problem In Understanding Compaction Process


Anty 2013-02-20, 03:10
Sergey Shelukhin 2013-02-20, 22:07
Re: Problem In Understanding Compaction Process
Thanks, Sergey.
In my use case, I want to analyze the underlying HFiles directly, so I
can't tolerate duplicate data.

Can you give me some pointers on how to make this procedure atomic?

On Thu, Feb 21, 2013 at 6:07 AM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:

> There should be no duplicate records despite the file not being deleted:
> between records with the exact same key/version/etc., the one from the newer
> file is chosen by logical sequence. If the sequence happens to be the same,
> some other tie-breaker (by time, or name) still picks exactly one file.
> Eventually, the file will be compacted again and disappear. Granted, by
> making the move atomic (via some meta/manifest file) we could avoid some
> overhead in this case at the cost of some added complexity, but it should
> be rather rare.
>
> On Tue, Feb 19, 2013 at 7:10 PM, Anty <[EMAIL PROTECTED]> wrote:
>
> > Hi guys,
> >
> >       I am having trouble understanding the compaction process. Can
> > someone shed some light on it? Much appreciated. Here is the problem:
> >
> >       After the Region Server successfully generates the final compacted
> > file, it goes through two steps:
> >        1. Move the compacted file into the region's directory.
> >        2. Delete the replaced files.
> >
> >        These two steps are not atomic. If the Region Server crashes after
> > step 1 and before step 2, there are duplicate records! Is this
> > problem handled in the read path, or is there another mechanism to
> > fix this?
> >
> > --
> > Best Regards
> > Anty Rao
> >
>
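
A minimal sketch of the read-side rule described in the quoted reply, for anyone reading the files directly after a crash between step 1 and step 2. It is written in Python for brevity and does not use any HBase API: `Cell`, `StoreFile`, and `merge_newest_wins` are hypothetical names, the tie-break by file name is just one of the options the reply mentions, and it assumes the compacted file carries the highest sequence id of the files it replaced.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    row: str
    column: str
    timestamp: int
    value: bytes


class StoreFile(NamedTuple):
    name: str
    sequence_id: int      # logical sequence of the file (assumed available)
    cells: List[Cell]


def merge_newest_wins(files: List[StoreFile]) -> List[Cell]:
    """Return one cell per (row, column, timestamp).

    When several files hold a cell with the same key, the cell from the
    file with the highest sequence id is kept; equal sequence ids are
    broken by file name so that exactly one file always wins.
    """
    chosen = {}
    # Visit files from oldest to newest so newer files overwrite older ones.
    for f in sorted(files, key=lambda f: (f.sequence_id, f.name)):
        for cell in f.cells:
            chosen[(cell.row, cell.column, cell.timestamp)] = cell
    return [chosen[key] for key in sorted(chosen)]


# Crash after step 1: the compacted file "c" sits next to its inputs "a", "b".
c1 = Cell("r1", "cf:q", 100, b"v1")
c2 = Cell("r1", "cf:q", 200, b"v2")
c3 = Cell("r2", "cf:q", 150, b"x")
leftover = [
    StoreFile("a", 1, [c1]),
    StoreFile("b", 2, [c2, c3]),
    StoreFile("c", 2, [c1, c2, c3]),   # compacted output, not yet cleaned up
]
result = merge_newest_wins(leftover)   # three cells, no duplicates
```

The same rule would have to be applied by any external job that scans the region directory itself, since the leftover input files are still physically present until the next compaction removes them.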

--
Best Regards
Anty Rao
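
On the question of making the swap atomic: the reply above mentions a meta/manifest file as one option. The sketch below illustrates that idea only. It is not how HBase works, it runs against a local directory rather than HDFS, and every name in it (`MANIFEST.json`, `write_manifest`, `commit_compaction`, `cleanup_orphans`) is made up for the example. The single atomic rename of the manifest is the commit point; readers trust only the files the manifest lists, so a crash before or after the rename never exposes both the inputs and the compacted output.

```python
import json
import os
import tempfile

MANIFEST = "MANIFEST.json"  # hypothetical manifest file name


def read_live_files(region_dir):
    """Return the file names readers should trust (empty if no manifest)."""
    path = os.path.join(region_dir, MANIFEST)
    if not os.path.exists(path):
        return []
    with open(path) as fh:
        return json.load(fh)["live"]


def write_manifest(region_dir, live):
    """Write the manifest to a temp file, then atomically rename it."""
    fd, tmp = tempfile.mkstemp(dir=region_dir, prefix=".manifest-")
    with os.fdopen(fd, "w") as fh:
        json.dump({"live": sorted(live)}, fh)
        fh.flush()
        os.fsync(fh.fileno())
    # os.replace is an atomic rename on the same filesystem: commit point.
    os.replace(tmp, os.path.join(region_dir, MANIFEST))


def commit_compaction(region_dir, compacted, replaced):
    """Swap `replaced` for `compacted` in the manifest, then delete inputs."""
    live = set(read_live_files(region_dir)) - set(replaced)
    live.add(compacted)
    write_manifest(region_dir, live)
    # A crash here is harmless: the inputs are no longer listed as live.
    for name in replaced:
        try:
            os.remove(os.path.join(region_dir, name))
        except FileNotFoundError:
            pass


def cleanup_orphans(region_dir):
    """Recovery step (not safe while a compaction is running):
    delete any file the manifest does not list."""
    live = set(read_live_files(region_dir))
    for name in os.listdir(region_dir):
        if name == MANIFEST or name.startswith(".manifest-"):
            continue
        if name not in live:
            os.remove(os.path.join(region_dir, name))
```

The cost is the added complexity mentioned in the reply: every reader and every external tool must consult the manifest instead of listing the directory.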
Sergey Shelukhin 2013-02-25, 19:16