Hadoop >> mail # general >> [DISCUSS] Hadoop Security Release off Yahoo! patchset


Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
On Fri, Jan 14, 2011 at 10:25 AM, Eric Baldeschwieler
<[EMAIL PROTECTED]> wrote:
> 2) append is hard. It is so hard we rewrote the entire write pipeline (5 person-years of work) in trunk after giving up on the codeline you are suggesting we merge in. That work is what distinguishes all post 20 releases from 20 releases in my mind. I don't trust the 20 append code line. We've been hurt badly by it.  We did the rewrite only after losing a bunch of production data a bunch of times with the previous code line.  I think the various 20 append patch lines may be fine for specialized hbase clusters, but they don't have the rigor behind them to bet your business on them.
>

Eric:

A few comments on the above:

+ Append has had a bunch of work done on it since the Y! dataloss of a
few years ago on an ancestor of the branch-0.20-append codebase (IIRC
the issue you refer to in particular -- the 'dataloss' because
partially written blocks were done up in tmp dirs, and on cluster
restart, tmp data was cleared -- has been fixed in
branch-0.20-append).
+ You may not trust 0.20-append (or its close cousin over in CDH) but
a bunch of HBasers do. On the one hand, we have little choice.  Until
the *new* append becomes available in a stable Hadoop, the HBase
project has to sustain itself (What do you think, 3-6 months before
we see 0.22?  The HBase project can't hold its breath that long).  On
the other hand, the branch-0.20-append work has been carried out by lads
(and lasses!) who know their HDFS.  It's true that it will not have
been tested with Y! rigor but near-derivatives -- CDH or the FB
branches -- already do HDFS-200-based append in production.

St.Ack
P.S. Don't get me wrong.  HBase is looking forward to *new* append.
We just need something to suck on in the meantime.