Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
On Wed, Oct 05, 2011 at 10:40PM, Jagane Sundar wrote:
> On Wed, Oct 5, 2011 at 10:09 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
> > On Wed, Oct 05, 2011 at 07:00PM, Jagane Sundar wrote:
> > > approaches you are familiar with. Chef/Puppet et al. are not interesting to
> >
> > Is this a technical lack of interest, as in these solutions do not perform
> > as you expect them to, or is this a policy thing of some kind?
> >
> No policy or anything of that sort. It's a personal preference. Chef,
> Puppet, etc. are not full feedback systems. They keep doing the same thing
> over and over again trying to get the system into a 'desired' state. A
> state machine driven full feedback system works better. When things go
> wrong, that information can be acted upon.

It might be considered a shortcoming or a design benefit, depending on
one's angle. I don't want to start a religious war about this, obviously ;)
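
To make the contrast in the quoted paragraph concrete, here is a deliberately
simplified sketch of the two styles. It is a caricature, not how Chef or Puppet
are actually implemented: the `Node` class, its `repo_reachable` flag, and the
"repair" step are all hypothetical stand-ins invented for illustration.

```python
from enum import Enum, auto


class Node:
    """Hypothetical toy node: install fails until the repo is reachable."""

    def __init__(self):
        self.repo_reachable = False
        self.installed = False

    def install(self):
        if not self.repo_reachable:
            raise RuntimeError("repo unreachable")
        self.installed = True


def converge(node, max_runs=3):
    """Desired-state style: repeat the same action until the state matches."""
    attempts = 0
    for _ in range(max_runs):
        if node.installed:
            break
        attempts += 1
        try:
            node.install()
        except RuntimeError:
            pass  # the failure is not acted upon; the next run does the same
    return node.installed, attempts


class State(Enum):
    INSTALLING = auto()
    REPAIRING = auto()
    DONE = auto()


def run_state_machine(node, max_steps=10):
    """Feedback style: a failure drives a transition to a different action."""
    state, trace = State.INSTALLING, []
    for _ in range(max_steps):
        trace.append(state.name)
        if state is State.DONE:
            break
        if state is State.INSTALLING:
            try:
                node.install()
                state = State.DONE
            except RuntimeError:
                state = State.REPAIRING  # act on the failure information
        elif state is State.REPAIRING:
            node.repo_reachable = True  # stand-in for a real repair step
            state = State.INSTALLING
    return node.installed, trace
```

In this toy setup `converge(Node())` gives up with the node still not
installed, while `run_state_machine(Node())` passes through a repair state and
finishes. Whether the simpler loop is a flaw or a feature is exactly the
"depends on one's angle" part.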

> > > turned out to be slow as sh**, they seem to have hacked the HDFS layer some
> > > more, in order to actually have a NameNode for metadata, but to use S3 for
> > > storing blocks. They have a protocol s3 to access this. Both of these
> > > approaches have one severe failing - they do not support Append and Hflush.
> > > Ergo - no HBase on EMR. I am sure they are working furiously to address this
> >
> > I wonder if you can delve into these details: is it an inherent problem of
> > the s3 protocol or something unrelated to the technicalities?
> >
> I don't know nearly enough. I would speculate that it is because of S3's
> roots as an HTTP-based system. It was mostly REST and SOAP APIs that S3 used
> to publish. I know that people have built full-blown FUSE filesystems using

That makes sense. HTTP does support chunked uploads (e.g. multipart), but
that doesn't seem to be enough for append's needs, of course.
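
A rough way to see why is the toy model below. It assumes a store that only
offers whole-object writes (a key is replaced in full on every PUT), which is
my reading of the speculation above rather than a statement about S3's
internals. `WholeObjectStore` and `emulated_append` are hypothetical names, not
part of any real S3 or Hadoop API.

```python
class WholeObjectStore:
    """Toy store: a key can only be written by replacing the whole value."""

    def __init__(self):
        self._objects = {}
        self.bytes_uploaded = 0  # total payload sent across all PUTs

    def put(self, key, data):
        self._objects[key] = bytes(data)
        self.bytes_uploaded += len(data)

    def get(self, key):
        return self._objects[key]


def emulated_append(store, key, record):
    """Append by read-modify-write: everything already stored is re-uploaded."""
    try:
        existing = store.get(key)
    except KeyError:
        existing = b""
    store.put(key, existing + record)


# Simulate a write-ahead log that receives 100 small records.
store = WholeObjectStore()
for _ in range(100):
    emulated_append(store, "wal", b"x" * 10)
```

After the loop the object holds 1000 bytes, but 50500 bytes were uploaded to
get there, because each append re-sends the entire object. A log that needs a
cheap append plus an hflush-style guarantee after every record is a poor fit
for that kind of interface.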