-
Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Jagane Sundar 2011-10-02, 23:57

Hello Hadoop experts,

I would like to solicit your input in answering this question. Which proposed distro of Hadoop, 0.20.206 or 0.22, is likely to be the better platform for hosting HBase?

My requirements are as follows:

1. The Hadoop must support both HBase and MR jobs in the same cluster. At the very least, MR should be stable and usable for data extraction and transformation from external sources. Ideally, there should be no limits on the types of MR jobs that can be run on the HBase cluster. To the best of my understanding, this implies robust and stable Append and Hflush in HDFS, correct?

2. I want to scale storage independently from compute. For example, if my dataset is 1PB, I expect to make a three replica HDFS cluster of ~150 machines with 24TB each. As for MR and HBase compute, I may want to run anywhere from 50 to 200 machines. Perhaps even scaled on demand, i.e. bring up more machines into the MR cluster when there is more work to be done, and bring down some machines when there is less demand. I think that the MR1 Jobtracker can deal with machines coming in and going out well, but I am not too sure of how HBase works under such dynamic conditions. This example also indicates the scale that I am most interested in - 1 to 2 PB of data, with a dynamically varying compute requirement. Will my choice of 0.20.206 or 0.22 affect any of this?

3. Cloud (EC2 or some similar homebrew) friendly: I am talking about hosting HBase in HDFS on EBS volumes, not HBase on s3 accessed using the s3n protocol, or HBase on HDFS with blocks stored in S3 and accessed using the s3 protocol. There are two vectors to this - the storage itself, i.e. storage performance and efficiency, and the deployment mechanism - whirr or Ambari or pre-built AMIs with scripts cobbled together. Which release is likely to have out-of-the-box support for HBase on HDFS in EBS volumes, and for whirr/Ambari/AMIs?

4. Support for data efficiency improvements such as Erasure Coding - https://issues.apache.org/jira/browse/HDFS-503. Keeping 3 replicas of big data feels like an expensive proposition. Will 0.20.206 or 0.22 include the above patch as part of the base distro, or at least as an easy to add binary module of some kind?

5. Compatibility with future versions of Hadoop: If I make the (tenuous) argument that data locality does not matter much, that I have 4Gbps from each node, that I have 40 Gbps up from each rack, can I separate the storage from the compute? What I mean is this: I may want to upgrade HDFS less frequently than MR or HBase. So, is there a snowball's chance in hell of running HDFS 0.20.206 or 0.22 against MR 0.23 and HBase-whatever-comes-next-year?

Thanks in advance, and cheers to a vibrant healthy Hadoop community,
Jagane
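The sizing in requirement 2 and the replica cost in requirement 4 can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope helper: the function name, the 80% fill fraction, and the 1.4x overhead (for example 10 data blocks plus 4 parity blocks) are illustrative assumptions, not figures from HDFS-503 or from either release.

```python
import math

def nodes_needed(dataset_tb, storage_overhead, disk_tb_per_node, fill_fraction=1.0):
    """Return how many storage nodes are needed to hold the dataset.

    storage_overhead is raw bytes stored per byte of user data
    (3.0 for three replicas). fill_fraction is how full each node
    is allowed to get (hypothetical knob, 1.0 = completely full).
    """
    raw_tb = dataset_tb * storage_overhead
    usable_tb_per_node = disk_tb_per_node * fill_fraction
    return math.ceil(raw_tb / usable_tb_per_node)

DATASET_TB = 1000  # 1 PB in decimal units

# Three replicas on 24 TB nodes, disks completely full.
three_replica_full = nodes_needed(DATASET_TB, 3.0, 24)

# Same, but assuming nodes are only filled to 80% (assumption).
three_replica_headroom = nodes_needed(DATASET_TB, 3.0, 24, fill_fraction=0.8)

# Hypothetical erasure-coded layout with 1.4x overhead (assumption).
erasure_coded_full = nodes_needed(DATASET_TB, 1.4, 24)

print(three_replica_full, three_replica_headroom, erasure_coded_full)
```

With three replicas the raw footprint is 3 PB, which is 125 completely full 24TB nodes, so the ~150 machine estimate is consistent with leaving some headroom per node.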
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Milind.Bhandarkar@... 2011-10-05, 22:55

Jagane,

I think you have forgotten one major deciding factor:

Which version is *your* vendor committed to support?

If you are at the same place where you were the last time we met, you have no other choice but to go with 0.20.206. It's in the contract! :-)

- Milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)

On 10/2/11 4:57 PM, "Jagane Sundar" <[EMAIL PROTECTED]> wrote:
> Which proposed distro of Hadoop, 0.20.206 or 0.22, is likely to be the better platform for hosting HBase?
> [...]
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Jagane Sundar 2011-10-05, 23:20

Hello Milind,

A large part of why I sent this email out was to initiate a discussion of the priority of specific features in a Hadoop distro. For example, if we had a distro with support for the following features:

1. HBase support, i.e. working scale tested Append and Hflush in HDFS
2. Built in support for the cloud. (Whirr is interesting. Ambari more so, but both fall short.)
3. Assumption that 10GbE is around the corner (really, this time), and hence storage locality is irrelevant
4. Storage efficiency is important. Alternatives to a 3 replica HDFS, such as erasure code, should be first class citizens in this distro.
5. H/A for the NN

Such a distro would be an outstanding thing for the Hadoop community. I think 0.20.20x is the closest to this, but I am not sure.

My hope is that this discussion will get some input from users of Hadoop. I may be wrong, as this may be the wrong forum for this discussion. (The only thing I really accomplished was to evoke a hurried and semi-infuriated Sunday afternoon private email response from some key players in the Hadoop community).

My ultimate goal is to influence the product managers at Hadoop startups and established companies to assign high priorities to these items.

In short, I don't own the whip, the buggy, or the horse ... but I am trying to crack the whip. :-)

Milind - I do look forward to your input as to the importance of these features, and whether these are feasible in one of the source branches in the near future.

Cheers,
Jagane

On Wed, Oct 5, 2011 at 3:55 PM, <[EMAIL PROTECTED]> wrote:
> I think you have forgotten one major deciding factor:
>
> Which version is *your* vendor committed to support?
> [...]
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Milind.Bhandarkar@... 2011-10-05, 23:55

Jagane,

I understand your use case, I think, and so here are my thoughts, inline:

> 1. Hbase support, i.e. working scale tested Append and Hflush in HDFS

Absolutely. HBase (and other components of the stack that do not follow the MapReduce paradigm) are increasingly important. It is important to realize that as Hadoop gains popularity, people will look at consolidating their workloads, and are going to need at least the baseline features such as append and flush to achieve that.

> 2. Built in support for the cloud. (Whirr is interesting. Ambari more so, but both fall short.)

Not very sure. If "support for the cloud" means the ability to provision atop a hypervisor, adding or removing instances etc., I think there are other approaches proven in the industry.

> 3. Assumption that 10GBE is around the corner (really, this time), and hence storage locality is irrelevant

Yes, I have been shouting from the rooftops about this for quite some time now.

> 4. Storage efficiency is important. Alternatives to a 3 replica HDFS, such as erasure code, should be first class citizens in this distro.

Absolutely. Usable space is much more important than raw space.

> 5. H/A for the NN

Yes, it's a must. Some proprietary file systems that provide the o.a.h.f.FileSystem API have this feature already, and have been getting a lot of positive press recently.

> Such a distro would be an outstanding thing for the Hadoop community. I think 0.20.20x is the closest to this, but I am not sure.

Other than the merge of 0.20-append patches into 0.20.205, I am not aware of any other changes that address any of your requirements 1-5.

> My hope is that this discussion will get some input from users of Hadoop. I may be wrong, as this may be the wrong forum for this discussion. (The only thing I really accomplished was to evoke a hurried and semi-infuriated Sunday afternoon private email response from some key players in the Hadoop community).

Yeah, some key players in the Hadoop community are infuriated on Sunday afternoons, based on my informal sentiment analysis of Twitter streams. ;-)

> My ultimate goal is to influence the product managers at Hadoop startups and established companies to assign high priorities to these items.

Believe me, I know some product managers at Hadoop startups and established companies who have a slide highlighting most of the above already.

> In short, I don't own the whip, the buggy, or the horse ... but I am trying to crack the whip. :-)

Ha ha! Interesting analogy. But this is the open-source world. Here no one "owns" (or at least, is not supposed to own) the whip, buggy, or horse. So, you are not alone :-)

> Milind - I do look forward to your input as to the importance of these features, and whether these are feasible in one of the source branches in the near future.

Indeed, these are feasible. Indeed these are important, and indeed they will be in one of the source branches in future. I don't know about *near* future, though.

- Milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Jagane Sundar 2011-10-06, 02:00

Thanks for your input, Milind. It's very useful and interesting.

In the interest of brevity, I have truncated most of it except for the point regarding 'cloud friendly'. I have done some research into this, and want to get some more community feedback.

>> 2. Built in support for the cloud. (Whirr is interesting. Ambari more so, but both fall short.)
>
> Not very sure. If by "support for the cloud" means ability to provision atop a hypervisor, adding or removing instances etc, I think there are other approaches proven in the industry.

There are two aspects to cloud friendliness - deployment technologies/automation, and storage.

As far as deployment automation is concerned, I am eager to know what other approaches you are familiar with. Chef/Puppet et al. are not interesting to me. I want this to have end user self-serve service characteristics, not 'end users file ticket, sysadmin runs [chef|puppet|other] script'.

Storage is very interesting. My own thoughts, from analyzing EC2 and EMR are as follows. (A lot of the following is speculation and educated guesswork, so I may be totally off, but here it is anyway):

Amazon's philosophy is totally 'on-demand bring up when needed and tear down when done'. I like this philosophy a lot. However it does not work well for storage. Storage needs to be always up and available. Hence, they took Hadoop, stripped off HDFS and built a shim to S3, their object storage service. There is no POSIX there. Map Reduce jobs run in VMs that are brought up on demand, and access the S3 hosted files using the protocol s3n (n stands for native - that's native to S3, not native to Hadoop). When this turned out to be slow as sh**, they seem to have hacked the HDFS layer some more, in order to actually have a NameNode for metadata, but to use S3 for storing blocks. They have a protocol s3 to access this. Both of these approaches have one severe failing - they do not support Append and Hflush. Ergo - no HBase on EMR. I am sure they are working furiously to address this shortcoming and add append/hflush support to s3n or s3, in order to make it possible to run HBase on EMR. In the meantime, anecdotal evidence suggests that at least half of Amazon's customers are opting to use Apache Hadoop on EC2 VMs with EBS storage (completely bypassing the EMR offering). EBS itself is an interesting storage technology. It is block storage offered over the Ethernet network, from an occasionally sync'd local disk elsewhere. EBS has some storage resiliency built in, so the question of how many replicas are needed when HDFS is built on top of this is very interesting.

This problem of offering a cost effective Hadoop as an on-demand self service offering in the cloud is very interesting. This is a nut I want to crack....

Sorry about the long rant, and again, it is all in the hope that I can evoke some postings from people who know more about this than I do.

Thanks,
Jagane
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Roman Shaposhnik 2011-10-06, 02:38

On Wed, Oct 5, 2011 at 7:00 PM, Jagane Sundar <[EMAIL PROTECTED]> wrote:
> As far as deployment automation is concerned, I am eager to know what other approaches you are familiar with. Chef/Puppet et. al. are not interesting to me. I want this to have end user self-serve service characteristics, not 'end users file ticket, sysadmin runs [chef|puppet|other] script'.

We have a fully automated continuous deployment based on Puppet running test clusters in Bigtop. It is not perfect, but it works for us. We'll post the code soon (it just needs to be cleaned up a bit): https://issues.apache.org/jira/browse/BIGTOP-95

Thanks,
Roman.

P.S. For all those interested in a 100% open source deployment solution, I would totally recommend using Chef/Puppet for any cluster up to 500 nodes or so. I do not understand why Chef/Puppet is given a bad rap when it comes to deploying Hadoop stacks. These tools are way more mature than any other open source alternative I've seen.
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Konstantin Boudnik 2011-10-06, 05:09

On Wed, Oct 05, 2011 at 07:00PM, Jagane Sundar wrote:
> approaches you are familiar with. Chef/Puppet et. al. are not interesting to

Is this a technical lack of interest, as in these solutions do not perform as you expect them to, or is this a policy thing of some kind?

> turned out to be slow as sh**, they seem to have hacked the HDFS layer some more, in order to actually have a NameNode for metadata, but to use S3 for storing blocks. They have a protocol s3 to access this. Both of these approaches have one severe failing - they do not support Append and Hflush. ergo - no HBase on EMR. I am sure they are working furiously to address this

I wonder if you can delve into these details: is it an inherent problem of the s3 protocol, or something unrelated to the technicalities?

Appreciate your feedback,
Cos
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Jagane Sundar 2011-10-06, 05:40

On Wed, Oct 5, 2011 at 10:09 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
> On Wed, Oct 05, 2011 at 07:00PM, Jagane Sundar wrote:
>> approaches you are familiar with. Chef/Puppet et. al. are not interesting to
>
> Is this a technical lack of interest as in these solutions do not perform as you expect them or this is a policy thing of some kind?

No policy or anything of that sort. It's a personal preference. Chef, Puppet, etc. are not full feedback systems. They keep doing the same thing over and over again, trying to get the system into a 'desired' state. A state machine driven full feedback system works better. When things go wrong, that information can be acted upon.

>> turned out to be slow as sh**, they seem to have hacked the HDFS layer some more, in order to actually have a NameNode for metadata, but to use S3 for storing blocks. They have a protocol s3 to access this. Both of these approaches have one severe failing - they do not support Append and Hflush. ergo - no HBase on EMR. I am sure they are working furiously to address this
>
> I wonder if you can delve into these details: is it an inherit problem of s3 protocol or something irrelevant to the technicalities?

I don't know nearly enough. I would speculate that it is because of S3's roots as an HTTP-based system. It was mostly REST and SOAP APIs that S3 used to publish. I know that people have built full-blown FUSE filesystems using S3 as the backend, but these tend to be used as a replacement for scp and ftp, not necessarily for running applications that need full POSIX.

Internally, there are probably other APIs that are available to EMR, but still, it feels like they may be stressing S3 in ways that are not natural to it.

Jagane
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Konstantin Boudnik 2011-10-06, 05:58

On Wed, Oct 05, 2011 at 10:40PM, Jagane Sundar wrote:
> On Wed, Oct 5, 2011 at 10:09 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
>> On Wed, Oct 05, 2011 at 07:00PM, Jagane Sundar wrote:
>>> approaches you are familiar with. Chef/Puppet et. al. are not interesting to
>>
>> Is this a technical lack of interest as in these solutions do not perform as you expect them or this is a policy thing of some kind?
>
> No policy or anything of that sort. It's a personal preference. Chef, puppet, etc. are not full feedback systems. They keep doing the same thing over and over again trying to to get the system into a 'desired' state. A state machine driven full feedback system works better. When things go wrong, that information can be acted upon.

It might be considered a shortcoming or a design benefit, depending on one's angle. I don't want to start a religious war about this, apparently ;)

>>> turned out to be slow as sh**, they seem to have hacked the HDFS layer some more, in order to actually have a NameNode for metadata, but to use S3 for storing blocks. They have a protocol s3 to access this. Both of these approaches have one severe failing - they do not support Append and Hflush. ergo - no HBase on EMR. I am sure they are working furiously to address this
>>
>> I wonder if you can delve into these details: is it an inherit problem of s3 protocol or something irrelevant to the technicalities?
>
> I don't know nearly enough. I would speculate that it is because of S3's roots as a HTTP based system. It was mostly REST and SOAP Apis that S3 used to publish. I know that people have built full blown FUSE filesystems using

That makes sense. HTTP does support chunked uploads (e.g. multipart), but of course that doesn't seem to be enough for append's needs.

Thanks,
Cos
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Steve Loughran 2011-10-06, 09:54

On 06/10/11 06:40, Jagane Sundar wrote:
> On Wed, Oct 5, 2011 at 10:09 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
>> On Wed, Oct 05, 2011 at 07:00PM, Jagane Sundar wrote:
>>> approaches you are familiar with. Chef/Puppet et. al. are not interesting to
>>
>> Is this a technical lack of interest as in these solutions do not perform as you expect them or this is a policy thing of some kind?
>
> No policy or anything of that sort. It's a personal preference. Chef, puppet, etc. are not full feedback systems. They keep doing the same thing over and over again trying to to get the system into a 'desired' state. A state machine driven full feedback system works better. When things go wrong, that information can be acted upon.

You've just started the script vs goal-seeking CM war.

Goal-seeking has better recovery, but can jitter between different desired states in the classic strange attractor pattern. It federates well, and if you look at intranet and internet routing, it's roughly what happens there, though BGP lets autonomous networks make their own policy decisions.

Scripts work if the starting state is always the same; they put you in the same final state. Usually.

For more details, see: http://www.slideshare.net/steve_l/dynamic-hadoop-clusters

In large clusters you want consistent machine state, and that is where HPC-style solutions win: they can work with ILO nets to do things like BIOS upgrades, and install the RPM set in a strict order to ensure the final state is the same. (RPMs contain scripts, remember.)

DevOps-style tooling is good for dynamic clusters, especially on cloud infrastructure, which is the most dynamic, and where you can manage a single starting image, which makes it implicitly consistent across all nodes.

-steve
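The script vs goal-seeking distinction above can be shown with a toy model. This is only a sketch under stated assumptions: machine state is a plain dict, and every name here (`DESIRED`, `run_script`, `converge`, the keys and values) is hypothetical. It does not reflect how Chef, Puppet, or SmartFrog are actually implemented, and it does not model the jitter between competing desired states.

```python
# Hypothetical desired end state for a node.
DESIRED = {"hadoop_version": "0.20.205", "conf_dir": "/etc/hadoop"}

def install_hadoop(state):
    state["hadoop_version"] = "0.20.205"

def set_conf_dir(state):
    # Only fills the value in when it is missing, so a leftover
    # setting from an earlier install survives the script.
    state.setdefault("conf_dir", "/etc/hadoop")

SCRIPT = [install_hadoop, set_conf_dir]

def run_script(state, steps=SCRIPT):
    """Script style: apply fixed steps in order, whatever the starting state."""
    for step in steps:
        step(state)
    return state

def converge(state, desired=DESIRED, max_rounds=5):
    """Goal-seeking style: observe, compare with the goal, repair, repeat."""
    for _ in range(max_rounds):
        drift = {k: v for k, v in desired.items() if state.get(k) != v}
        if not drift:
            return True   # observed state matches the goal
        state.update(drift)
    return False          # gave up without converging

clean = run_script({})
dirty = run_script({"conf_dir": "/opt/old-hadoop"})
print(clean == dirty)              # the script's result depends on where it started
print(converge(dirty), dirty == clean)
```

Starting every node from one image, as described above for cloud infrastructure, is what makes the script style safe: the starting state is then the same everywhere.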
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Steve Loughran 2011-10-06, 09:48

On 06/10/11 03:00, Jagane Sundar wrote:
> Thanks for your input, Milind. It's very useful and interesting.
>
> In the interest of brevity, I have truncated most of it except for the point regarding 'cloud friendly'. I have done some research into this, and want to get some more community feedback.
>
>>> 2. Built in support for the cloud. (Whirr is interesting. Ambari more so, but both fall short.)
>>
>> Not very sure. If by "support for the cloud" means ability to provision atop a hypervisor, adding or removing instances etc, I think there are other approaches proven in the industry.

I've just started the wiki page on this topic: http://wiki.apache.org/hadoop/Virtual%20Hadoop

> There are two aspects to cloud friendliness - deployment technologies/automation, and storage.

- agility to handle the failure modes of cloud infrastructure
- security in a shared infrastructure
- flexibility based on demand

> As far as deployment automation is concerned, I am eager to know what other approaches you are familiar with. Chef/Puppet et. al. are not interesting to me. I want this to have end user self-serve service characteristics, not 'end users file ticket, sysadmin runs [chef|puppet|other] script'.

I've done this with a web UI: ask for the number of machines, bring up an NN/JT/single-DN master node, and once that is up bring up the workers with a config that includes the hostname of the master node. Also deployed was a web UI for long-haul job submission: http://www.slideshare.net/steve_l/long-haul-hadoop

That was originally all deployed with the SmartFrog framework and a modified version of the Hadoop codebase for tighter integration; these days I just untar and reconfigure a 0.20.20x .tar.gz file on the target machines, patching in the late binding information.

> Storage is very interesting. My own thoughts, from analyzing EC2 and EMR are as follows. (A lot of the following is speculation and educated guesswork, so I may be totally off, but here it is anyway):
>
> Amazon's philosophy is totally 'on-demand bring up when needed and tear down when done'. I like this philosophy a lot. However it does not work well for storage. Storage needs to be always up and available. Hence, they took Hadoop, stripped off HDFS and built a shim to S3, their object storage service. There is no posix there. Map Reduce jobs run in VMs that are brought up on demand, and access the S3 hosted files using the protocol s3n (n stands for native - that's native to s3 not native to Hadoop). When this turned out to be slow as sh**, they seem to have hacked the HDFS layer some more, in order to actually have a NameNode for metadata, but to use S3 for storing blocks. They have a protocol s3 to access this. Both of these approaches have one severe failing - they do not support Append and Hflush. ergo - no HBase on EMR. I am sure they are working furiously to address this shortcoming and add append/hflush support to s3n or s3, in order to make it possible to run HBase on EMR. In the meantime, anecdotal evidence suggests that at least half of Amazon's customers are opting to use Apache Hadoop on EC2 VMs with EBS storage (completely bypassing the EMR offering).

More expensive, but more flexible in terms of what you can run.

> EBS itself is an interesting storage technology. It is block storage offered over the ethernet network, from an occassionally sync'd local disk elsewhere. EBS has some storage resiliency built in, so the question of how many replicas when HDFS is built on top of this is very interesting.
>
> This problem of offering a cost effective Hadoop as an on-demand self service offering in the cloud is very interesting. This is a nut I want to crack....

Summary: I'm not sure that HDFS is the right FS in this world, as it contains a lot of assumptions about system stability and HDD persistence that aren't valid any more. With the ability to plug in new placers you could do tricks like ensure 1 replica lives in a persistent blockstore (and rely on it always being there), and add other replicas in transient storage if the data is about to be needed in jobs.
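The placement trick in the summary can be sketched as a small policy function. This is a toy illustration only: `place_replicas`, the node dictionaries, and the `needed_soon` flag are all hypothetical names, and the code is not HDFS's block placement interface.

```python
def place_replicas(nodes, replication, needed_soon):
    """Pick target nodes for one block.

    nodes: list of dicts like {"name": ..., "persistent": bool}
           (hypothetical description of the available storage).
    One replica always goes to persistent storage; the remaining
    replicas go to transient storage only when a job is about to
    read the data.
    """
    persistent = [n for n in nodes if n["persistent"]]
    transient = [n for n in nodes if not n["persistent"]]
    if not persistent:
        raise ValueError("no persistent blockstore available")

    targets = [persistent[0]["name"]]
    if needed_soon:
        extra = transient[: max(replication - 1, 0)]
        targets.extend(n["name"] for n in extra)
    return targets

NODES = [
    {"name": "ebs-1", "persistent": True},
    {"name": "vm-1", "persistent": False},
    {"name": "vm-2", "persistent": False},
    {"name": "vm-3", "persistent": False},
]

print(place_replicas(NODES, replication=3, needed_soon=False))
print(place_replicas(NODES, replication=3, needed_soon=True))
```

A real policy would also have to decide when to drop the transient replicas again and what to do if the persistent store is briefly unreachable; neither is modelled here.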
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Milind.Bhandarkar@... 2011-10-06, 16:49

Steve,

> Summary: I'm not sure that HDFS is the right FS in this world, as it contains a lot of assumptions about system stability and HDD persistence that aren't valid any more. With the ability to plug in new placers you could do tricks like ensure 1 replica lives in a persistent blockstore (and rely on it always being there), and add other replicas in transient storage if the data is about to be needed in jobs.

Can you please shed more light on the statement "... as it contains a lot of assumptions about system stability and HDD persistence that aren't valid any more..."?

I know that you were doing some analysis of disk failure modes some time ago. Is this the result of that research? I am very interested.

Thanks,

- milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
-
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?Steve Loughran 2011-10-07, 09:17
On 06/10/2011 17:49, [EMAIL PROTECTED] wrote:
> Steve,
>
>> Summary: I'm not sure that HDFS is the right FS in this world, as it
>> contains a lot of assumptions about system stability and HDD persistence
>> that aren't valid any more. With the ability to plug in new placers you
>> could do tricks like ensure 1 replica lives in a persistent blockstore
>> (and rely on it always being there), and add other replicas in transient
>> storage if the data is about to be needed in jobs.
>
> Can you please shed more light on the statement "... as it
> contains a lot of assumptions about system stability and HDD persistence
> that aren't valid any more..." ?
>
> I know that you were doing some analysis of disk failure modes sometime
> ago. Is this the result of that research ? I am very interested.

No, it's unrelated: it comes from experience in hosting virtual Hadoop infrastructures, which is how my short-lived clusters exist today.

-You don't know the hostname of the master nodes until they are allocated, so you need to allocate them and dynamically push out configs to the workers.
-The DataNodes spin forever when the NameNode goes down, rather than checking somewhere to see if it's changed. HDFS HA may fix that.
-It's dangerously easy to have >1 DN on the same physical host, losing independence of that replica.
-It's possible for the entire cluster to go down without warning.

MR-layer issues

-Again, the TaskTrackers spin when the JT goes down, rather than looking to see if it's moved.
-Blacklisting isn't the right way to deal with TaskTracker failures: termination of the VM is.
-If the TTs are idle, VM termination may be the best action.

Hadoop is optimised for large physical clusters. If you look at the Stratosphere work at TU Berlin, they've designed something that includes VM allocation in the execution plan.

You can improve Hadoop to make it more agile; my defunct Hadoop lifecycle branch did a lot of that, but you have to have everyone else using Hadoop be willing to let the changes go in, and those changes mustn't impose a cost or risk to the physical cluster model.
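To illustrate the point about daemons spinning against a master that has gone away, here is a minimal Python sketch (not Hadoop code) of a worker that looks up the master's address again before every reconnect attempt instead of retrying a fixed hostname. The names `connect_with_rediscovery`, `resolve_master` and `connect` are hypothetical, and where the lookup reads from (a registry, DNS, a config service) is an assumption left to the caller.

```python
import time


def connect_with_rediscovery(resolve_master, connect, max_attempts=5, delay_s=1.0):
    """Connect to the master, re-resolving its address before each attempt.

    resolve_master: hypothetical callable returning the master's current address.
    connect: hypothetical callable that opens a connection to an address and
             raises ConnectionError when the master is unreachable.
    """
    last_error = None
    for _ in range(max_attempts):
        # Ask the lookup service again on every attempt, so a master that
        # was re-allocated under a new hostname is picked up.
        address = resolve_master()
        try:
            return connect(address)
        except ConnectionError as err:
            last_error = err
            time.sleep(delay_s)
    raise ConnectionError(
        f"master unreachable after {max_attempts} attempts"
    ) from last_error
```

The loop gives up after `max_attempts` rather than spinning forever, which leaves the caller free to treat the failure as a signal to terminate the VM.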
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Milind.Bhandarkar@... 2011-10-07, 16:23
Steve,
>you can improve Hadoop to make it more agile; my defunct Hadoop
>lifecycle branch did a lot of that, but you have to have everyone else
>using Hadoop to be willing to let the changes go in -and those changes
>mustn't impose a cost or risk to the physical cluster model.

Until Hadoop 0.20, when Hadoop On Demand (HoD) was in widespread use, quickly bringing up a MapReduce cluster, and making it go away quickly, was an explicit goal. After that, focus shifted to multi-tenancy for MR in Hadoop.

When HoD went away, I made a comment on one of the internal mailing lists that it would make a comeback when VMs become first-class citizens of the Hadoop world. I have heard of several efforts from well-known vendors *wink* to make this happen.

I have been looking closely at the defunct HoD code to see if it can still be used, but with the new MRv2 architecture, it looks like that will require major surgery. We can have the RM allocate containers, and should be able to run a custom MR runtime there (essentially replacing Torque in HoD with the RM).

Is this something you had in mind too?

- milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Konstantin Boudnik 2011-10-07, 19:05
On Fri, Oct 07, 2011 at 10:17AM, Steve Loughran wrote:
> On 06/10/2011 17:49, [EMAIL PROTECTED] wrote:
>> Steve,
>>
>>> Summary: I'm not sure that HDFS is the right FS in this world, as it
>>> contains a lot of assumptions about system stability and HDD persistence
>>> that aren't valid any more. With the ability to plug in new placers you
>>> could do tricks like ensure 1 replica lives in a persistent blockstore
>>> (and rely on it always being there), and add other replicas in transient
>>> storage if the data is about to be needed in jobs.
>>
>> Can you please shed more light on the statement "... as it
>> contains a lot of assumptions about system stability and HDD persistence
>> that aren't valid any more..." ?
>>
>> I know that you were doing some analysis of disk failure modes sometime
>> ago. Is this the result of that research ? I am very interested.
>
> no, it's unrelated -experience in hosting virtual hadoop
> infrastructures. Which is how my short-lived clusters exist today
>
> -you don't know the hostname of the master nodes until allocated, so you
> need to allocate them and dynamically push out configs to the workers

This is of course a big win for non-autodiscoverable architecture ;)

> -the Datanodes spin when the namenode goes down, forever, rather than
> checking somewhere to see if its changed. HDFS HA may fix that.
..
> -again, the TaskTrackers spin when the JT goes down, rather than look to
> see if its moved.
..
> -Blacklisting isn't the right way to deal with task tracker failures:
> termination of VM is.

See my comment above. Auto-discovery would solve a lot of these issues and many others, such as shared distributed memory suitable for config management etc.

Cos
Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Roman Shaposhnik 2011-10-06, 02:33
Hi Jagane!
On Wed, Oct 5, 2011 at 4:20 PM, Jagane Sundar <[EMAIL PROTECTED]> wrote:
> For example, if we had a distro with support for the following features:

At the risk of repeating myself, I'd like to point out that even though your ideal distro may not exist at the moment, we, as a community, have all the tools in place to build/validate/package and release any kind of Hadoop stack. This is what Project Bigtop aims at solving.

> Such a distro would be an outstanding thing for the Hadoop community. I
> think 0.20.20x is the closest to this, but I am not sure.

Right now we're trying to build a distro as a release of Bigtop 0.2.0 that could be close to what you need. It is likely to be based on Hadoop 0.22. You can monitor the progress of our work (which will be slowish this week and next) over here:

http://bigtop01.cloudera.org:8080/job/Bigtop-hadoop22-smoketest/

This is a link to the smoke testing that we do on the entire stack of components based on .22. The number of tests we execute is rather small at the moment and the cluster is simply a continuous deployment of our stack on a handful of EC2 nodes, but it is a good start. And it is totally transparent and open to contributions!

> My ultimate goal is to influence the product managers at Hadoop startups and
> established companies to assign high priorities to these items.

Well, unless one of the big players is willing to chime in and solve your problems with an existing distribution, I think the next best thing is to approach it as an exercise of building what you want through a community process. This is not an appropriate list to discuss the details (the Bigtop mailing list is), but it is appropriate for soliciting interested parties (who would then migrate to Bigtop's mailing list ;-)). If we can get enough folks willing to contribute towards the goal that you've outlined -- I'm sure it'll happen pretty soon (if not -- there are a number of organizations on this list who would be happy to talk to you about a commercial support contract).

If there's interest -- I'd like to mention that the biggest area where we need help at the moment is contributing integration tests to our test base. Talk to me either on the Bigtop mailing list or in private.

Thanks,
Roman.

P.S. And did I mention the Bigtop mailing list? It is: [EMAIL PROTECTED]
Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Jagane Sundar 2011-10-02, 23:15
Hello Hadoop experts,
I would like to solicit your input in answering this question. Which proposed distro of Hadoop, 0.20.206 or 0.22, is likely to be the better platform for hosting HBase? My requirements are as follows:

1. The Hadoop must support both HBase and MR jobs in the same cluster. At the very least, MR should be stable and usable for data extraction and transformation from external sources. Ideally, there should be no limits on the types of MR jobs that can be run on the HBase cluster. To the best of my understanding, this implies robust and stable Append and Hflush in HDFS, correct?

2. I want to scale storage independently from compute. For example, if my dataset is 1PB, I expect to make a three replica HDFS cluster of ~150 machines with 24TB each. As for MR and HBase compute, I may want to run anywhere from 50 to 200 machines. Perhaps even scaled on demand, i.e. bring up more machines into the MR cluster when there is more work to be done, and bring down some machines when there is less demand. I think that the MR1 Jobtracker can deal with machines coming in and going out well, but I am not too sure of how HBase works under such dynamic conditions. This example also indicates the scale that I am most interested in - 1 to 2 PB of data, with a dynamically varying compute requirement. Will my choice of 0.20.206 or 0.22 affect any of this?

3. Cloud (EC2 or some similar homebrew) friendly: I am talking about hosting HBase in HDFS on EBS volumes, not HBase on s3 accessed using the s3n protocol, or HBase on HDFS with blocks stored in S3 and accessed using the s3 protocol. There are two vectors to this - the storage itself, i.e. storage performance and efficiency, and the deployment mechanism - whirr or Ambari or pre-built AMIs with scripts cobbled together. Which release is likely to have out-of-the-box support for HBase on HDFS in EBS volumes, and for whirr/Ambari/AMIs?

4. Support for data efficiency improvements such as Erasure Coding (https://issues.apache.org/jira/browse/HDFS-503). Keeping 3 replicas of big data feels like an expensive proposition. Will 0.20.206 or 0.22 include the above patch as part of the base distro, or at least as an easy to add binary module of some kind?

5. Compatibility with future versions of Hadoop: If I make the (tenuous) argument that data locality does not matter much, that I have 4Gbps from each node, that I have 40 Gbps up from each rack, can I separate the storage from the compute? What I mean is this: I may want to upgrade HDFS less frequently than MR or HBase. So, is there a snowball's chance in hell of running HDFS 0.20.206 or 0.22 against MR 0.23 and HBase-whatever-comes-next-year?

Thanks in advance, and cheers to a vibrant healthy Hadoop community,
Jagane
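The storage arithmetic behind points 2 and 4 can be sketched in a few lines of Python. The 1 PB dataset, three replicas and 24 TB per machine come from the message; everything else is an assumption made for illustration: 1 PB is taken as 1000 TB, and the Reed-Solomon layout of 10 data blocks plus 4 parity blocks is only an example erasure-coding scheme, not something stated in this thread or attributed here to HDFS-503. The helper names are hypothetical.

```python
import math
from fractions import Fraction


def raw_capacity_tb(dataset_tb, overhead):
    """Raw disk (TB) needed to hold dataset_tb at a given storage overhead factor."""
    return dataset_tb * overhead


def machines_needed(raw_tb, per_machine_tb):
    """Smallest whole number of machines whose disks can hold raw_tb."""
    return math.ceil(Fraction(raw_tb) / per_machine_tb)


DATASET_TB = 1000      # 1 PB, taken as 1000 TB (assumption)
PER_MACHINE_TB = 24    # per-machine disk, from the message

schemes = {
    "3x replication": Fraction(3),                     # three full copies
    "RS(10+4), assumed example": Fraction(10 + 4, 10), # 14 blocks stored per 10 data blocks
}

for name, overhead in schemes.items():
    raw = raw_capacity_tb(DATASET_TB, overhead)
    count = machines_needed(raw, PER_MACHINE_TB)
    print(f"{name}: {float(raw):.0f} TB raw, at least {count} machines")
```

Under these assumptions, three replicas need 3000 TB of raw disk, so the ~150 machines in the message (3600 TB) leave some headroom, while the assumed 10+4 layout would need 1400 TB. The sketch ignores space for intermediate MR data, OS overhead and free-space margins.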