Martinus Martinus 2012-05-22, 06:24
Hi,
Is there any hadoop HA distribution out there?
Thanks.
Todd Lipcon 2012-05-22, 06:26
Martinus Martinus 2012-05-22, 07:08
Hi Todd,

Thanks for your answer. Will that have the same capability as the commercial M5 of MapR: http://www.mapr.com/products/why-mapr ?

Thanks.

On Tue, May 22, 2012 at 2:26 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hi Martinus,
>
> Hadoop HA is available in Hadoop 2.0.0. This release is currently
> being voted on in the community.
>
> You can read more here:
> http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
>
> -Todd
>
> On Mon, May 21, 2012 at 11:24 PM, Martinus Martinus
> <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Is there any hadoop HA distribution out there?
> >
> > Thanks.
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
Ted Dunning 2012-05-22, 08:13
No. 2.0.0 will not have the same level of HA as MapR. Specifically, the job tracker hasn't been addressed and the name node issues have only been partially addressed.

On May 22, 2012, at 8:08 AM, Martinus Martinus <[EMAIL PROTECTED]> wrote:
> Hi Todd,
>
> Thanks for your answer. Will that have the same capability as the commercial M5 of MapR: http://www.mapr.com/products/why-mapr ?
>
> Thanks.
Todd Lipcon 2012-05-22, 07:15
On Tue, May 22, 2012 at 12:08 AM, Martinus Martinus <[EMAIL PROTECTED]> wrote:
> Hi Todd,
>
> Thanks for your answer. Will that have the same capability as the
> commercial M5 of MapR: http://www.mapr.com/products/why-mapr ?

I can't speak to a closed source product's feature set. But the 2.0.0 release has failover support between an active and a passive namenode, and an upcoming release will include automatic failover using Apache ZooKeeper for failure detection and coordination. These have been tested significantly under HBase workloads and should fail over quickly and seamlessly, based on our testing results. Furthermore, they are Apache 2 licensed open source, free of vendor lock-in.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
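
The active/passive failover with a coordination service described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Coordinator` class, the node names `nn1` and `nn2`, and the integer clock are hypothetical and are not the ZooKeeper or HDFS API. It shows the general idea of a lock that expires when the active node stops heartbeating, so that the standby can take over.

```python
class Coordinator:
    """Toy stand-in for a coordination service that holds one expiring lock.

    Hypothetical illustration only; not the ZooKeeper API.
    """

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.holder = None          # node currently considered active
        self.last_heartbeat = None  # time of the holder's last heartbeat

    def heartbeat(self, node, now):
        """Renew the lock if `node` holds it; take it if it is free or expired.

        Returns True when `node` is the active node after this call.
        """
        expired = (
            self.holder is not None
            and now - self.last_heartbeat > self.session_timeout
        )
        if self.holder is None or expired or self.holder == node:
            self.holder = node
            self.last_heartbeat = now
        return self.holder == node


def simulate(crash_time, end_time, session_timeout=3):
    """Run two nodes against the coordinator; nn1 stops at `crash_time`."""
    coord = Coordinator(session_timeout)
    history = []
    for now in range(end_time):
        if now < crash_time:
            coord.heartbeat("nn1", now)  # active node, until it crashes
        coord.heartbeat("nn2", now)      # standby keeps trying to take over
        history.append(coord.holder)
    return history


if __name__ == "__main__":
    # Print which node is active at each tick of the toy clock.
    print(simulate(crash_time=2, end_time=8))
```

The standby only becomes active after the timeout has elapsed since the last heartbeat from the crashed node, which is why failover time in designs of this kind depends on how aggressively the timeout is set.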
M. C. Srivas 2012-05-23, 04:45
On Tue, May 22, 2012 at 12:08 AM, Martinus Martinus <[EMAIL PROTECTED]> wrote:
> Hi Todd,
>
> Thanks for your answer. Will that have the same capability as the
> commercial M5 of MapR: http://www.mapr.com/products/why-mapr ?
>
> Thanks.

Hi Martinus, some major differences in HA between MapR's M5 and Apache Hadoop:

1. With M5, any node can become master at any time. It is a fully active-active system. You can create a fully bomb-proof cluster, such that in a 20-node cluster, you can configure it to survive even if 19 of the 20 nodes are lost. With Apache, it is a 1-1 active-passive system.

2. M5 does not require an NFS filer in the backend. Apache Hadoop requires a NetApp or similar NFS filer to assist in saving the NN data, even in its HA configuration. Note that for true HA, the NetApp or similar will also need to be HA.

3. M5 has full HA for the Job-Tracker as well.

Of course, HA is only a small part of the total business continuity story. Full recovery in the face of any kind of failure is critical.

With M5:

- If there is a complete cluster crash and reboot (e.g., a full power failure of the entire cluster), M5 will recover in 5-10 minutes, and submitted jobs will resume from where they were.

- If you upgrade your software and it corrupts data, M5 provides snapshots to help you recover. The number of times I've seen someone running "hadoop fs -rmr /" accidentally and asking for help on this mailing list is beyond counting. With M5, it is completely recoverable.

- Full disaster recovery across clusters by mirroring.

Hope that clarifies some of the differences.
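
To make the snapshot point above concrete, here is a minimal sketch of why a point-in-time snapshot turns a recursive delete into a recoverable event. Everything in it is an assumption for illustration: `ToyFileSystem` and its methods are hypothetical, an in-memory dictionary stands in for the namespace, and nothing here reflects how M5 or HDFS actually implements snapshots.

```python
class ToyFileSystem:
    """In-memory toy namespace with named point-in-time snapshots.

    Hypothetical illustration only; not a MapR or HDFS interface.
    """

    def __init__(self):
        self.files = {}      # path -> contents
        self.snapshots = {}  # snapshot name -> frozen copy of the namespace

    def write(self, path, data):
        self.files[path] = data

    def snapshot(self, name):
        # Copy the path table so later deletes cannot touch the snapshot.
        self.snapshots[name] = dict(self.files)

    def rmr(self, prefix):
        """Recursively delete everything under `prefix`; return the count."""
        doomed = [p for p in self.files if p.startswith(prefix)]
        for p in doomed:
            del self.files[p]
        return len(doomed)

    def restore(self, name):
        # Roll the live namespace back to the named snapshot.
        self.files = dict(self.snapshots[name])


if __name__ == "__main__":
    fs = ToyFileSystem()
    fs.write("/data/a", "alpha")
    fs.write("/logs/b", "beta")
    fs.snapshot("before-upgrade")
    fs.rmr("/")                    # the accidental delete
    fs.restore("before-upgrade")   # the namespace is back
    print(sorted(fs.files))
```

Without the `snapshot` call there is nothing to roll back to, which is the situation the accidental-delete reports in this thread describe.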
Konstantin Boudnik 2012-05-25, 15:03
BTW, Srivas,

I could find a single "countless" example of a 'hadoop fs -rmr' horror story, in the form of a hypothetical question (and not on this list ;)
http://is.gd/55KD1E

Just for the sake of full disclosure, of course.

Enjoy,
Cos
M. C. Srivas 2012-05-26, 15:44
On Fri, May 25, 2012 at 8:03 AM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
> BTW, Srivas,
>
> I could find a single "countless" example of a 'hadoop fs -rmr' horror
> story, in the form of a hypothetical question (and not on this list ;)
> http://is.gd/55KD1E

Hi Cos, accidentally deleting files is one of the most common user errors. Here's a real one from just last month:

http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201204.mbox/%3CCAPwpkBvEx4OTUbf6mf8t43oOjZM%2BExUths7XNn3UidqsN3Y8hA%40mail.gmail.com%3E

As Patrick says in the follow-up, the only way to recover in this situation is to shut down the cluster:

http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201204.mbox/%3CCANS822ga1ivAPi2C9PJsyz6nZgft4msKkH%3Dyj06-i_V%2Bu1B1AA%40mail.gmail.com%3E

In fact, the above procedure is well-known and well-documented. Here's even an excerpt from Jason's book Pro Hadoop, where he says "it is not uncommon for a user to accidentally delete large portions of the HDFS file system due to a program error or a command-line error ... best bet is to terminate the NN and 2-N immediately, and then shutdown the DNs as fast as possible":

http://books.google.com/books?id=8DV-EzeKigQC&pg=PA122&lpg=PA122&dq=how+to+recover+deleted+files+%2B+hadoop&source=bl&ots=prgSMk1SHL&sig=LPJ0j5MFwJ3zUAcOrvR6FbiWQuQ&hl=en&sa=X&ei=UfXAT76HJuabiALbkdn8Bw&ved=0CLQBEOgBMAQ#v=onepage&q=how%20to%20recover%20deleted%20files%20%2B%20hadoop&f=false

> Just for the sake of full disclosure, of course.
>
> Enjoy,
> Cos
Konstantin Boudnik 2012-05-26, 18:10
Makes it two, countless enough. It's not that I disagree that the gasket between the keyboard and the chair (aka the user) is a typical source of most of the troubles ;)

Cos
highpointe 2012-05-27, 04:52
Here is my SS: [REDACTED]
Arun C Murthy 2012-05-25, 15:43
Srivas,

On May 22, 2012, at 9:45 PM, M. C. Srivas wrote:
> 3. M5 has full HA for the Job-Tracker as well.

Curious. Can you please share some information about what this means?

Will tasks continue to run if JT bounces?

Will jobs start from scratch?

thanks,
Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
M. C. Srivas 2012-05-26, 15:53
On Fri, May 25, 2012 at 8:43 AM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
> Srivas,
>
> On May 22, 2012, at 9:45 PM, M. C. Srivas wrote:
> > 3. M5 has full HA for the Job-Tracker as well.
>
> Curious. Can you please share some information about what this means?

The JT will be restarted (perhaps on another node, if the node where it was running has died). On recovery, the JT will resume currently running jobs from where they were, i.e., they are not lost or abandoned, as is the case today with Apache Hadoop or CDH.

> Will tasks continue to run if JT bounces?

Yes.

> Will jobs start from scratch?

No. This works even across entire cluster reboots. As I said in my original posting:

"- If there is a complete cluster crash and reboot (eg, a full power-failure of the entire cluster), M5 will recover in 5-10 minutes, and submitted jobs will resume from where they were."

> thanks,
> Arun
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
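
The difference between "resume from where they were" and "start from scratch" can be sketched with a toy job runner that persists its progress. This is a generic checkpointing illustration under stated assumptions: `run_job`, the JSON state file, and the task names are hypothetical, and the sketch makes no claim about how the M5 or Apache JobTracker is implemented.

```python
import json
import os
import tempfile


def run_job(tasks, state_path, crash_after=None):
    """Run `tasks` in order, recording each finished task in `state_path`.

    If `crash_after` is set, stop once that many tasks have run in this
    invocation, simulating a tracker failure. Returns the tasks executed in
    this invocation and whether the job finished.
    """
    done = []
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = json.load(f)  # progress left behind by an earlier run

    executed = []
    for task in tasks:
        if task in done:
            continue  # already completed before the restart; skip it
        if crash_after is not None and len(executed) == crash_after:
            return executed, False  # simulated crash
        executed.append(task)
        done.append(task)
        with open(state_path, "w") as f:
            json.dump(done, f)  # persist progress after every task
    return executed, True


if __name__ == "__main__":
    tasks = ["t1", "t2", "t3", "t4"]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "job_state.json")
        print(run_job(tasks, path, crash_after=2))  # runs until the "crash"
        print(run_job(tasks, path))                 # restart picks up the rest
```

A runner that did not persist `done` would re-execute every task after a restart, which corresponds to the "start from scratch" case asked about above.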
highpointe 2012-05-27, 04:57
Here is my SS: [REDACTED]