-
Which release to use? Teruhiko Kurosaka 2011-07-14, 23:33
I'm a newbie and I am confused by the Hadoop releases.
I thought 0.21.0 is the latest & greatest release that I should be using but I noticed 0.20.203 has been released lately, and 0.21.X is marked "unstable, unsupported".

Should I be using 0.20.203?
----
T. "Kuro" Kurosaka
-
Re: Which release to use? Arun C Murthy 2011-07-14, 23:43
Hi,

0.20.203 is the latest stable release, which includes a ton of features (security: Kerberos-based authentication) and fixes. It's currently deployed on over 50k machines at Yahoo too.
So, yes, I'd encourage you to use 0.20.203. We, the community, are currently working on hadoop-0.23 and hope to get it out soon.

thanks,
Arun
-
RE: Which release to use? Isaac Dooley 2011-07-15, 13:13
Will 0.23 include Kerberos authentication? Will this finally unite the Yahoo and Apache branches?
-
Re: Which release to use? Jonathan Coveney 2011-07-15, 14:32
Isaac: there is no more Yahoo branch. They are committing all of their code to Apache.
-
Re: Which release to use? Owen O'Malley 2011-07-14, 23:45
On Jul 14, 2011, at 4:33 PM, Teruhiko Kurosaka wrote:
> Should I be using 0.20.203?

Yes. I apologize for the confusing release numbering, but the best release to use is 0.20.203.0. It includes security, job limits, and many other improvements over 0.20.2 and 0.21.0. Unfortunately, it doesn't have the new sync support, so it isn't suitable for use with HBase. Most large clusters use a separate version of HDFS for HBase.

-- Owen
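To illustrate the numbering confusion discussed in this thread, here is a minimal Python sketch. It assumes only what the posts above state: 0.20.203.0 is described as stable and 0.21.X is marked "unstable, unsupported". The `STABLE` table and the function names are hypothetical and are not part of any Hadoop tooling.

```python
def parse_version(text):
    """Turn a dotted version such as '0.20.203.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Stability labels as reported in this thread (July 2011).
# This is an illustrative table, not an authoritative release list.
STABLE = {
    "0.20.203.0": True,
    "0.21.0": False,
}


def highest_numbered(versions):
    """Pick the release with the largest version number, ignoring stability."""
    return max(versions, key=parse_version)


def recommended(versions, stable=STABLE):
    """Pick the largest version number among releases labeled stable."""
    candidates = [v for v in versions if stable.get(v, False)]
    return max(candidates, key=parse_version) if candidates else None


if __name__ == "__main__":
    versions = list(STABLE)
    print("highest numbered:", highest_numbered(versions))
    print("recommended:", recommended(versions))
```

Under these assumptions the highest-numbered release and the recommended release differ, which is the trap the original question describes: sorting by version number alone does not tell you which release is considered stable.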
-
Re: Which release to use? Adarsh Sharma 2011-07-15, 04:49
Hadoop releases are issued from time to time. But one more thing related to Hadoop usage: there are many providers that offer a distribution of Hadoop:

1. Apache Hadoop
2. Cloudera
3. Yahoo

etc.
Which distribution is best among them for production use? I think Cloudera's is the best among them.

Best Regards,
Adarsh
-
Re: Which release to use? Robert Evans 2011-07-15, 14:35
Adarsh,

Yahoo! no longer has its own distribution of Hadoop. It has been merged into the 0.20.2XX line, so 0.20.203 is what Yahoo is running internally right now, and we are moving towards 0.20.204, which should be out soon. I am not an expert on Cloudera, so I cannot really map its releases to the Apache releases, but their distro is based on Apache Hadoop with a few bug fixes and maybe a few features like append added on top of it. You need to talk to Cloudera about the exact details. For the most part they are all very similar. You need to think most about support; there are several companies that can sell you support if you want or need it. You also need to think about features vs. stability. The 0.20.203 release has been tested on a lot of machines by many different groups, but may be missing some features that are needed in some situations.

--Bobby
-
RE: Which release to use? Michael Segel 2011-07-15, 14:58
Unfortunately the picture is a bit more confusing.

Yahoo! is now HortonWorks. Their stated goal is to not have their own derivative release but to sell commercial support for the official Apache release.
So those selling commercial support are:
*Cloudera
*HortonWorks
*MapRTech
*EMC (reselling MapRTech, but had announced their own)
*IBM (not sure what they are selling exactly... still seems like smoke and mirrors...)
*DataStax

So while you can use the Apache release, it may not make sense for your organization to do so. (Said as I don the flame retardant suit...)

The issue is that outside of HortonWorks, which is stating that they will support the official Apache release, everything else is a derivative work of Apache's Hadoop. From what I have seen, Cloudera's release is the closest to the Apache release.

Like I said, things are getting interesting.

HTH

-Mike
-
Re: Which release to use? Owen O'Malley 2011-07-15, 16:07
On Jul 15, 2011, at 7:58 AM, Michael Segel wrote:
> So while you can use the Apache release, it may not make sense for your organization to do so. (Said as I don the flame retardant suit...)

I obviously disagree. *grin* Apache Hadoop 0.20.203.0 is the most stable and well-tested release. It has been deployed on Yahoo's 40,000 Hadoop machines, in clusters of up to 4,500 machines, and has been used extensively for running production workloads. We are actively working to make the install and deployment of Apache Hadoop easier.

In terms of commercial support, HortonWorks is absolutely supporting the Apache releases. IBM is also supporting the Apache releases:
http://davidmenninger.ventanaresearch.com/2011/05/18/ibm-chooses-hadoop-unity-not-shipping-the-elephant/

So lack of commercial support isn't a problem...

-- Owen
-
RE: Which release to use? Tom Deutsch 2011-07-15, 16:38
One quick clarification - IBM GA'd a product called BigInsights in 2Q. It faithfully uses the Hadoop stack and many related projects - but provides a number of extensions (that are compatible) based on customer requests. Not appropriate to say any more on this list, but the info on it is all publicly available.

------------------------------------------------
Tom Deutsch
Program Director
CTO Office: Information Management
Hadoop Product Manager / Customer Exec
IBM
3565 Harbor Blvd
Costa Mesa, CA 92626-1420
[EMAIL PROTECTED]
-
Re: Which release to use? Rita 2011-07-16, 15:53
I am curious about the IBM product BigInsights. Where can we download it? It seems we have to register to download it?

--
--- Get your facts first, then you can distort them as you please.--
-
Re: Which release to use? Steve Loughran 2011-07-17, 19:34
On 16/07/2011 16:53, Rita wrote:
> I am curious about the IBM product BigInsights. Where can we download it? It seems we have to register to download it?

I think you have to pay to use it.
-
Re: Which release to use? Tom Deutsch 2011-07-16, 17:29
Hi Rita - I want to make sure we are honoring the purpose/approach of this list. So you are welcome to ping me for information, but let's take this discussion off the list at this point.
-
RE: Which release to use? Michael Segel 2011-07-18, 19:09
Tom,

I'm not sure that you're really honoring the purpose and approach of this list.

I mean on the one hand, you're not under any obligation to respond or participate on the list. And I can respect that. You're not in an S&D role so you're not 'customer facing' and not used to having to deal with these types of questions.

On the other, you're not being free with your information. So when this type of question comes up, it becomes very easy to discount IBM as a release or source provider for commercial support.

Without information, I'm afraid that I may have to make recommendations to my clients that may be out of date.

There is even some speculation from analysts that recent comments from IBM are more of an indication that IBM is still not ready for prime time.

I'm sorry you're not in a position to detail your offering.

Maybe by September you might be ready and then talk to our CHUG?

-Mike
-
RE: Which release to use? Jeff.Schmitz@... 2011-07-18, 19:30
Most people are using CH3 - if you need some features from another distro use that -

http://www.cloudera.com/hadoop/

I wonder if the Cloudera people realize that CH3 was a pretty happening punk band back in the day (if not they do now = )

http://en.wikipedia.org/wiki/Channel_3_%28band%29

cheers -

Jeffery Schmitz
Projects and Technology
3737 Bellaire Blvd Houston, Texas 77001
Tel: +1-713-245-7326 Fax: +1 713 245 7678
Email: [EMAIL PROTECTED]
Intergalactic Proton Powered Electrical Tentacled Advertising Droids!
-
RE: Which release to use? Michael Segel 2011-07-18, 20:50
Well that's CDH3. :-)

And yes, that's because up until the past month... other releases didn't exist with commercial support.

Now there are more players as we look at the movement from leading edge to mainstream adopters.
-
Re: Which release to use? Rita 2011-07-19, 00:01
I made the big mistake of using the latest version, 0.21.0, and found a bunch of bugs, so I got pissed off at HDFS. Then, after reading this thread, it seems I should have used 0.20.x. I really wish we could fix this on the website by stating that 0.21.0 is unstable.
Rita 2011-07-19, 00:01
-
Re: Which release to use?Allen Wittenauer 2011-07-19, 00:12
On Jul 18, 2011, at 5:01 PM, Rita wrote:
> I made the big mistake by using the latest version, 0.21.0 and found bunch
> of bugs so I got pissed off at hdfs. Then, after reading this thread it
> seems I should of used 0.20.x .
>
> I really wish we can fix this on the website, stating 0.21.0 as unstable.

It is stated in a few places on the website that 0.21 isn't stable:

http://hadoop.apache.org/common/releases.html#23+August%2C+2010%3A+release+0.21.0+available

"It has not undergone testing at scale and should not be considered stable or suitable for production."

... and ...

http://hadoop.apache.org/common/releases.html#Download

"0.21.X - unstable, unsupported, does not include security"

and it isn't in the stable directory on the apache download mirrors.
Allen Wittenauer 2011-07-19, 00:12
-
Re: Which release to use?Rita 2011-07-19, 01:02
I am a dimwit.
On Mon, Jul 18, 2011 at 8:12 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
>
> On Jul 18, 2011, at 5:01 PM, Rita wrote:
>
> > I made the big mistake by using the latest version, 0.21.0 and found bunch
> > of bugs so I got pissed off at hdfs. Then, after reading this thread it
> > seems I should of used 0.20.x .
> >
> > I really wish we can fix this on the website, stating 0.21.0 as unstable.
>
> It is stated in a few places on the website that 0.21 isn't stable:
>
> http://hadoop.apache.org/common/releases.html#23+August%2C+2010%3A+release+0.21.0+available
>
> "It has not undergone testing at scale and should not be considered stable
> or suitable for production."
>
> ... and ...
>
> http://hadoop.apache.org/common/releases.html#Download
>
> "0.21.X - unstable, unsupported, does not include security"
>
> and it isn't in the stable directory on the apache download mirrors.

--
--- Get your facts first, then you can distort them as you please.--
Rita 2011-07-19, 01:02
-
Re: Which release to use?Allen Wittenauer 2011-07-19, 01:12
On Jul 18, 2011, at 6:02 PM, Rita wrote:
> I am a dimwit.

We are conditioned by marketing that a higher number is always better. Experience tells us that this is not necessarily true.
Allen Wittenauer 2011-07-19, 01:12
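The exchange above comes down to this: picking the highest version number is not the same as picking the latest stable release. Here is a minimal sketch of that distinction. The `Release` class and the helper names are hypothetical, and the stability flags are typed in by hand from what this thread says about 0.20.203.0 and 0.21.0 (the entry for 0.20.2 is an assumption added for illustration); nothing here queries the Apache site or mirrors.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Release:
    """Hypothetical record: a version string plus a hand-entered stability flag."""
    version: str
    stable: bool


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '0.20.203.0' into (0, 20, 203, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Flags follow the thread: 0.21.X is marked "unstable, unsupported",
# 0.20.203.0 is described as the latest stable release.
# 0.20.2 being stable is an assumption made for this illustration.
RELEASES: List[Release] = [
    Release("0.20.2", stable=True),
    Release("0.20.203.0", stable=True),
    Release("0.21.0", stable=False),
]


def highest_numbered(releases: List[Release]) -> Release:
    """What a newcomer tends to pick: the biggest version number."""
    return max(releases, key=lambda r: version_key(r.version))


def latest_stable(releases: List[Release]) -> Release:
    """What the thread recommends: the newest release flagged as stable."""
    stable = [r for r in releases if r.stable]
    return max(stable, key=lambda r: version_key(r.version))


if __name__ == "__main__":
    print("highest number:", highest_numbered(RELEASES).version)
    print("latest stable: ", latest_stable(RELEASES).version)
```

With the flags as entered, the first helper returns 0.21.0 and the second returns 0.20.203.0, which is the mismatch the original poster ran into.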
-
Re: Which release to use?Arun C Murthy 2011-07-15, 17:06
Apache Hadoop is a volunteer-driven, open-source project. The contributors to Apache Hadoop, both individuals and folks across a diverse set of organizations, are committed to driving the project forward and making timely releases - see the discussion on hadoop-0.23, which has a raft of newer features such as HDFS Federation and NextGen MapReduce, plus plans for an HA NameNode.
As with most successful projects there are several options for commercial support to Hadoop or its derivatives. However, Apache Hadoop has thrived before there was any commercial support (I've personally been involved in over 20 releases of Apache Hadoop and deployed them while at Yahoo) and I'm sure it will in this new world order. We, the Apache Hadoop community, are committed to keeping Apache Hadoop 'free', providing support to our users and to move it forward at a rapid rate. Arun On Jul 15, 2011, at 7:58 AM, Michael Segel wrote: > > Unfortunately the picture is a bit more confusing. > > Yahoo! is now HortonWorks. Their stated goal is to not have their own derivative release but to sell commercial support for the official Apache release. > So those selling commercial support are: > *Cloudera > *HortonWorks > *MapRTech > *EMC (reselling MapRTech, but had announced their own) > *IBM (not sure what they are selling exactly... still seems like smoke and mirrors...) > *DataStax > > So while you can use the Apache release, it may not make sense for your organization to do so. (Said as I don the flame retardant suit...) > > The issue is that outside of HortonWorks which is stating that they will support the official Apache release, everything else is a derivative work of Apache's Hadoop. From what I have seen, Cloudera's release is the closest to the Apache release. > > Like I said, things are getting interesting. > > HTH > > -Mike > > > >> From: [EMAIL PROTECTED] >> To: [EMAIL PROTECTED] >> Date: Fri, 15 Jul 2011 07:35:45 -0700 >> Subject: Re: Which release to use? >> >> Adarsh, >> >> Yahoo! no longer has its own distribution of Hadoop. It has been merged into the 0.20.2XX line so 0.20.203 is what Yahoo is running internally right now, and we are moving towards 0.20.204 which should be out soon. 
I am not an expert on Cloudera so I cannot really map its releases to the Apache Releases, but their distro is based off of Apache Hadoop with a few bug fixes and maybe a few features like append added in on top of it, but you need to talk to Cloudera about the exact details. For the most part they are all very similar. You need to think most about support, there are several companies that can sell you support if you want/need it. You also need to think about features vs. stability. The 0.20.203 release has been tested on a lot of machines by many different groups, but may be missing some features that are needed in some situations. >> >> --Bobby >> >> >> On 7/14/11 11:49 PM, "Adarsh Sharma" <[EMAIL PROTECTED]> wrote: >> >> Hadoop releases are issued time by time. But one more thing related to >> hadoop usage, >> >> There are so many providers that provides the distribution of Hadoop ; >> >> 1. Apache Hadoop >> 2. Cloudera >> 3. Yahoo >> >> etc. >> Which distribution is best among them on production usage. >> I think Cloudera's is best among them. >> >> >> Best Regards, >> Adarsh >> Owen O'Malley wrote: >>> On Jul 14, 2011, at 4:33 PM, Teruhiko Kurosaka wrote: >>> >>> >>>> I'm a newbie and I am confused by the Hadoop releases. >>>> I thought 0.21.0 is the latest & greatest release that I >>>> should be using but I noticed 0.20.203 has been released >>>> lately, and 0.21.X is marked "unstable, unsupported". >>>> >>>> Should I be using 0.20.203? >>>> >>> >>> Yes, I apologize for confusing release numbering, but the best release to use is 0.20.203.0. It includes security, job limits, and many other improvements over 0.20.2 and 0.21.0. Unfortunately, it doesn't have the new sync support so it isn't suitable for using with HBase. Most large clusters use a separate version of HDFS for HBase. +
Arun C Murthy 2011-07-15, 17:06
-
Re: Which release to use?Steve Loughran 2011-07-15, 21:06
On 15/07/2011 18:06, Arun C Murthy wrote:
> Apache Hadoop is a volunteer driven, open-source project. The contributors to Apache Hadoop, both individuals and folks across a diverse set of organizations, are committed to driving the project forward and making timely releases - see discussion on hadoop-0.23 with a raft newer features such as HDFS Federation, NextGen MapReduce and plans for HA NameNode etc.
>
> As with most successful projects there are several options for commercial support to Hadoop or its derivatives.
>
> However, Apache Hadoop has thrived before there was any commercial support (I've personally been involved in over 20 releases of Apache Hadoop and deployed them while at Yahoo) and I'm sure it will in this new world order.
>
> We, the Apache Hadoop community, are committed to keeping Apache Hadoop 'free', providing support to our users and to move it forward at a rapid rate.
>

Arun makes a good point, which is that the Apache project depends on contributions from the community to thrive. That includes

-bug reports
-patches to fix problems
-more tests
-documentation improvements: more examples, more on getting started, troubleshooting, etc.

If there's something lacking in the codebase, and you think you can fix it, please do so. Helping with the documentation is a good start, as it can be improved, and you aren't going to break anything.

Once you get into changing the code, you'll end up working with the head of whichever branch you are targeting.

The other area everyone can contribute to is testing. Yes, Y! and FB can test at scale, and yes, other people can test large clusters too, but nobody has a network that looks like yours but you. And Hadoop does care about network configurations. Testing beta and release candidate releases in your infrastructure helps verify that the final release will work on your site, so you don't end up getting all the phone calls about something not working.
Steve Loughran 2011-07-15, 21:06
-
RE: Which release to use?Jeff.Schmitz@... 2011-07-18, 13:30
Steve,
I read your blog nice post - I believe EMC is selling the Greenplumb solution as an appliance - Cheers - Jeffery -----Original Message----- From: Steve Loughran [mailto:[EMAIL PROTECTED]] Sent: Friday, July 15, 2011 4:07 PM To: [EMAIL PROTECTED] Subject: Re: Which release to use? On 15/07/2011 18:06, Arun C Murthy wrote: > Apache Hadoop is a volunteer driven, open-source project. The contributors to Apache Hadoop, both individuals and folks across a diverse set of organizations, are committed to driving the project forward and making timely releases - see discussion on hadoop-0.23 with a raft newer features such as HDFS Federation, NextGen MapReduce and plans for HA NameNode etc. > > As with most successful projects there are several options for commercial support to Hadoop or its derivatives. > > However, Apache Hadoop has thrived before there was any commercial support (I've personally been involved in over 20 releases of Apache Hadoop and deployed them while at Yahoo) and I'm sure it will in this new world order. > > We, the Apache Hadoop community, are committed to keeping Apache Hadoop 'free', providing support to our users and to move it forward at a rapid rate. > Arun makes a good point which is that the Apache project depends on contributions from the community to thrive. That includes -bug reports -patches to fix problems -more tests -documentation improvements: more examples, more on getting started, troubleshooting, etc. If there's something lacking in the codebase, and you think you can fix it, please do so. Helping with the documentation is a good start, as it can be improved, and you aren't going to break anything. Once you get into changing the code, you'll end up working with the head of whichever branch you are targeting. The other area everyone can contribute on is testing. Yes, Y! and FB can test at scale, yes, other people can test large clusters too -but nobody has a network that looks like yours but you. 
And Hadoop does care about network configurations. Testing beta and release candidate releases in your infrastructure, helps verify that the final release will work on your site, and you don't end up getting all the phone calls about something not working +
Jeff.Schmitz@... 2011-07-18, 13:30
-
RE: Which release to use?Michael Segel 2011-07-18, 18:33
EMC has inked a deal with MapRTech to resell their release and support services for MapRTech. Does this mean that they are going to stop selling their own release on Greenplum? Maybe not in the near future, however, a Greenplum appliance may not get the customer transaction that their reselling of MapR will generate. It sounds like they are hedging their bets and are taking an 'IBM' approach. > Subject: RE: Which release to use? > Date: Mon, 18 Jul 2011 08:30:59 -0500 > From: [EMAIL PROTECTED] > To: [EMAIL PROTECTED] > > Steve, > > I read your blog nice post - I believe EMC is selling the Greenplumb > solution as an appliance - > > Cheers - > > Jeffery > > -----Original Message----- > From: Steve Loughran [mailto:[EMAIL PROTECTED]] > Sent: Friday, July 15, 2011 4:07 PM > To: [EMAIL PROTECTED] > Subject: Re: Which release to use? > > On 15/07/2011 18:06, Arun C Murthy wrote: > > Apache Hadoop is a volunteer driven, open-source project. The > contributors to Apache Hadoop, both individuals and folks across a > diverse set of organizations, are committed to driving the project > forward and making timely releases - see discussion on hadoop-0.23 with > a raft newer features such as HDFS Federation, NextGen MapReduce and > plans for HA NameNode etc. > > > > As with most successful projects there are several options for > commercial support to Hadoop or its derivatives. > > > > However, Apache Hadoop has thrived before there was any commercial > support (I've personally been involved in over 20 releases of Apache > Hadoop and deployed them while at Yahoo) and I'm sure it will in this > new world order. > > > > We, the Apache Hadoop community, are committed to keeping Apache > Hadoop 'free', providing support to our users and to move it forward at > a rapid rate. > > > > Arun makes a good point which is that the Apache project depends on > contributions from the community to thrive. 
That includes > > -bug reports > -patches to fix problems > -more tests > -documentation improvements: more examples, more on getting started, > troubleshooting, etc. > > If there's something lacking in the codebase, and you think you can fix > it, please do so. Helping with the documentation is a good start, as it > can be improved, and you aren't going to break anything. > > Once you get into changing the code, you'll end up working with the head > > of whichever branch you are targeting. > > The other area everyone can contribute on is testing. Yes, Y! and FB can > > test at scale, yes, other people can test large clusters too -but nobody > > has a network that looks like yours but you. And Hadoop does care about > network configurations. Testing beta and release candidate releases in > your infrastructure, helps verify that the final release will work on > your site, and you don't end up getting all the phone calls about > something not working > > +
Michael Segel 2011-07-18, 18:33
-
Re: Which release to use?M. C. Srivas 2011-07-19, 01:19
Mike,
Just a minor inaccuracy in your email. Here's setting the record straight:

1. MapR directly sells their distribution of Hadoop. Support is from MapR.
2. EMC also sells the MapR distribution, for use on any hardware. Support is from EMC worldwide.
3. EMC also sells a Hadoop appliance, which has the MapR distribution specially built for it. Support is from EMC.
4. MapR also has a free, unlimited, unrestricted version called M3, which has the same 2-5x performance, management and stability improvements, and includes NFS. It is not crippleware, and the unlimited, unrestricted, free use does not expire on any date.

Hope that clarifies what MapR is doing.

thanks & regards,
Srivas.

On Mon, Jul 18, 2011 at 11:33 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
> EMC has inked a deal with MapRTech to resell their release and support
> services for MapRTech.
> Does this mean that they are going to stop selling their own release on
> Greenplum? Maybe not in the near future, however,
> a Greenplum appliance may not get the customer transaction that their
> reselling of MapR will generate.
>
> It sounds like they are hedging their bets and are taking an 'IBM'
> approach.
>
> > Subject: RE: Which release to use?
> > Date: Mon, 18 Jul 2011 08:30:59 -0500
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> >
> > Steve,
> >
> > I read your blog nice post - I believe EMC is selling the Greenplumb
> > solution as an appliance -
> >
> > Cheers -
> >
> > Jeffery
> >
> > -----Original Message-----
> > From: Steve Loughran [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, July 15, 2011 4:07 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Which release to use?
> >
> > On 15/07/2011 18:06, Arun C Murthy wrote:
> > > Apache Hadoop is a volunteer driven, open-source project.
The > > contributors to Apache Hadoop, both individuals and folks across a > > diverse set of organizations, are committed to driving the project > > forward and making timely releases - see discussion on hadoop-0.23 with > > a raft newer features such as HDFS Federation, NextGen MapReduce and > > plans for HA NameNode etc. > > > > > > As with most successful projects there are several options for > > commercial support to Hadoop or its derivatives. > > > > > > However, Apache Hadoop has thrived before there was any commercial > > support (I've personally been involved in over 20 releases of Apache > > Hadoop and deployed them while at Yahoo) and I'm sure it will in this > > new world order. > > > > > > We, the Apache Hadoop community, are committed to keeping Apache > > Hadoop 'free', providing support to our users and to move it forward at > > a rapid rate. > > > > > > > Arun makes a good point which is that the Apache project depends on > > contributions from the community to thrive. That includes > > > > -bug reports > > -patches to fix problems > > -more tests > > -documentation improvements: more examples, more on getting started, > > troubleshooting, etc. > > > > If there's something lacking in the codebase, and you think you can fix > > it, please do so. Helping with the documentation is a good start, as it > > can be improved, and you aren't going to break anything. > > > > Once you get into changing the code, you'll end up working with the head > > > > of whichever branch you are targeting. > > > > The other area everyone can contribute on is testing. Yes, Y! and FB can > > > > test at scale, yes, other people can test large clusters too -but nobody > > > > has a network that looks like yours but you. And Hadoop does care about > > network configurations. 
Testing beta and release candidate releases in > > your infrastructure, helps verify that the final release will work on > > your site, and you don't end up getting all the phone calls about > > something not working > > > > > +
M. C. Srivas 2011-07-19, 01:19
-
RE: Which release to use?Michael Segel 2011-07-19, 01:51
> Date: Mon, 18 Jul 2011 18:19:38 -0700 > Subject: Re: Which release to use? > From: [EMAIL PROTECTED] > To: [EMAIL PROTECTED] > > Mike, > > Just a minor inaccuracy in your email. Here's setting the record straight: > > 1. MapR directly sells their distribution of Hadoop. Support is from MapR. > 2. EMC also sells the MapR distribution, for use on any hardware. Support is > from EMC worldwide. > 3. EMC also sells a Hadoop appliance, which has the MapR distribution > specially built for it. Support is from EMC. > > 4. MapR also has a free, unlimited, unrestricted version called M3, which > has the same 2-5x performance, management and stability improvements, and > includes NFS. It is not crippleware, and the unlimited, unrestricted, free > use does not expire on any date. > > Hope that clarifies what MapR is doing. > > thanks & regards, > Srivas. > Srivas, I'm sorry, I thought I was being clear in that I was only addressing EMC and not MapR directly. I was responding to post about EMC selling a Greenplum appliance. I wanted to point out that EMC will resell MapR's release along with their own (EMC) support. The point I was trying to make was that with respect to derivatives of Hadoop, I believe that MapR has a more compelling story than either EMC or DataStax. IMHO replacing Java HDFS w either GreenPlum or Cassandra has a limited market. When a company is going to look at a M/R solution cost and performance are going to be at the top of the list. MapR isn't cheap but if you look at the features in M5, if they work, then you have a very compelling reason to look at their release. Some of the people I spoke to when I was in Santa Clara were in the beta program. They indicated that MapR did what they claimed. Things are definitely starting to look interesting. -Mike > On Mon, Jul 18, 2011 at 11:33 AM, Michael Segel > <[EMAIL PROTECTED]>wrote: > > > > > EMC has inked a deal with MapRTech to resell their release and support > > services for MapRTech. 
> > Does this mean that they are going to stop selling their own release on > > Greenplum? Maybe not in the near future, however, > > a Greenplum appliance may not get the customer transaction that their > > reselling of MapR will generate. > > > > It sounds like they are hedging their bets and are taking an 'IBM' > > approach. > > > > > > > Subject: RE: Which release to use? > > > Date: Mon, 18 Jul 2011 08:30:59 -0500 > > > From: [EMAIL PROTECTED] > > > To: [EMAIL PROTECTED] > > > > > > Steve, > > > > > > I read your blog nice post - I believe EMC is selling the Greenplumb > > > solution as an appliance - > > > > > > Cheers - > > > > > > Jeffery > > > > > > -----Original Message----- > > > From: Steve Loughran [mailto:[EMAIL PROTECTED]] > > > Sent: Friday, July 15, 2011 4:07 PM > > > To: [EMAIL PROTECTED] > > > Subject: Re: Which release to use? > > > > > > On 15/07/2011 18:06, Arun C Murthy wrote: > > > > Apache Hadoop is a volunteer driven, open-source project. The > > > contributors to Apache Hadoop, both individuals and folks across a > > > diverse set of organizations, are committed to driving the project > > > forward and making timely releases - see discussion on hadoop-0.23 with > > > a raft newer features such as HDFS Federation, NextGen MapReduce and > > > plans for HA NameNode etc. > > > > > > > > As with most successful projects there are several options for > > > commercial support to Hadoop or its derivatives. > > > > > > > > However, Apache Hadoop has thrived before there was any commercial > > > support (I've personally been involved in over 20 releases of Apache > > > Hadoop and deployed them while at Yahoo) and I'm sure it will in this > > > new world order. > > > > > > > > We, the Apache Hadoop community, are committed to keeping Apache > > > Hadoop 'free', providing support to our users and to move it forward at > > > a rapid rate. > > > > > > > > > > Arun makes a good point which is that the Apache project depends on +
Michael Segel 2011-07-19, 01:51
-
Re: Which release to use?Joe Stein 2011-07-19, 02:00
So, last I checked this list was about Apache Hadoop not about derivative works.
The Cloudera team has always been diligent (you rock) about redirecting non apache CDH releases to their list for answers. I commend those supporting apache releases of Hadoop too, very cool!!! But yeah, even I have to ask what the latest release will be. Is there going to be a single Hadoop release or a continued branch that Horton maintains and will only support? There is something to be said for release from trunk that gets everyone on the same page towards our common goals. You can pin the "state the obvious" paper on my back but kinda feel it had to be said. One love, Apache Hadoop! /* Joe Stein http://www.medialets.com Twitter: @allthingshadoop */ On Jul 18, 2011, at 9:51 PM, Michael Segel <[EMAIL PROTECTED]> wrote: > > > >> Date: Mon, 18 Jul 2011 18:19:38 -0700 >> Subject: Re: Which release to use? >> From: [EMAIL PROTECTED] >> To: [EMAIL PROTECTED] >> >> Mike, >> >> Just a minor inaccuracy in your email. Here's setting the record straight: >> >> 1. MapR directly sells their distribution of Hadoop. Support is from MapR. >> 2. EMC also sells the MapR distribution, for use on any hardware. Support is >> from EMC worldwide. >> 3. EMC also sells a Hadoop appliance, which has the MapR distribution >> specially built for it. Support is from EMC. >> >> 4. MapR also has a free, unlimited, unrestricted version called M3, which >> has the same 2-5x performance, management and stability improvements, and >> includes NFS. It is not crippleware, and the unlimited, unrestricted, free >> use does not expire on any date. >> >> Hope that clarifies what MapR is doing. >> >> thanks & regards, >> Srivas. >> > Srivas, > > I'm sorry, I thought I was being clear in that I was only addressing EMC and not MapR directly. > I was responding to post about EMC selling a Greenplum appliance. I wanted to point out that EMC will resell MapR's release along with their own (EMC) support. 
> > The point I was trying to make was that with respect to derivatives of Hadoop, I believe that MapR has a more compelling story than either EMC or DataStax. IMHO replacing Java HDFS w either GreenPlum or Cassandra has a limited market. When a company is going to look at a M/R solution cost and performance are going to be at the top of the list. MapR isn't cheap but if you look at the features in M5, if they work, then you have a very compelling reason to look at their release. Some of the people I spoke to when I was in Santa Clara were in the beta program. They indicated that MapR did what they claimed. > > Things are definitely starting to look interesting. > > -Mike > >> On Mon, Jul 18, 2011 at 11:33 AM, Michael Segel >> <[EMAIL PROTECTED]>wrote: >> >>> >>> EMC has inked a deal with MapRTech to resell their release and support >>> services for MapRTech. >>> Does this mean that they are going to stop selling their own release on >>> Greenplum? Maybe not in the near future, however, >>> a Greenplum appliance may not get the customer transaction that their >>> reselling of MapR will generate. >>> >>> It sounds like they are hedging their bets and are taking an 'IBM' >>> approach. >>> >>> >>>> Subject: RE: Which release to use? >>>> Date: Mon, 18 Jul 2011 08:30:59 -0500 >>>> From: [EMAIL PROTECTED] >>>> To: [EMAIL PROTECTED] >>>> >>>> Steve, >>>> >>>> I read your blog nice post - I believe EMC is selling the Greenplumb >>>> solution as an appliance - >>>> >>>> Cheers - >>>> >>>> Jeffery >>>> >>>> -----Original Message----- >>>> From: Steve Loughran [mailto:[EMAIL PROTECTED]] >>>> Sent: Friday, July 15, 2011 4:07 PM >>>> To: [EMAIL PROTECTED] >>>> Subject: Re: Which release to use? >>>> >>>> On 15/07/2011 18:06, Arun C Murthy wrote: >>>>> Apache Hadoop is a volunteer driven, open-source project. The >>>> contributors to Apache Hadoop, both individuals and folks across a >>>> diverse set of organizations, are committed to driving the project +
Joe Stein 2011-07-19, 02:00
-
Re: Which release to use?Arun Murthy 2011-07-19, 03:06
Joe,
The dev community is currently gearing up for hadoop-0.23 off trunk.

0.23 is a massive step forward with HDFS Federation, NextGen MapReduce and possibly others such as wire-compat and HA NameNode.

In a couple of weeks I plan to create the 0.23 branch off trunk, and we will then spend all our energies stabilizing & pushing the release out. Please see my note to general@ for more details.

Arun

On Jul 18, 2011, at 7:01 PM, Joe Stein <[EMAIL PROTECTED]> wrote:

> So, last I checked this list was about Apache Hadoop not about derivative works.
>
> The Cloudera team has always been diligent (you rock) about redirecting non apache CDH releases to their list for answers.
>
> I commend those supporting apache releases of Hadoop too, very cool!!!
>
> But yeah, even I have to ask what the latest release will be. Is there going to be a single Hadoop release or a continued branch that Horton maintains and will only support?
>
> There is something to be said for release from trunk that gets everyone on the same page towards our common goals. You can pin the "state the obvious" paper on my back but kinda feel it had to be said.
>
> One love, Apache Hadoop!
>
> /*
> Joe Stein
> http://www.medialets.com
> Twitter: @allthingshadoop
> */
>
> On Jul 18, 2011, at 9:51 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
>>> Date: Mon, 18 Jul 2011 18:19:38 -0700
>>> Subject: Re: Which release to use?
>>> From: [EMAIL PROTECTED]
>>> To: [EMAIL PROTECTED]
>>>
>>> Mike,
>>>
>>> Just a minor inaccuracy in your email. Here's setting the record straight:
>>>
>>> 1. MapR directly sells their distribution of Hadoop. Support is from MapR.
>>> 2. EMC also sells the MapR distribution, for use on any hardware. Support is
>>> from EMC worldwide.
>>> 3. EMC also sells a Hadoop appliance, which has the MapR distribution
>>> specially built for it. Support is from EMC.
>>>
>>> 4.
MapR also has a free, unlimited, unrestricted version called M3, which >>> has the same 2-5x performance, management and stability improvements, and >>> includes NFS. It is not crippleware, and the unlimited, unrestricted, free >>> use does not expire on any date. >>> >>> Hope that clarifies what MapR is doing. >>> >>> thanks & regards, >>> Srivas. >>> >> Srivas, >> >> I'm sorry, I thought I was being clear in that I was only addressing EMC and not MapR directly. >> I was responding to post about EMC selling a Greenplum appliance. I wanted to point out that EMC will resell MapR's release along with their own (EMC) support. >> >> The point I was trying to make was that with respect to derivatives of Hadoop, I believe that MapR has a more compelling story than either EMC or DataStax. IMHO replacing Java HDFS w either GreenPlum or Cassandra has a limited market. When a company is going to look at a M/R solution cost and performance are going to be at the top of the list. MapR isn't cheap but if you look at the features in M5, if they work, then you have a very compelling reason to look at their release. Some of the people I spoke to when I was in Santa Clara were in the beta program. They indicated that MapR did what they claimed. >> >> Things are definitely starting to look interesting. >> >> -Mike >> >>> On Mon, Jul 18, 2011 at 11:33 AM, Michael Segel >>> <[EMAIL PROTECTED]>wrote: >>> >>>> >>>> EMC has inked a deal with MapRTech to resell their release and support >>>> services for MapRTech. >>>> Does this mean that they are going to stop selling their own release on >>>> Greenplum? Maybe not in the near future, however, >>>> a Greenplum appliance may not get the customer transaction that their >>>> reselling of MapR will generate. >>>> >>>> It sounds like they are hedging their bets and are taking an 'IBM' >>>> approach. >>>> >>>> >>>>> Subject: RE: Which release to use? 
>>>>> Date: Mon, 18 Jul 2011 08:30:59 -0500 >>>>> From: [EMAIL PROTECTED] >>>>> To: [EMAIL PROTECTED] >>>>> >>>>> Steve, >>>>> >>>>> I read your blog nice post - I believe EMC is selling the Greenplumb +
Arun Murthy 2011-07-19, 03:06
-
Re: Which release to use?Joe Stein 2011-07-19, 03:19
Arun,
Thanks for the update.

Again, I hate to have to play the part of captain obvious.

Glad to hear the same contiguous mantra for this next release. I think sometimes the plebeians (of which I am one) need that affirmation.

One love, Apache Hadoop!

/*
Joe Stein
http://www.medialets.com
Twitter: @allthingshadoop
*/

On Jul 18, 2011, at 11:06 PM, Arun Murthy <[EMAIL PROTECTED]> wrote:

> Joe,
>
> The dev community is currently gearing up for hadoop-0.23 off trunk.
>
> 0.23 is a massive step forward with HDFS Federation, NextGen
> MapReduce and possible others such as wire-compat and HA NameNode.
>
> In a couple of weeks I plan to create the 0.23 branch off trunk and we
> then spend all our energies stabilizing & pushing the release out.
> Please see my note to general@ for more details.
>
> Arun
>
> On Jul 18, 2011, at 7:01 PM, Joe Stein <[EMAIL PROTECTED]> wrote:
>
>> So, last I checked this list was about Apache Hadoop not about derivative works.
>>
>> The Cloudera team has always been diligent (you rock) about redirecting non apache CDH releases to their list for answers.
>>
>> I commend those supporting apache releases of Hadoop too, very cool!!!
>>
>> But yeah, even I have to ask what the latest release will be. Is there going to be a single Hadoop release or a continued branch that Horton maintains and will only support?
>>
>> There is something to be said for release from trunk that gets everyone on the same page towards our common goals. You can pin the "state the obvious" paper on my back but kinda feel it had to be said.
>>
>> One love, Apache Hadoop!
>>
>> /*
>> Joe Stein
>> http://www.medialets.com
>> Twitter: @allthingshadoop
>> */
>>
>> On Jul 18, 2011, at 9:51 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>
>>>
>>>> Date: Mon, 18 Jul 2011 18:19:38 -0700
>>>> Subject: Re: Which release to use?
>>>> From: [EMAIL PROTECTED]
>>>> To: [EMAIL PROTECTED]
>>>>
>>>> Mike,
>>>>
>>>> Just a minor inaccuracy in your email. Here's setting the record straight:
>>>>
>>>> 1. MapR directly sells their distribution of Hadoop. Support is from MapR.
>>>> 2. EMC also sells the MapR distribution, for use on any hardware. Support is
>>>> from EMC worldwide.
>>>> 3. EMC also sells a Hadoop appliance, which has the MapR distribution
>>>> specially built for it. Support is from EMC.
>>>>
>>>> 4. MapR also has a free, unlimited, unrestricted version called M3, which
>>>> has the same 2-5x performance, management and stability improvements, and
>>>> includes NFS. It is not crippleware, and the unlimited, unrestricted, free
>>>> use does not expire on any date.
>>>>
>>>> Hope that clarifies what MapR is doing.
>>>>
>>>> thanks & regards,
>>>> Srivas.
>>>>
>>> Srivas,
>>>
>>> I'm sorry, I thought I was being clear in that I was only addressing EMC and not MapR directly.
>>> I was responding to post about EMC selling a Greenplum appliance. I wanted to point out that EMC will resell MapR's release along with their own (EMC) support.
>>>
>>> The point I was trying to make was that with respect to derivatives of Hadoop, I believe that MapR has a more compelling story than either EMC or DataStax. IMHO replacing Java HDFS w either GreenPlum or Cassandra has a limited market. When a company is going to look at a M/R solution cost and performance are going to be at the top of the list. MapR isn't cheap but if you look at the features in M5, if they work, then you have a very compelling reason to look at their release. Some of the people I spoke to when I was in Santa Clara were in the beta program. They indicated that MapR did what they claimed.
>>>
>>> Things are definitely starting to look interesting.
>>>
>>> -Mike
>>>
>>>> On Mon, Jul 18, 2011 at 11:33 AM, Michael Segel
>>>> <[EMAIL PROTECTED]> wrote:
>>>>
>>>>>
>>>>> EMC has inked a deal with MapRTech to resell their release and support
>>>>> services for MapRTech.
>>>>> Does this mean that they are going to stop selling their own release on
-
Re: Which release to use? Rita 2011-07-19, 11:44
Arun,
I second Joe's comment.
Thanks for giving us a heads up.
I will wait patiently until 0.23 is considered stable.

On Mon, Jul 18, 2011 at 11:19 PM, Joe Stein <[EMAIL PROTECTED]> wrote:

> Arun,
>
> Thanks for the update.
>
> Again, I hate to have to play the part of captain obvious.
>
> Glad to hear the same contiguous mantra for this next release. I think
> sometimes the plebeians (of which I am one) need that affirmation.
>
> One love, Apache Hadoop!
>
> /*
> Joe Stein
> http://www.medialets.com
> Twitter: @allthingshadoop
> */
>
> On Jul 18, 2011, at 11:06 PM, Arun Murthy <[EMAIL PROTECTED]> wrote:
>
> > Joe,
> >
> > The dev community is currently gearing up for hadoop-0.23 off trunk.
> >
> > 0.23 is a massive step forward with HDFS Federation, NextGen
> > MapReduce and possible others such as wire-compat and HA NameNode.
> >
> > In a couple of weeks I plan to create the 0.23 branch off trunk and we
> > then spend all our energies stabilizing & pushing the release out.
> > Please see my note to general@ for more details.
> >
> > Arun
> >
> > On Jul 18, 2011, at 7:01 PM, Joe Stein <[EMAIL PROTECTED]> wrote:
> >
> >> So, last I checked this list was about Apache Hadoop not about derivative works.
> >>
> >> The Cloudera team has always been diligent (you rock) about redirecting non apache CDH releases to their list for answers.
> >>
> >> I commend those supporting apache releases of Hadoop too, very cool!!!
> >>
> >> But yeah, even I have to ask what the latest release will be. Is there going to be a single Hadoop release or a continued branch that Horton maintains and will only support?
> >>
> >> There is something to be said for release from trunk that gets everyone on the same page towards our common goals. You can pin the "state the obvious" paper on my back but kinda feel it had to be said.
> >>
> >> One love, Apache Hadoop!
> >>
> >> /*
> >> Joe Stein
> >> http://www.medialets.com
> >> Twitter: @allthingshadoop
> >> */
> >>
> >> On Jul 18, 2011, at 9:51 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >>
> >>>
> >>>> Date: Mon, 18 Jul 2011 18:19:38 -0700
> >>>> Subject: Re: Which release to use?
> >>>> From: [EMAIL PROTECTED]
> >>>> To: [EMAIL PROTECTED]
> >>>>
> >>>> Mike,
> >>>>
> >>>> Just a minor inaccuracy in your email. Here's setting the record straight:
> >>>>
> >>>> 1. MapR directly sells their distribution of Hadoop. Support is from MapR.
> >>>> 2. EMC also sells the MapR distribution, for use on any hardware. Support is
> >>>> from EMC worldwide.
> >>>> 3. EMC also sells a Hadoop appliance, which has the MapR distribution
> >>>> specially built for it. Support is from EMC.
> >>>>
> >>>> 4. MapR also has a free, unlimited, unrestricted version called M3, which
> >>>> has the same 2-5x performance, management and stability improvements, and
> >>>> includes NFS. It is not crippleware, and the unlimited, unrestricted, free
> >>>> use does not expire on any date.
> >>>>
> >>>> Hope that clarifies what MapR is doing.
> >>>>
> >>>> thanks & regards,
> >>>> Srivas.
> >>>>
> >>> Srivas,
> >>>
> >>> I'm sorry, I thought I was being clear in that I was only addressing EMC and not MapR directly.
> >>> I was responding to post about EMC selling a Greenplum appliance. I wanted to point out that EMC will resell MapR's release along with their own (EMC) support.
> >>>
> >>> The point I was trying to make was that with respect to derivatives of Hadoop, I believe that MapR has a more compelling story than either EMC or DataStax. IMHO replacing Java HDFS w either GreenPlum or Cassandra has a limited market. When a company is going to look at a M/R solution cost and performance are going to be at the top of the list. MapR isn't cheap but if you look at the features in M5, if they work, then you have a very compelling reason to look at their release. Some of the people I spoke to when I was in Santa Clara were in the beta program. They indicated that MapR
-
Re: Which release to use? Steve Loughran 2011-07-19, 11:50
On 19/07/11 12:44, Rita wrote:
> Arun,
>
> I second Joe's comment.
> Thanks for giving us a heads up.
> I will wait patiently until 0.23 is considered stable.
>

API-wise, 0.21 is better. I know that as I'm working with 0.20.203 right now, and it is a step backwards.

Regarding future releases, the best way to get it stable is to participate in release testing in your own infrastructure. Nothing else will find the problems unique to your setup of hardware, network and software.
-
Re: Which release to use? Vitalii Tymchyshyn 2011-07-19, 12:10
19.07.11 14:50, Steve Loughran wrote:
> On 19/07/11 12:44, Rita wrote:
>> Arun,
>>
>> I second Joe's comment.
>> Thanks for giving us a heads up.
>> I will wait patiently until 0.23 is considered stable.
>>
>
> API-wise, 0.21 is better. I know that as I'm working with 0.20.203
> right now, and it is a step backwards.
>
> Regarding future releases, the best way to get it stable is to
> participate in release testing in your own infrastructure. Nothing
> else will find the problems unique to your setup of hardware, network
> and software
>

My little Hadoop adoption story (or why I won't test 0.23)

I am among those who think that the latest release is the one that is supported, so we went the 0.21 way. BTW: I tried to find a release roadmap, but could not find anything up to date. We are using HDFS without Map/Reduce. As far as I can see now, 0.21 is nowhere near beta quality, with non-working new features like backup node or append. Also, there is no option for such unlucky people to back off to 0.20 (at least a search for "hadoop downgrade" does not give any good results).

I have already filed 5 tickets in Jira, 3 of them with patches. On two there is no activity at all; on the other three, an answer is the latest non-autogenerated message (and it is over 3 weeks old). I sent a few messages to this list and one to hdfs-user. No answers.

With this level of project activity, I can't afford to test something that has not reached 0.21's quality level yet. If I have any problems, I can't afford to wait for months to be heard. I am more or less stable on my own patched 0.21 for now, and will either move forward if I see more project activity or move somewhere else if it becomes "less stable".

Best regards,
Vitalii Tymchyshyn
-
Re: Which release to use? Steve Loughran 2011-07-15, 21:00
On 15/07/2011 15:58, Michael Segel wrote:
>
> Unfortunately the picture is a bit more confusing.
>
> Yahoo! is now HortonWorks. Their stated goal is to not have their own derivative release but to sell commercial support for the official Apache release.
> So those selling commercial support are:
> *Cloudera
> *HortonWorks
> *MapRTech
> *EMC (reselling MapRTech, but had announced their own)
> *IBM (not sure what they are selling exactly... still seems like smoke and mirrors...)
> *DataStax

+ Amazon, indirectly, that do their own derivative work of some release of Hadoop (which version is it based on?)

I've used 0.21, which was the first with the new APIs and, with MRUnit, has the best test framework. For my small-cluster uses, it worked well. (oh, and I didn't care about security)
-
Re: Which release to use? Mark Kerzner 2011-07-15, 21:25
Steve,
this is so well said, do you mind if I repeat it here,
http://shmsoft.blogspot.com/2011/07/hadoop-commercial-support-options.html

Thank you,
Mark

On Fri, Jul 15, 2011 at 4:00 PM, Steve Loughran <[EMAIL PROTECTED]> wrote:

> On 15/07/2011 15:58, Michael Segel wrote:
>
>>
>> Unfortunately the picture is a bit more confusing.
>>
>> Yahoo! is now HortonWorks. Their stated goal is to not have their own
>> derivative release but to sell commercial support for the official Apache
>> release.
>> So those selling commercial support are:
>> *Cloudera
>> *HortonWorks
>> *MapRTech
>> *EMC (reselling MapRTech, but had announced their own)
>> *IBM (not sure what they are selling exactly... still seems like smoke and
>> mirrors...)
>> *DataStax
>>
>
> + Amazon, indirectly, that do their own derivative work of some release of
> Hadoop (which version is it based on?)
>
> I've used 0.21, which was the first with the new APIs and, with MRUnit, has
> the best test framework. For my small-cluster uses, it worked well. (oh, and
> I didn't care about security)
>
-
RE: Which release to use? Michael Segel 2011-07-15, 21:58
See, I knew there was something that I forgot.

It all goes back to the question ... 'which release to use'...

2 years ago it was a very simple decision. Now, not so much. :-)

And while Arun and Owen work for a vendor, I do not and I try to follow each company and their offering.

As Hadoop goes mainstream, the question of which vendor to choose gets interesting.
Just like in the 90's during the database vendor wars, it looks like the vendor who has the best sales force and PR will win. (Not necessarily the best product.)

JMHO

-Mike

> Date: Fri, 15 Jul 2011 16:25:55 -0500
> Subject: Re: Which release to use?
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> Steve,
>
> this is so well said, do you mind if I repeat it here,
> http://shmsoft.blogspot.com/2011/07/hadoop-commercial-support-options.html
>
> Thank you,
> Mark
>
> On Fri, Jul 15, 2011 at 4:00 PM, Steve Loughran <[EMAIL PROTECTED]> wrote:
>
> > On 15/07/2011 15:58, Michael Segel wrote:
> >
> >>
> >> Unfortunately the picture is a bit more confusing.
> >>
> >> Yahoo! is now HortonWorks. Their stated goal is to not have their own
> >> derivative release but to sell commercial support for the official Apache
> >> release.
> >> So those selling commercial support are:
> >> *Cloudera
> >> *HortonWorks
> >> *MapRTech
> >> *EMC (reselling MapRTech, but had announced their own)
> >> *IBM (not sure what they are selling exactly... still seems like smoke and
> >> mirrors...)
> >> *DataStax
> >>
> >
> > + Amazon, indirectly, that do their own derivative work of some release of
> > Hadoop (which version is it based on?)
> >
> > I've used 0.21, which was the first with the new APIs and, with MRUnit, has
> > the best test framework. For my small-cluster uses, it worked well. (oh, and
> > I didn't care about security)
> >
-
Re: Which release to use? Tom Deutsch 2011-07-17, 20:07
There are two release levels - one is free but most of our customers want our additional engineering so they use Enterprise Edition (which is not free).
Happy to answer questions off list.

---------------------------------------
Sent from my Blackberry so please excuse typing and spelling errors.

----- Original Message -----
From: Steve Loughran [[EMAIL PROTECTED]]
Sent: 07/17/2011 08:34 PM CET
To: [EMAIL PROTECTED]
Subject: Re: Which release to use?

On 16/07/2011 16:53, Rita wrote:
> I am curious about the IBM product BigInsights. Where can we download it? It
> seems we have to register to download it?
>

I think you have to pay to use it
-
RE: Which release to use? Michael Segel 2011-07-18, 01:52
Well, I'm sort of curious as to what is in the 'free' version that differentiates it from the Apache release?

Earlier you wrote that IBM was faithful to the Apache release, plus a few 'extras'. (I think I can find your exact quote and I'm sorry I'm paraphrasing your statements.)

This begs two questions...

1) What is IBM providing to your 'customers' to justify the uplift or premium for IBM's brand name?
2) If your release includes components which are not part of the Apache release, is it Apache's Hadoop, or is it considered a derivative?

The interesting thing about #2 is that I don't know if or what represents Hadoop. I mean, if you take an earlier release of Hadoop like 20.2 where current is 20.203 and apply a subset of patches that are Apache committed, is this not Apache Hadoop, or is it a derivative work since you are not 100% at the latest release?

Note: This is a broader question than just what IBM is releasing; it is about what is meant by saying Hadoop or derived from Hadoop. Clearly DataStax and MapR are derivatives. Cloudera?

This goes back to the OP's question 'Which release to use?'...

And I have to apologize if I seem a bit suspect on what IBM has to say. When IBM first entered with an announced Hadoop release it was only for 32bit JVM and only on IBM's JVM. The last I heard, IBM's upsell was a configuration tool, which, if anyone has built more than one Cloud/Cluster, is pretty much worthless. So it would be interesting to see what IBM is really offering in this space.

HTH

-Mike

> Subject: Re: Which release to use?
> From: [EMAIL PROTECTED]
> Date: Sun, 17 Jul 2011 14:07:20 -0600
> To: [EMAIL PROTECTED]
>
> There are two release levels - one is free but most of our customers want our additional engineering so they use Enterprise Edition (which is not free).
>
> Happy to answer questions off list.
>
> ---------------------------------------
> Sent from my Blackberry so please excuse typing and spelling errors.
>
>
> ----- Original Message -----
> From: Steve Loughran [[EMAIL PROTECTED]]
> Sent: 07/17/2011 08:34 PM CET
> To: [EMAIL PROTECTED]
> Subject: Re: Which release to use?
>
>
>
> On 16/07/2011 16:53, Rita wrote:
> > I am curious about the IBM product BigInsights. Where can we download it? It
> > seems we have to register to download it?
> >
>
> I think you have to pay to use it
|