Hadoop, mail # general - [VOTE] Direction for Hadoop development


Re: [VOTE] Direction for Hadoop development
Jeff Hammerbacher 2010-12-07, 10:23
>
> A critical part of Hadoop's usability comes from its framework combined
> with library code that allows users to get the desired functionality without
> writing it themselves.
>
> The goal is to make Hadoop useful out of the box.
>

To the best of my knowledge, Owen, your organization requires users to
petition a committee before writing MapReduce jobs. At Facebook, the vast
majority of jobs are submitted via Hive. Our customers at Cloudera primarily
consume MapReduce through Pig, Hive, and other high-level tools.

Users of Hadoop have moved beyond MapReduce. The community would be far
better served by a compact, reliable, and efficient kernel. That's the
project direction Doug has suggested for MapReduce, and it's one that Eric
and Tom have supported. I also support this direction for the project.

We're clearly having a hard time, as a community, agreeing on standards for
library code. We've also shipped updates to the framework without updating
the library code, seriously damaging the usability of the project. In this
discussion, we're prioritizing the rapidly shrinking proportion of users of
MapReduce library code over the far larger community of consumers of
the framework.

Arun recently asked on Quora about issues that users face with Hadoop
MapReduce: http://qr.ae/pPNK. There are currently five issues brought up
there, with 19 votes for those issues; none of them are addressed directly
by this extended debate.

I'd be ecstatic to see this discussion result in moving the file formats,
input and output formats, and other library code out to a separate Apache
project or GitHub where they can evolve rapidly based on user needs, so that
the MapReduce project can begin to address some of the outstanding issues
with the framework itself.

HDFS, HBase, Hive, Pig, Oozie, and other Hadoop-related projects continue to
make forward progress at a remarkable rate; I'd like to see MapReduce return
to health as well. Clearing away these major sources of conflict seems like
one promising path forward.

So, I'm not on the PMC, but I'm -1 on the proposed vote.