HDFS >> mail # dev >> Re: [VOTE] Merge HDFS-3077 (QuorumJournalManager) branch to trunk

sanjay Radia 2012-09-28, 19:06
Todd Lipcon 2012-09-28, 22:02
sanjay Radia 2012-10-01, 17:55
Todd Lipcon 2012-10-09, 00:46
Suresh Srinivas 2012-10-09, 01:01
Todd Lipcon 2012-10-09, 01:20
Suresh Srinivas 2012-10-09, 02:21
Eli Collins 2012-10-09, 04:11
sanjay Radia 2012-10-09, 20:44
sanjay Radia 2012-10-11, 02:17
Suresh Srinivas 2012-10-11, 04:10
Todd Lipcon 2012-10-11, 08:12
Andrew Purtell 2012-10-09, 03:03
Suresh Srinivas 2012-10-09, 03:32
Andrew Purtell 2012-10-09, 03:09
Eli Collins 2012-10-09, 01:48
sanjay Radia 2012-09-28, 19:03
Stack 2012-09-27, 16:59
Todd Lipcon 2012-09-27, 20:29
Re: [VOTE] Merge HDFS-3077 (QuorumJournalManager) branch to trunk
I am in favor of keeping QJM in HDFS.

QJM is very specific to HDFS and is tightly coupled with HDFS code,
essentially extending the current editlog functionality that writes to
local disk to writing to a separate set of daemons. Clearly there is a need
for this in HDFS. Konstantin, I see your point that it brings in
complexity. To a certain extent this complexity cannot be avoided given the
goals and the feature set. Additionally, people can choose not to use this
optional functionality and avoid the complexity it brings, both in terms of
a more involved journaling mechanism and the management of the additional
set of daemons it introduces.

In the future, we could choose to make this generic enough or standardise the
HDFS interfaces/code that it depends on and perhaps spin it off as another
project. Also, if this is complex and error-prone, perhaps we could simplify
the functionality or even replace it. But for now I feel it belongs in HDFS.

On Wed, Sep 26, 2012 at 10:50 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> On Tue, Sep 25, 2012 at 11:21 PM, Konstantin Shvachko
> <[EMAIL PROTECTED]> wrote:
> > I think this is great work, Todd.
> > And I think we should not merge it into trunk or other branches.
> > As I suggested earlier on this list, I think this should be spun off
> > as a separate project or a subproject.
> >
> > - The code is well detached as a self-contained package.
> The addition is mostly self-contained, but it makes use of a bunch of
> "private" parts of HDFS and Common:
> - Reuses all of the Hadoop security infrastructure, IPC, metrics, etc
> - Coupled to the JournalManager interface which is still evolving. In
> fact there were several patches in trunk which were done during the
> development of this project, specifically to make this API more
> general. There's still some further work to be done in this area on
> the generic interface -- e.g., support for upgrade/rollback.
> - The functional tests make use of a bunch of "private" HDFS APIs as well.
> > - It is a logically stand-alone project that can be replaced by other
> > technologies.
> > - If it is a separate project then there is no need to port it to
> > other versions. You can package it as a dependent jar.
> Per above, it's not that separate, because in order to build it, we
> had to make a number of changes to core HDFS internal interfaces. It
> currently couldn't be used to store anything except for NN logs. It
> would be a nice extension to truly separate it out into a
> content-agnostic quorum-based edit log, but today it actually uses the
> existing edit log validation code to determine valid lengths, etc.
> > - Finally, it will be a good precedent of spinning new projects out of
> > HDFS rather than bringing everything under HDFS umbrella.
> >
> > Todd, I had a feeling you were in favor of this direction?
> I'm not in favor of it - I had said previously that it's worth
> discussing if several other people believe the same.
> I know that we plan to ship it as part of CDH and that it will be our
> recommended way of running HA HDFS. If the community doesn't accept
> the contribution, and prefers that we maintain it in a fork on GitHub,
> then it's worth hearing. But I imagine that many other community
> members will want to either use it or ship it as part of their
> distros. Moving it to an entirely separate standalone project will
> just add extra work for these folks who, like us, think it's currently
> the best option for HA log storage.
> If at some point in the future, the internal APIs have fully
> stabilized (security, IPC, edit log streams, JournalManager, metrics,
> etc) then we can pull it out at that time.
> -Todd
> > On Tue, Sep 25, 2012 at 4:58 PM, Eli Collins <[EMAIL PROTECTED]> wrote:
> >> +1   Awesome work Todd.
> >>
> >> On Tue, Sep 25, 2012 at 4:02 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> >>> Dear fellow HDFS developers,
> >>>
> >>> Per my email thread last week ("Heads up: merge for QJM branch soon"
> >>> at http://markmail.org/message/vkyh5culdsuxdb6t) I would like to

Andrew Purtell 2012-09-27, 09:29