MRv1 JT Availability (was [DISCUSS] Spin out MR, HDFS and YARN ...)
Andrew Purtell 2012-09-09, 17:57
Hi Arun,

On Mon, Sep 3, 2012 at 4:02 AM Arun C Murthy wrote:
>
> On Sep 1, 2012, at 6:32 AM, Andrew Purtell wrote:
> > I'd imagine such a MR(v1) in Hadoop, if this happened, would concentrate on
> > performance improvements, maybe such things as alternate shuffle plugins.
> > Perhaps a HA JobTracker for parity with HDFS.
>
> Lots of this has already happened in branch-1, please look at:
> # JT Availability: MAPREDUCE-3837, MAPREDUCE-4328, MAPREDUCE-4603 (WIP)

Thanks for the pointers! I just want to be more clear in what I meant by "HA JobTracker for parity with HDFS". There should be no need to quiesce the JT with a highly available NameNode, and restarting jobs from the beginning if the JT crashes isn't good enough to meet the user expectations implied by "high availability", at least those who are our internal customers.

I meant hot JT failover, that there is a primary and backup JT, that they share state sufficient for the backup to take over immediately if the primary fails, and that the TTs and JobClients both will switch seamlessly to the backup should their communications with the primary fail.

I'd expect state sharing to limit scalability to the small- and medium-cluster range, and that's fine, YARN is the answer for scalability issues in the large and largest clusters already.

> # Performance - backports of PureJavaCrc32 in spills (MAPREDUCE-782), fadvise backports (MAPREDUCE-3289) and other several misc. fixes.

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)