RE: Can we replace namenode machine with some other machine ?

I agree with Steve except on one thing...

RAID 5 Bad. RAID 10 (1+0) good.

Sorry, this goes back to my RDBMS days, where RAID 5 would kill your performance, and worse...
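
To put rough numbers on that, here is a minimal sketch using the textbook write penalties (4 back-end I/Os per random write on RAID 5, 2 on RAID 10). The disk count, per-disk IOPS, and write mix are made-up illustration values, not measurements from any real namenode, and the function name is hypothetical.

```python
# Textbook write penalties: back-end I/Os needed per random front-end write.
#   RAID 5:  read data + read parity + write data + write parity = 4
#   RAID 10: one write to each side of the mirror                = 2
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def effective_iops(level, disks, iops_per_disk, write_fraction):
    """Estimate the front-end IOPS an array sustains for a read/write mix.

    Assumes reads cost one back-end I/O and ignores controller caches,
    full-stripe writes, and degraded-mode behaviour.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    penalty = WRITE_PENALTY[level]
    return raw / ((1.0 - write_fraction) + write_fraction * penalty)


if __name__ == "__main__":
    # Hypothetical array: 4 disks at 150 IOPS each, 80% writes.
    for level in ("raid5", "raid10"):
        iops = effective_iops(level, disks=4, iops_per_disk=150, write_fraction=0.8)
        print(f"{level}: ~{iops:.0f} IOPS")
```

On a write-heavy mix the RAID 10 figure comes out well ahead, which is the performance half of the argument; it says nothing about rebuild times or the "and worse" part.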

> Date: Thu, 22 Sep 2011 11:28:39 +0100
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: Can we replace namenode machine with some other machine ?
>
> On 22/09/11 05:42, praveenesh kumar wrote:
> > Hi all,
> >
> > Can we replace our namenode machine later with some other machine?
> > I got a new server machine in my cluster, and I want to make
> > this machine my new namenode and jobtracker node.
> > Also, does the namenode/jobtracker machine's configuration need to be
> > better than the datanodes'/tasktrackers'?
> >
>
> 1. I'd give it lots of RAM - holding data about many files, avoiding
> swapping, etc.
>
> 2. I'd make sure the disks are RAID5, with some NFS-mounted FS that the
> secondary namenode can talk to. This avoids the risk of losing the
> index, which would render your filesystem worthless. If I were really
> paranoid I'd have twin RAID controllers with separate connections to
> disk arrays in separate racks, as [Jiang2008] shows that interconnect
> problems on disk arrays can be more frequent than HDD failures.
>
> 3. If your central switches are at 10 GbE, consider getting a 10 GbE NIC
> and hooking it up directly. This stops the network being the bottleneck,
> though it does mean the server can have a lot more packets hitting it,
> which puts more load on it.
>
> 4. Leave space for a second CPU and time for GC tuning.
>
>
> JTs are less important; they need RAM but use HDFS for storage. If your
> cluster is small, the NN and JT can run on the same machine. If you do
> this, set up DNS so that two hostnames point to the same network address.
> Then if you ever split them off, everyone whose bookmark says
> http://jobtracker won't notice.
>
> Either way: the NN and the JT are the machines whose availability you
> care about. The rest is just a source of statistics you can look at later.
>
> -Steve
>
>
>
> [Jiang2008] "Are disks the dominant contributor for storage failures?: A
> comprehensive study of storage subsystem failure characteristics". ACM
> Transactions on Storage.
>
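
For the "lots of RAM" point in the quoted message, a back-of-the-envelope sketch can help with sizing. It assumes the commonly quoted rule of thumb of roughly 150 bytes of namenode heap per file, directory, or block; that figure, the function name, and the example counts are all assumptions for illustration, so check them against your own cluster before buying hardware.

```python
# Assumed rule of thumb: ~150 bytes of namenode heap per namespace object
# (file, directory, or block). This is an approximation, not a measurement.
BYTES_PER_OBJECT = 150


def estimate_namenode_heap_bytes(files, blocks_per_file=1.0, directories=0,
                                 bytes_per_object=BYTES_PER_OBJECT):
    """Rough namenode heap needed to hold the namespace in memory."""
    objects = files + directories + files * blocks_per_file
    return int(objects * bytes_per_object)


if __name__ == "__main__":
    # Hypothetical cluster: 10 million files averaging 1.5 blocks each,
    # spread over 500,000 directories.
    heap = estimate_namenode_heap_bytes(10_000_000, blocks_per_file=1.5,
                                        directories=500_000)
    print(f"~{heap / 2**30:.1f} GiB of heap for the namespace alone")
```

This only covers namespace metadata; leave headroom on top for the JVM, GC, and the jobtracker if it shares the box.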