Hadoop >> mail # user >> Multiple NFS/HttpFS gateways


Multiple NFS/HttpFS gateways
I need to copy log files from our web servers (only 14 servers) to HDFS.
Before deploying Kafka or Flume, I just want to provide a simple solution
to our TechOp guys using NFS or HTTP, with which they are familiar.

My questions are:

   1. Which is faster, NFS or HttpFS?
   2. If I start an NFS or HttpFS gateway on each DataNode, can I set up a
   load balancer for those gateways? Does NFS work with a load balancer?
   3. If I use NFS and a load balancer doesn't work with it, can "automounter
   + DNS round robin" help?
      1. Different servers will write different files into HDFS.
      2. A cron job will be invoked every hour to copy the archived log
      file of the previous hour to HDFS.
      3. If possible, I expect it to work like this:
         1. The cron job tries to copy a file into the mounted NFS
         directory.
         2. autofs gets the IP address of one of the NFS gateways and
         mounts it.
         3. After the log file is copied, the NFS directory becomes idle
         and autofs unmounts it.
         4. The next hour, another NFS gateway is mounted.
         5. Different servers mount different NFS gateways at the same
         time, so that throughput is better.
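
For comparison, here is a rough sketch of how the same hourly copy might look over HttpFS instead of an NFS mount. It is only an illustration under assumptions: the gateway host names, the HDFS path and the user name are made up, port 14000 is the usual HttpFS default, and the `data=true` parameter with the `application/octet-stream` content type is what HttpFS expects for a direct upload as far as I understand it. The gateway is picked from the server's host name plus the current hour, which gives a similar effect to DNS round robin: different servers hit different gateways, and each server moves to another gateway the next hour.

```python
import os
import urllib.parse
import urllib.request
import zlib

# Hypothetical HttpFS gateways; replace with the real hosts.
GATEWAYS = ["gw1.example.com:14000", "gw2.example.com:14000", "gw3.example.com:14000"]


def pick_gateway(hostname, epoch_seconds, gateways=GATEWAYS):
    """Pick a gateway from the host name and the hour, so servers spread
    out over the gateways and each server rotates to the next one hourly."""
    hour_index = int(epoch_seconds // 3600)
    offset = zlib.crc32(hostname.encode("utf-8"))
    return gateways[(offset + hour_index) % len(gateways)]


def create_url(gateway, hdfs_path, user):
    """Build the WebHDFS CREATE URL served by an HttpFS gateway."""
    query = urllib.parse.urlencode({
        "op": "CREATE",
        "data": "true",        # assumed: HttpFS wants this for direct uploads
        "overwrite": "false",
        "user.name": user,
    })
    return "http://%s/webhdfs/v1%s?%s" % (gateway, urllib.parse.quote(hdfs_path), query)


def upload(local_path, url):
    """PUT the local file to the gateway and return the HTTP status code."""
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": str(os.path.getsize(local_path)),
    }
    with open(local_path, "rb") as fh:
        req = urllib.request.Request(url, data=fh, method="PUT", headers=headers)
        with urllib.request.urlopen(req) as resp:
            return resp.status
```

The cron job would then call something like `upload(path, create_url(pick_gateway(socket.gethostname(), time.time()), "/logs/web01/access.log", "loguser"))`, where the target path and user are placeholders. No load balancer or DNS change is needed for this variant, but every web server has to know the gateway list.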

If you have set up a system like this, using either NFS or HttpFS, could you
share what needs to be done by the TechOp guys? I can set up those gateways
on the Hadoop cluster, but I am not familiar with load balancers and DNS.

Thanks a lot.

 