MapReduce >> mail # user >> DFS respond very slow


Alexey 2012-10-09, 10:12
Alexey 2012-10-10, 05:23
Harsh J 2012-10-10, 06:50
Alexey 2012-10-10, 06:54
Harsh J 2012-10-10, 06:56
Alexey 2012-10-10, 07:20
Vinod Kumar Vavilapalli 2012-10-16, 00:22
Vinod Kumar Vavilapalli 2012-10-16, 02:36
Andy Isaacson 2012-10-16, 01:56

Re: DFS respond very slow
Uhhh... Alexey, did you really mean that you are running 100 megabit per
second network links?

That is going to make Hadoop run *really* slowly.

Also, putting RAID under any DFS, be it Hadoop or MapR, is not a good recipe
for performance.  Not that it matters if you only have about 10 megabytes per
second available from the network.
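
For a rough sense of scale, here is a back-of-the-envelope sketch in Python. The function names and the `efficiency` factor (the fraction of raw link speed left after protocol overhead) are illustrative assumptions, not anything measured on this cluster.

```python
def link_throughput_mb_per_s(link_mbit_per_s, efficiency=1.0):
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte).

    `efficiency` is an assumed fraction of the raw link speed that is
    actually usable for payload; 1.0 means the theoretical maximum.
    """
    return link_mbit_per_s * efficiency / 8.0


def transfer_seconds(file_mb, link_mbit_per_s, efficiency=1.0):
    """Best-case time to move `file_mb` megabytes over one link."""
    return file_mb / link_throughput_mb_per_s(link_mbit_per_s, efficiency)


# Theoretical ceiling of a 100 Mbit/s link, in MB/s.
print(link_throughput_mb_per_s(100))   # 12.5

# Best-case seconds to move a 100 MB file over 100 Mbit/s vs. GigE.
print(transfer_seconds(100, 100))      # 8.0
print(transfer_seconds(100, 1000))     # 0.8
```

With an assumed efficiency of 0.8, the 100 Mbit/s link comes out at roughly 10 MB/s, which is where the figure above comes from. Every replica HDFS writes to another node has to fit through that same pipe.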

On Mon, Oct 15, 2012 at 6:56 PM, Andy Isaacson <[EMAIL PROTECTED]> wrote:

> Also, note that JVM startup overhead, etc, means your -ls time is not
> completely unreasonable. Using OpenJDK on a cluster of VMs, my "hdfs
> dfs -ls" takes 1.88 seconds according to time (and 1.59 seconds of
> user CPU time).
>
> I'd be much more concerned about your slow transfer times.  On the
> same cluster, I can easily push 4 MB/sec even with only a 100MB file
> using "hdfs dfs -put - foo.txt". And of course using distcp or
> multiple -put workloads HDFS can saturate multiple GigE links.
>
> -andy
>
> On Mon, Oct 15, 2012 at 5:22 PM, Vinod Kumar Vavilapalli
> <[EMAIL PROTECTED]> wrote:
> > Try picking a single operation, say "hadoop dfs -ls", and start
> > profiling.
> >  - Time the client JVM is taking to start. Enable debug logging on the
> > client side by exporting HADOOP_ROOT_LOGGER=DEBUG,CONSOLE
> >  - Time between the client starting and the namenode audit logs showing
> > the read request. Also enable debug logging on the daemons.
> >  - Also, you can wget the namenode web pages and see how fast they
> > return.
> >
> > To repeat what is already obvious, it is most likely related to your
> > network setup and/or configuration.
> >
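
The first profiling step quoted above (timing a single client operation) could be sketched like this in Python. This is only a minimal illustration: `time_command` is a made-up helper name, and the demo times the Python interpreter itself as a stand-in so that it runs on a machine without a Hadoop client.

```python
import subprocess
import sys
import time


def time_command(argv, env=None):
    """Run `argv` once and return (elapsed_seconds, exit_code).

    Output is discarded so that terminal rendering does not skew the timing.
    `env` is passed through to the child process unchanged (None inherits
    the current environment).
    """
    start = time.perf_counter()
    result = subprocess.run(
        argv,
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode


# Stand-in command: start an interpreter that does nothing, to show the
# kind of fixed startup cost that any short-lived client process pays.
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"exit code {code}, wall time {elapsed:.3f} s")
```

On a node with the Hadoop client installed, the same helper could be pointed at `["hadoop", "dfs", "-ls", "/"]` (the `/` path is an assumption), optionally with `HADOOP_ROOT_LOGGER=DEBUG,CONSOLE` added to `env`. Comparing that against a trivial JVM start would show how much of the wall time is startup overhead rather than HDFS.
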
> > Thanks,
> > +Vinod
> >
> > On Oct 10, 2012, at 12:20 AM, Alexey wrote:
> >
> > ok, here you go:
> > I have 3 servers:
> > datanode on server 1, 2, 3
> > namenode on server 1
> > secondarynamenode on server 2
> >
> > all servers are at the Hetzner datacenter and connected through a 100 Mbit
> > link; pings between them are about 0.1 ms
> >
> > each server has 24 GB RAM and an Intel Core i7 3 GHz CPU
> > disk is 700 GB RAID
> >
> > the bindings related configuration is the following:
> > server 1:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>5.6.7.11:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>5.6.7.11:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > server 2:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>5.6.7.11:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>5.6.7.11:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > server 3:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>