MapReduce, mail # user - DFS respond very slow


Alexey 2012-10-09, 10:12
Alexey 2012-10-10, 05:23
Harsh J 2012-10-10, 06:50
Alexey 2012-10-10, 06:54
Harsh J 2012-10-10, 06:56
Alexey 2012-10-10, 07:20
Vinod Kumar Vavilapalli 2012-10-16, 00:22
Vinod Kumar Vavilapalli 2012-10-16, 02:36
Andy Isaacson 2012-10-16, 01:56
Re: DFS respond very slow
Ted Dunning 2012-10-16, 02:23
Uhhh... Alexey, did you really mean that you are running 100 megabit per
second network links?

That is going to make Hadoop run *really* slowly.

Also, putting RAID under any DFS, be it Hadoop or MapR, is not a good recipe
for performance.  Not that it matters if you only have 10 megabytes per
second available from the network.
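
As a rough back-of-the-envelope check of those numbers, here is a minimal
sketch. It assumes a plain 8-bits-per-byte conversion and ignores protocol
overhead, so it gives a theoretical ceiling rather than a measurement. The
function names are made up for illustration, and the 100 MB file size is
borrowed from Andy's example quoted below.

```python
def link_mbit_to_mbyte_per_sec(mbit_per_sec: float) -> float:
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_per_sec / 8.0


def best_case_transfer_seconds(file_mbytes: float, mbit_per_sec: float) -> float:
    """Lower bound on the time to move a file over one link, ignoring overhead."""
    return file_mbytes / link_mbit_to_mbyte_per_sec(mbit_per_sec)


if __name__ == "__main__":
    file_mbytes = 100.0  # assumed file size, taken from the quoted example
    for link_mbit in (100.0, 1000.0):  # 100 Mbit link vs. GigE
        ceiling = link_mbit_to_mbyte_per_sec(link_mbit)
        seconds = best_case_transfer_seconds(file_mbytes, link_mbit)
        print(f"{link_mbit:.0f} Mbit/s: at most {ceiling:.1f} MB/s, "
              f"{file_mbytes:.0f} MB file needs at least {seconds:.1f} s")
```

The ceiling for a 100 Mbit link comes out at 12.5 MB/s, which is in line with
the 10 or so megabytes per second mentioned above once real-world overhead is
taken off.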

On Mon, Oct 15, 2012 at 6:56 PM, Andy Isaacson <[EMAIL PROTECTED]> wrote:

> Also, note that JVM startup overhead, etc, means your -ls time is not
> completely unreasonable. Using OpenJDK on a cluster of VMs, my "hdfs
> dfs -ls" takes 1.88 seconds according to time (and 1.59 seconds of
> user CPU time).
>
> I'd be much more concerned about your slow transfer times.  On the
> same cluster, I can easily push 4 MB/sec even with only a 100MB file
> using "hdfs dfs -put - foo.txt". And of course, using distcp or
> multiple -put workloads, HDFS can saturate multiple GigE links.
>
> -andy
>
> On Mon, Oct 15, 2012 at 5:22 PM, Vinod Kumar Vavilapalli
> <[EMAIL PROTECTED]> wrote:
> > Try picking a single operation, say "hadoop dfs -ls", and start
> > profiling.
> >  - Time how long the client JVM takes to start. Enable debug logging on
> >    the client side by exporting HADOOP_ROOT_LOGGER=DEBUG,CONSOLE
> >  - Time the gap between the client starting and the namenode audit logs
> >    showing the read request. Enable debug logging on the daemons too.
> >  - Also, you can wget the namenode web pages and see how fast they
> >    return.
> >
> > To repeat what is already obvious, it is most likely related to your
> > network setup and/or configuration.
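
The first profiling step above (timing a single client operation end to end)
could be sketched roughly as follows. This is only an illustration: the helper
name time_command is hypothetical, the default command is a harmless stand-in,
and the real command (for example "hadoop dfs -ls /") has to be passed on the
command line and must be on the PATH. It measures wall-clock time only, so it
does not separate JVM startup from the namenode round trip.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # With no arguments, time a trivial Python process as a stand-in.
    # Example with a real command:  python time_cmd.py hadoop dfs -ls /
    argv = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for run, seconds in enumerate(time_command(argv), start=1):
        print(f"run {run}: {seconds:.2f} s")
```

Repeating the run a few times helps tell a consistently slow operation apart
from a one-off stall.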
> >
> > Thanks,
> > +Vinod
> >
> > On Oct 10, 2012, at 12:20 AM, Alexey wrote:
> >
> > ok, here you go:
> > I have 3 servers:
> > datanode on server 1, 2, 3
> > namenode on server 1
> > secondarynamenode on server 2
> >
> > all servers are at the Hetzner datacenter and connected through a 100 Mbit
> > link; pings between them are about 0.1 ms
> >
> > each server has 24 GB RAM and an Intel Core i7 3 GHz CPU
> > disk is 700 GB RAID
> >
> > the bindings related configuration is the following:
> > server 1:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>5.6.7.11:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>5.6.7.11:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > server 2:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>5.6.7.11:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>5.6.7.11:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > server 3:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>