Gayatri Rao 2012-03-20, 15:59
Re: Reduce copy speed too slow
Marcos Ortiz 2012-03-20, 16:12
Hi, Gayatri
On 03/20/2012 11:59 AM, Gayatri Rao wrote:
> Hi all,
>
> I am running a map reduce job in EC2 instances and it seems to be very
> slow. It takes hours together for simple projection and aggregation of
> data.

What filesystem are you using for data storage: HDFS in EC2 or Amazon S3?
What is the size of the data that you are analyzing?

> Upon observation, I gathered that the reduce copy speed is 0.01 MB/sec. I
> am new to hadoop. Could any one please share insights about the reduce
> copy speeds are good to work with. If any one has an experience any tips
> in improving it.

Hadoop Map/Reduce jobs shuffle lots of data, so the recommended
configuration is to use 10Gbps networks for the underlying connection
(and dedicated switches on dual-gigabit networks).

Remember too that Hadoop is not a real-time system; if you need real-time
random access to your data, use HBase:
http://hbase.apache.org

Regards

> Thanks
> Gayatri

--
Marcos Luis Ortíz Valmaseda (@marcosluis2186)
Data Engineer at UCI
http://marcosluis2186.posterous.com

10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci
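
To put the reported 0.01 MB/sec copy rate next to the link speeds mentioned in the reply, here is a rough back-of-the-envelope sketch in Python. The 1024 MB shuffle size is purely hypothetical (the thread never states the data size), and the link figures are theoretical line rates in decimal units, not measured Hadoop throughput.

```python
def gbps_to_mb_per_s(gbps):
    """Convert a link speed in gigabits/s to megabytes/s (decimal units)."""
    return gbps * 1000 / 8


def shuffle_hours(data_mb, rate_mb_per_s):
    """Hours needed to copy data_mb megabytes at a sustained rate."""
    return data_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    data_mb = 1024  # hypothetical shuffle size, not taken from the thread
    rates = [
        ("reported reduce copy rate", 0.01),
        ("dual-gigabit line rate", gbps_to_mb_per_s(2)),
        ("10Gbps line rate", gbps_to_mb_per_s(10)),
    ]
    for label, rate in rates:
        hours = shuffle_hours(data_mb, rate)
        print(f"{label}: {rate} MB/s -> {hours:.4f} hours")
```

Under these assumptions, even a modest shuffle would take more than a day at 0.01 MB/sec, which suggests a rate that low points to a configuration or connectivity problem rather than to ordinary network limits.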