Re: Reduce copy speed too slow
Hi, Gayatri
On 03/20/2012 11:59 AM, Gayatri Rao wrote:
> Hi all,
>
> I am running a map reduce job in EC2 instances and it seems to be very
> slow. It takes hours together for simple projection and aggregation of
> data.
Which filesystem are you using for data storage: HDFS on EC2 or Amazon S3?
How much data are you analyzing?

> Upon observation, I gathered that the reduce copy speed is 0.01 MB/sec. I
> am new to Hadoop. Could anyone please share insights about what reduce
> copy speeds are good to work with? If anyone has experience with this, any
> tips on improving it would be welcome.
Hadoop Map/Reduce jobs shuffle lots of data, so the recommended
configuration is to use 10 Gbps networks for
the underlying connection (and dedicated switches on dual-gigabit networks).
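
To put the reported 0.01 MB/sec in perspective, here is a rough
back-of-the-envelope sketch in Python. The 1 GB shuffle size is a made-up
figure (the data size has not been stated in this thread), the function names
are only illustrative, and the link rates are theoretical peaks that ignore
protocol overhead and sharing between tasks.

```python
def copy_hours(data_mb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move data_mb megabytes at rate_mb_per_s MB/sec."""
    return data_mb / rate_mb_per_s / 3600.0


def link_rate_mb_per_s(gbps: float) -> float:
    """Theoretical peak of a link in MB/sec (decimal units, no overhead)."""
    return gbps * 1000.0 / 8.0


if __name__ == "__main__":
    # Assumption: 1 GB (1024 MB) of map output to shuffle; not from the thread.
    shuffle_mb = 1024.0
    rates = [
        ("reported copy speed", 0.01),
        ("1 Gbps link, peak", link_rate_mb_per_s(1)),
        ("10 Gbps link, peak", link_rate_mb_per_s(10)),
    ]
    for label, rate in rates:
        print(f"{label}: {copy_hours(shuffle_mb, rate):.4f} hours")
```

Under these assumptions the reported rate would need more than a day for a
single gigabyte, while even a 1 Gbps link at its peak would need seconds. A gap
that large hints that raw link capacity may not be the only thing holding the
copy phase back, so the storage and data-size questions above still matter.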

Remember too that Hadoop is not a real-time system. If you need
real-time random access to your data, use HBase:
http://hbase.apache.org

Regards
>
> Thanks
> Gayatri
>
>
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci

--
Marcos Luis Ortíz Valmaseda (@marcosluis2186)
  Data Engineer at UCI
  http://marcosluis2186.posterous.com