Re: Unable to perform terasort for 50GB of data
Hi,

Check the individual data nodes' usage:
hadoop dfsadmin -report
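Note that dfsadmin -report only covers the HDFS data directories; if the tasks
die with "No space left on device", it is also worth checking the partition
that actually holds the intermediate map output, for example with df -h /tmp
on a tasktracker node, since that partition can fill up even when HDFS itself
is mostly empty.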
Also override the config parameter mapred.local.dir to store the
intermediate data in some path other than the /tmp directory, and don't use a
single reducer: increase the number of reducers and use a TotalOrderPartitioner.
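
For example (just a sketch, assuming Hadoop 1.x; the mount points /data/1 and
/data/2, the reducer count of 32 and the HDFS paths are made-up values to
adjust for your cluster), in mapred-site.xml on each node that runs tasks:

    <property>
      <name>mapred.local.dir</name>
      <value>/data/1/mapred/local,/data/2/mapred/local</value>
    </property>

and then run the sort with more than one reduce task, e.g.:

    hadoop jar hadoop-examples-*.jar terasort \
        -D mapred.reduce.tasks=32 /terasort/input /terasort/output

If I remember correctly, TeraSort already uses a TotalOrderPartitioner
internally, so raising mapred.reduce.tasks should be enough to spread the sort
over all the nodes.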

Thanks
Nagamallikarjuna
On Nov 8, 2013 10:40 AM, "Khai Cher LIM (NYP)" <[EMAIL PROTECTED]>
wrote:

>  Dear all,
>
>
>
> I have just started learning how to set up Hadoop and I am having a problem
> running terasort on my Hadoop cluster. My input folder contains 50 GB of
> data, but when I run terasort, the tasks fail with the error message shown
> in the following screenshot.
>
> [screenshot of the failed tasks showing "No space left on device" not reproduced in the archive]
>
> I've set my dfs block size to 128 MB. With the default 64 MB, the tasks
> also failed for the same reason.
>
>
>
> Server information - HP ProLiant DL380p Gen8 (2U)
>
> • two Intel Xeon E5-2640 processors with 15 MB cache, 2.5 GHz, 7.2 GT/s
>
> • 48 GB RAM
>
> • 12 x 1 TB (or a raw capacity of 12 TB) 6G SAS 7.2K 3.5" HDD
>
> • RAID controller that supports RAID 5 with at least 512 MB Flash-Backed
> Write Cache (FBWC)
>
> • on-board adapter with 4 x 1 GbE Ethernet ports
>
> • 2 hot-pluggable power supply units
>
>
>
> I've configured two servers with virtual machines as described below:
>
> Server 1:
>
> 1 Name Node - 32 GB RAM, 300 GB HDD space
>
> 4 Data Nodes - 16 GB RAM, 300 GB HDD space
>
>
>
> Server 2:
>
> 1 Secondary Name Node - 32 GB RAM, 300 GB HDD space
>
> 4 Data Nodes - 16 GB RAM, 300 GB HDD space
>
>
>
> I've checked that the disk space used per data node is about 20% on
> average, so I can't understand the error message complaining about "no
> space left on device".
>
>
>
> Any help is much appreciated.
>
>
>
> Thank you.
>
>
>
> Regards,
>
> Khai Cher
>
>
>