MapReduce, mail # user - error while running reduce


error while running reduce
Arindam Choudhury 2013-03-07, 12:31
Hi,

I am trying to do a performance analysis of Hadoop on a virtual machine. When
I run terasort on 2 GB of input data with 1 map and 1 reduce task, the
map finishes properly, but the reduce fails with an error. I cannot understand
why. Any help?

I have a single-node Hadoop deployment in a virtual machine. The F18
virtual machine has 1 core and 2 GB of memory.

My configuration:
core-site.xml
<configuration>
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoopa.arindam.com:54310</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/${user.name}</value>
</property>
<property>
  <name>fs.inmemory.size.mb</name>
  <value>20</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
</configuration>

hdfs-site.xml
<configuration>
<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/hadoop-dir/name-dir</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/hadoop-dir/data-dir</value>
</property>
<property>
  <name>dfs.block.size</name>
  <value>2048000000</value>
  <final>true</final>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
</configuration>

mapred-site.xml
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>hadoopa.arindam.com:54311</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/home/hadoop/hadoop-dir/system-dir</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/home/hadoop/hadoop-dir/local-dir</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
</configuration>

I created 2 GB of data to run terasort.

hadoop dfsadmin -report
Configured Capacity: 21606146048 (20.12 GB)
Present Capacity: 14480427242 (13.49 GB)
DFS Remaining: 12416368640 (11.56 GB)
DFS Used: 2064058602 (1.92 GB)
DFS Used%: 14.25%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.122.32:50010
Decommission Status : Normal
Configured Capacity: 21606146048 (20.12 GB)
DFS Used: 2064058602 (1.92 GB)
Non DFS Used: 7125718806 (6.64 GB)
DFS Remaining: 12416368640(11.56 GB)
DFS Used%: 9.55%
DFS Remaining%: 57.47%
But when I run terasort, I get the following error:

13/03/04 17:56:16 INFO mapred.JobClient: Task Id : attempt_201303041741_0002_r_000000_0, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/output/_temporary/_attempt_201303041741_0002_r_000000_0/part-00000 could only be replicated to 0 nodes, instead of 1

hadoop dfsadmin -report
Configured Capacity: 21606146048 (20.12 GB)
Present Capacity: 10582014209 (9.86 GB)
DFS Remaining: 8517738496 (7.93 GB)
DFS Used: 2064275713 (1.92 GB)
DFS Used%: 19.51%
Under replicated blocks: 2
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.122.32:50010
Decommission Status : Normal
Configured Capacity: 21606146048 (20.12 GB)
DFS Used: 2064275713 (1.92 GB)
Non DFS Used: 11024131839 (10.27 GB)
DFS Remaining: 8517738496(7.93 GB)
DFS Used%: 9.55%
DFS Remaining%: 39.42%
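
To put the two reports side by side, here is a small Python sketch that works only from the numbers quoted above. It computes how much "Non DFS Used" grew between the reports and how many full blocks of the configured dfs.block.size still fit into "DFS Remaining". The factor `ASSUMED_MIN_BLOCKS_FOR_WRITE` is an assumption on my part (that the NameNode wants room for several full blocks before it accepts a DataNode as a write target); it is not confirmed anywhere in this post, so treat the last column as a guess rather than a diagnosis.

```python
# Figures copied from the two `hadoop dfsadmin -report` outputs above (bytes).
BLOCK_SIZE = 2_048_000_000  # dfs.block.size from hdfs-site.xml

before = {"remaining": 12_416_368_640, "non_dfs_used": 7_125_718_806}
after = {"remaining": 8_517_738_496, "non_dfs_used": 11_024_131_839}

# ASSUMPTION (not confirmed in this thread): a DataNode is only chosen as a
# write target if it has room for this many full blocks.
ASSUMED_MIN_BLOCKS_FOR_WRITE = 5


def blocks_that_fit(remaining_bytes, block_size=BLOCK_SIZE):
    """Return how many full blocks of block_size fit into remaining_bytes."""
    return remaining_bytes // block_size


def has_headroom(remaining_bytes, block_size=BLOCK_SIZE,
                 min_blocks=ASSUMED_MIN_BLOCKS_FOR_WRITE):
    """True if remaining_bytes covers min_blocks full blocks."""
    return remaining_bytes >= min_blocks * block_size


non_dfs_growth = after["non_dfs_used"] - before["non_dfs_used"]
print("Non DFS Used grew by", non_dfs_growth, "bytes")

for label, report in (("first report", before), ("second report", after)):
    print(label,
          "| full blocks that fit:", blocks_that_fit(report["remaining"]),
          "| headroom under the assumption:", has_headroom(report["remaining"]))
```

With the values above, six full blocks fit into the remaining space in the first report and only four in the second, so a block size close to 2 GB leaves very little slack on a 20 GB disk. Whether that is what actually triggers the "replicated to 0 nodes" error depends on the assumption stated in the code.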
Thanks,