Create Index Map/Reduce failure
Peter Marron 2012-10-27, 19:21
Hi,
I have a fairly low-end machine running Ubuntu 12.04. I'm running Hadoop in pseudo-distributed mode and storing the data in HDFS. I have a file which is 137 GB, with 36.6 million rows and 466 columns. I am trying to create an index on this table in Hive with these commands. (I seem to have to build the index in two separate commands.)

    LOAD DATA INPATH 'E3/score.csv' OVERWRITE INTO TABLE score;
    CREATE INDEX bigIndex ON TABLE score(Ath_Seq_Num) AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD;
    ALTER INDEX bigIndex ON score REBUILD;

The resulting Map/Reduce job is failing with an OutOfMemoryError. I attach the end of the only log which seems to contain any useful information about the error.

When I googled a bit I found a suggestion that the cause could be mapred.child.java.opts, so I added this to my mapred-site.xml (which increased the maximum from 200 MB to 1000 MB):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1000m</value>
    </property>

But this didn't seem to help. I also saw some mention that I should decrease io.sort.mb, so I reduced it to 1 MB. However, this didn't seem to help either.

Maybe this is the wrong list for this question and I should post to [EMAIL PROTECTED]?

Any help appreciated.

Peter Marron

2012-10-25 15:55:27,429 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 511 files left.
2012-10-25 15:55:27,432 WARN org.apache.hadoop.fs.FileSystem: "localhost" is a deprecated filesystem name. Use "hdfs://localhost/" instead.
2012-10-25 15:55:27,449 INFO org.apache.hadoop.mapred.Merger: Merging 511 sorted segments
2012-10-25 15:55:27,455 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 511 segments left of total size: 173620406 bytes
2012-10-25 15:55:27,885 INFO org.apache.hadoop.mapred.ReduceTask: Merged 511 segments, 173620406 bytes to disk to satisfy reduce memory limit
2012-10-25 15:55:27,885 INFO org.apache.hadoop.mapred.ReduceTask: Merging 1 files, 173619390 bytes from disk
2012-10-25 15:55:27,886 INFO org.apache.hadoop.mapred.ReduceTask: Merging 0 segments, 0 bytes from memory into reduce
2012-10-25 15:55:27,886 INFO org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
2012-10-25 15:55:27,888 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 173619386 bytes
2012-10-25 15:55:27,895 INFO ExecReducer: maximum memory = 932118528
2012-10-25 15:55:27,895 INFO ExecReducer: conf classpath = [file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/jars/classes, file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/jars/, file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/attempt_201210251304_0001_r_000093_3/]
2012-10-25 15:55:27,896 INFO ExecReducer: thread classpath = [file:/data/hadoop-1.0.3/conf/, file:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/tools.jar, file:/data/hadoop-1.0.3/, file:/data/hadoop-1.0.3/hadoop-core-1.0.3.jar, file:/data/hadoop-1.0.3/lib/asm-3.2.jar, file:/data/hadoop-1.0.3/lib/aspectjrt-1.6.5.jar, file:/data/hadoop-1.0.3/lib/aspectjtools-1.6.5.jar, file:/data/hadoop-1.0.3/lib/commons-beanutils-1.7.0.jar, file:/data/hadoop-1.0.3/lib/commons-beanutils-core-1.8.0.jar, file:/data/hadoop-1.0.3/lib/commons-cli-1.2.jar, file:/data/hadoop-1.0.3/lib/commons-codec-1.4.jar, file:/data/hadoop-1.0.3/lib/commons-collections-3.2.1.jar, file:/data/hadoop-1.0.3/lib/commons-configuration-1.6.jar,
file:/data/hadoop-1.0.3/lib/commons-daemon-1.0.1.jar, file:/data/hadoop-1.0.3/lib/commons-digester-1.8.jar, file:/data/hadoop-1.0.3/lib/commons-el-1.0.jar, file:/data/hadoop-1.0.3/lib/commons-httpclient-3.0.1.jar, file:/data/hadoop-1.0.3/lib/commons-io-2.1.jar, file:/data/hadoop-1.0.3/lib/commons-lang-2.4.jar, file:/data/hadoop-1.0.3/lib/commons-logging-1.1.1.jar, file:/data/hadoop-1.0.3/lib/commons-logging-api-1.0.4.jar, file:/data/hadoop-1.0.3/lib/commons-math-2.1.jar, file:/data/hadoop-1.0.3/lib/commons-net-1.4.1.jar, file:/data/hadoop-1.0.3/lib/core-3.1.1.jar, file:/data/hadoop-1.0.3/lib/hadoop-capacity-scheduler-1.0.3.jar, file:/data/hadoop-1.0.3/lib/hadoop-fairscheduler-1.0.3.jar, file:/data/hadoop-1.0.3/lib/hadoop-thriftfs-1.0.3.jar, file:/data/hadoop-1.0.3/lib/hsqldb-1.8.0.10.jar, file:/data/hadoop-1.0.3/lib/jackson-core-asl-1.8.8.jar, file:/data/hadoop-1.0.3/lib/jackson-mapper-asl-1.8.8.jar, file:/data/hadoop-1.0.3/lib/jasper-compiler-5.5.12.jar, file:/data/hadoop-1.0.3/lib/jasper-runtime-5.5.12.jar, file:/data/hadoop-1.0.3/lib/jdeb-0.8.jar, file:/data/hadoop-1.0.3/lib/jersey-core-1.8.jar, file:/data/hadoop-1.0.3/lib/jersey-json-1.8.jar, file:/data/hadoop-1.0.3/lib/jersey-server-1.8.jar, file:/data/hadoop-1.0.3/lib/jets3t-0.6.1.jar, file:/data/hadoop-1.0.3/lib/jetty-6.1.26.jar, file:/data/hadoop-1.0.3/lib/jetty-util-6.1.26.jar, file:/data/hadoop-1.0.3/lib/jsch-0.1.42.jar, file:/data/hadoop-1.0.3/lib/junit-4.5.jar, file:/data/hadoop-1.0.3/lib/kfs-0.2.2.jar, file:/data/hadoop-1.0.3/lib/log4j-1.2.15.jar, file:/data/hadoop-1.0.3/lib/mockito-all-1.8.5.jar, file:/data/hadoop-1.0.3/lib/oro-2.0.8.jar, file:/data/hadoop-1.0.3/lib/servlet-api-2.5-20081211.jar, file:/data/hadoop-1.0.3/lib/slf4j-api-1.4.3.jar, file:/data/hadoop-1.0.3/lib/slf4j-log4j12-1.4.3.jar, file:/data/hadoop-1.0.3/lib/xmlenc-0.52.jar, file:/data/hadoop-1.0.3/lib/jsp-2.1/jsp-2.1.jar, file:/data/hadoop-1.0.3/lib/jsp-2.1/jsp-api-2.1.jar, 
file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/jars/classes, file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/jars/, file:/data/tmp/mapred/local/taskTracker/pmarron/distcache/3928617505704526765_23348021_405451127/localhost/data/tmp/mapred/staging/pmarron/.staging/job_201210251304_0001/libjars/hive-builtins-0.8.1.jar/, file:/data/tmp/mapred/local/taskTracker/pmarr
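
A quick way to read the byte counts in the log excerpt is to convert them to MiB. The sketch below is in Python and is not part of the original post. It pulls the "maximum memory" and "Merged ... bytes to disk" figures out of two of the quoted log lines. The helper names (`to_mib`, `max_memory_bytes`, `merged_bytes`) are made up for illustration. One interpretation, offered as an assumption rather than a diagnosis: a reported maximum well above the old 200 MB default and somewhat below 1000 MiB would be consistent with the -Xmx1000m setting having reached the reduce task, since JVMs commonly report a maximum slightly under the -Xmx value. The log alone does not prove this, and it does not say where the OutOfMemoryError was raised.

```python
import re

# Two lines copied verbatim from the reducer log quoted in the post.
LOG_LINES = [
    "2012-10-25 15:55:27,885 INFO org.apache.hadoop.mapred.ReduceTask: "
    "Merged 511 segments, 173620406 bytes to disk to satisfy reduce memory limit",
    "2012-10-25 15:55:27,895 INFO ExecReducer: maximum memory = 932118528",
]

MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(n_bytes):
    """Convert a byte count to MiB as a float."""
    return n_bytes / MIB


def max_memory_bytes(lines):
    """Return the first 'maximum memory' value found (hypothetical helper)."""
    for line in lines:
        match = re.search(r"ExecReducer: maximum memory = (\d+)", line)
        if match:
            return int(match.group(1))
    return None


def merged_bytes(lines):
    """Return the first 'Merged N segments, B bytes to disk' size (hypothetical helper)."""
    for line in lines:
        match = re.search(r"Merged \d+ segments, (\d+) bytes to disk", line)
        if match:
            return int(match.group(1))
    return None


heap = max_memory_bytes(LOG_LINES)
merged = merged_bytes(LOG_LINES)

print(f"reducer max heap: {to_mib(heap):.1f} MiB")
print(f"merged to disk:   {to_mib(merged):.1f} MiB")
```

On these two lines the reported heap maximum comes to about 889 MiB and the merged shuffle data to about 166 MiB, so the merged input by itself is much smaller than the heap the reducer reports.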