MapReduce, mail # user - Re: libjar and Mahout


Re: libjar and Mahout
Chris Mawata 2013-12-21, 02:55
In your hadoop command I see a space just after .jar in this part:
...-core-0.9-SNAPSHOT.jar /:/apps/mahout/trunk

Should it not be the following?
...-core-0.9-SNAPSHOT.jar:/apps/mahout/trunk
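One way to avoid stray spaces like this is to build the -libjars value in code instead of typing it by hand. Below is a minimal sketch using only the standard library. The class name LibJarsArg and its join method are hypothetical and not part of Hadoop or Mahout. It also assumes a comma as the separator, since as far as I know the GenericOptionsParser usage text describes -libjars as a comma-separated list of jars, which differs from the colons used in the command above. The jar paths are the ones from your command.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Hadoop or Mahout): builds the value for -libjars.
final class LibJarsArg {

    // Joins jar paths with commas (assumed separator for -libjars).
    // Rejects empty entries and entries containing whitespace, because the
    // shell would split such a value into separate arguments.
    static String join(List<String> jars) {
        if (jars.isEmpty()) {
            throw new IllegalArgumentException("no jars given");
        }
        for (String jar : jars) {
            if (jar.isEmpty() || jar.chars().anyMatch(Character::isWhitespace)) {
                throw new IllegalArgumentException("bad jar path: '" + jar + "'");
            }
        }
        return String.join(",", jars);
    }

    public static void main(String[] args) {
        List<String> jars = Arrays.asList(
            "/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar",
            "/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar",
            "/apps/mahout/trunk/math/target/mahout-math-0.9-SNAPSHOT.jar");
        // Prints the option and its single, space-free value.
        System.out.println("-libjars " + join(jars));
    }
}
```

Note also that the WARN line in your log ("Use GenericOptionsParser for parsing the arguments") suggests the generic options may not be parsed by your driver at all, so the separator may not be the only issue.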
Chris

On 12/20/2013 2:44 PM, Sameer Tilak wrote:
> Hi All,
> I am running Hadoop 1.0.3 -- probably will upgrade mid-next year. We
> are using Apache Pig to build our data pipeline and are planning to
> use Apache Mahout for data analysis.
>
> javac -d /apps/analytics/ -classpath
> .:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar:/users/p529444/software/hadoop-1.0.3/hadoop-core-1.0.3.jar:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar:/apps/mahout/trunk/math/target/mahout-math-0.9-SNAPSHOT.jar:/users/p529444/software/hadoop-1.0.3/hadoop-tools-1.0.3.jar:/users/p529444/software/hadoop-1.0.3/lib/commons-logging-1.1.1.jar
> SimpleKMeansClustering.java
>
> jar -cf myanalytics.jar myanalytics/
>
>
> hadoop jar /apps/analytics/myanalytics.jar
> myanalytics.SimpleKMeansClustering -libjars
> /apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar
> /:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar:/apps/mahout/trunk/math/target/mahout-math-0.9-SNAPSHOT.jar
>
> I call the following method in my SimpleKMeansClustering class:
>
>     KMeansDriver.run(conf,
>                      new Path("/scratch/dummyvector.seq"),
>                      new Path("/scratch/dummyvector-initclusters/part-randomSeed/"),
>                      new Path("/scratch/dummyvectoroutput"),
>                      new EuclideanDistanceMeasure(), 0.001, 10,
>                      true, 1.0, false);
>
>
> Unfortunately I get the following error. I think that somehow the jars
> are not made available in the distributed cache. I use Vectors to
> represent my data and I write them to a sequence file. I then use that
> Driver to analyze the file in mapreduce mode. I think all the required
> jar files are available locally, but somehow they are not available in
> mapreduce mode. Any help with this would be great!
>
> 13/12/19 16:59:02 INFO kmeans.KMeansDriver: Input: /scratch/dummyvector.seq Clusters In: /scratch/dummyvector-initclusters/part-randomSeed Out: /scratch/dummyvectoroutput Distance: org.apache.mahout.common.distance.EuclideanDistanceMeasure
> 13/12/19 16:59:02 INFO kmeans.KMeansDriver: convergence: 0.001 max Iterations: 10
> 13/12/19 16:59:02 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 13/12/19 16:59:02 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
> 13/12/19 16:59:02 INFO compress.CodecPool: Got brand-new decompressor
> 13/12/19 16:59:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 13/12/19 16:59:02 INFO input.FileInputFormat: Total input paths to process : 1
> 13/12/19 16:59:03 INFO mapred.JobClient: Running job: job_201311111627_0310
> 13/12/19 16:59:04 INFO mapred.JobClient:  map 0% reduce 0%
> 13/12/19 16:59:19 INFO mapred.JobClient: Task Id : attempt_201311111627_0310_m_000000_0, Status : FAILED
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:264)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
>     at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
>     at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1671)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1613)