Re: libjar and Mahout
In your hadoop command there is a space (and a stray "/") just after .jar in this part:
...-core-0.9-SNAPSHOT.jar /:/apps/mahout/trunk

Should it not be:
...-core-0.9-SNAPSHOT.jar:/apps/mahout/trunk
Chris
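
To illustrate the point about the stray space: the shell splits the command at that space, so only the first jar reaches -libjars and the rest is passed on as a separate argument. Hadoop's generic options also describe -libjars as taking a comma-separated list rather than a colon-separated classpath, so the separators may need changing as well. The sketch below is a hypothetical helper (the class name LibJarsArg and its methods are made up for illustration and are not part of Hadoop or Mahout). It turns a classpath-style string into a single comma-separated token and rejects entries that do not look like jar paths, assuming plain local paths without URI schemes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds the value passed to Hadoop's -libjars option. */
public class LibJarsArg {

    /**
     * Joins jar paths into one comma-separated token.
     * Throws if an entry is empty, contains whitespace, or does not end in ".jar".
     */
    public static String join(List<String> jars) {
        List<String> cleaned = new ArrayList<>();
        for (String jar : jars) {
            String trimmed = jar.trim();
            boolean hasWhitespace = trimmed.chars().anyMatch(Character::isWhitespace);
            if (trimmed.isEmpty() || hasWhitespace || !trimmed.endsWith(".jar")) {
                throw new IllegalArgumentException("Not a usable jar path: '" + jar + "'");
            }
            cleaned.add(trimmed);
        }
        return String.join(",", cleaned);
    }

    /**
     * Converts a colon-separated classpath string into -libjars form.
     * Assumption: entries are plain local paths, so ':' only appears as a separator.
     */
    public static String fromClasspath(String classpath) {
        return join(Arrays.asList(classpath.split(":")));
    }

    public static void main(String[] args) {
        // Jar paths taken from the original command, without the stray " /".
        String value = fromClasspath(
            "/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar"
            + ":/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar"
            + ":/apps/mahout/trunk/math/target/mahout-math-0.9-SNAPSHOT.jar");
        System.out.println("hadoop jar /apps/analytics/myanalytics.jar "
            + "myanalytics.SimpleKMeansClustering -libjars " + value);
    }
}
```

Passing the original string with the " /" still in it to fromClasspath throws an IllegalArgumentException instead of silently dropping jars. Separately, the WARN line in the log about GenericOptionsParser suggests that -libjars may be ignored unless the driver implements Tool and is launched through ToolRunner, so that is worth checking too.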

On 12/20/2013 2:44 PM, Sameer Tilak wrote:
> Hi All,
> I am running Hadoop 1.0.3 -- probably will upgrade mid-next year. We
> are using Apache Pig to build our data pipeline and are planning to
> use Apache Mahout for data analysis.
>
> javac -d /apps/analytics/ -classpath
> .:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar:/users/p529444/software/hadoop-1.0.3/hadoop-core-1.0.3.jar:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar:/apps/mahout/trunk/math/target/mahout-math-0.9-SNAPSHOT.jar:/users/p529444/software/hadoop-1.0.3/hadoop-tools-1.0.3.jar:/users/p529444/software/hadoop-1.0.3/lib/commons-logging-1.1.1.jar
> SimpleKMeansClustering.java
>
> jar -cf myanalytics.jar myanalytics/
>
>
> hadoop jar /apps/analytics/myanalytics.jar
> myanalytics.SimpleKMeansClustering -libjars
> /apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar
> /:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar:/apps/mahout/trunk/math/target/mahout-math-0.9-SNAPSHOT.jar
>
> I call the following method in my SimpleKMeansClustering class:
>
>     KMeansDriver.run(conf,
>                      new Path("/scratch/dummyvector.seq"),
>                      new Path("/scratch/dummyvector-initclusters/part-randomSeed/"),
>                      new Path("/scratch/dummyvectoroutput"),
>                      new EuclideanDistanceMeasure(),
>                      0.001, 10, true, 1.0, false);
>
>
> Unfortunately I get the following error. I think the jars are somehow
> not being made available in the distributed cache. I use Vectors to
> represent my data and write them to a sequence file. I then use the
> driver to analyze that file in MapReduce mode. I think all the required
> jar files are available locally, but somehow they are not available in
> MapReduce mode. Any help with this would be great!
>
> 13/12/19 16:59:02 INFO kmeans.KMeansDriver: Input: /scratch/dummyvector.seq Clusters In: /scratch/dummyvector-initclusters/part-randomSeed Out: /scratch/dummyvectoroutput Distance: org.apache.mahout.common.distance.EuclideanDistanceMeasure
> 13/12/19 16:59:02 INFO kmeans.KMeansDriver: convergence: 0.001 max Iterations: 10
> 13/12/19 16:59:02 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 13/12/19 16:59:02 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
> 13/12/19 16:59:02 INFO compress.CodecPool: Got brand-new decompressor
> 13/12/19 16:59:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 13/12/19 16:59:02 INFO input.FileInputFormat: Total input paths to process : 1
> 13/12/19 16:59:03 INFO mapred.JobClient: Running job: job_201311111627_0310
> 13/12/19 16:59:04 INFO mapred.JobClient:  map 0% reduce 0%
> 13/12/19 16:59:19 INFO mapred.JobClient: Task Id : attempt_201311111627_0310_m_000000_0, Status : FAILED
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:264)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
>     at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
>     at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1671)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1613)
Sameer Tilak 2013-12-23, 20:05
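
When chasing a ClassNotFoundException like the one for org.apache.mahout.math.Vector in the quoted log, it can help to first confirm which jar actually contains the class before deciding what to list in -libjars. The sketch below uses only the JDK's java.util.jar API; the class name JarClassCheck and its methods are hypothetical and not part of Hadoop or Mahout.

```java
import java.io.IOException;
import java.util.jar.JarFile;

/** Hypothetical diagnostic: reports whether a jar contains a given class. */
public class JarClassCheck {

    /** Converts a fully qualified class name such as a.b.C to the jar entry a/b/C.class. */
    public static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for className. */
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java JarClassCheck <class name> <jar> [<jar> ...]");
            return;
        }
        String className = args[0];
        // Check every jar given on the command line and report each result.
        for (int i = 1; i < args.length; i++) {
            boolean found = containsClass(args[i], className);
            System.out.println(args[i] + (found ? " contains " : " does not contain ") + className);
        }
    }
}
```

For example, it could be run as `java JarClassCheck org.apache.mahout.math.Vector` followed by the three Mahout jar paths from the hadoop command. A jar that contains the class locally still has to reach the task JVMs, so this only narrows down which jar must be shipped.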