HDFS >> mail # user >> FileSystem Error


Re: FileSystem Error
Use "hadoop jar" instead of "java -jar".

The hadoop script sets up a proper classpath for you.
On Mar 29, 2013 11:55 PM, "Cyril Bogus" <[EMAIL PROTECTED]> wrote:
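A minimal sketch of the suggested invocation (the jar name and main class below are hypothetical placeholders, not taken from the thread):

```shell
# Running with java -jar uses only the jar's own classpath, so Hadoop's
# configuration directory (core-site.xml etc.) and libraries are not picked up:
#   java -jar DataFileWriter.jar
#
# Running through the hadoop launcher prepends the Hadoop conf directory and
# its libraries to the classpath before invoking the main class:
#   hadoop jar DataFileWriter.jar DataFileWriter
```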

> Hi,
>
> I am running a small Java program that writes a small input data set to
> the Hadoop FileSystem, runs Mahout Canopy and KMeans clustering, and
> then outputs the contents of the data.
>
> In my hadoop.properties I have included the core-site.xml definition so
> that the Java program connects to my single-node setup; all reads and
> writes go through Hadoop rather than the local project file system.
>
> When I run the program, the Canopy (and likewise the KMeans)
> configuration looks up the file on the classpath instead of on the
> Hadoop FileSystem path where the proper files are located.
>
> Is there a problem with the way I have my conf defined?
>
> hadoop.properties:
> fs.default.name=hdfs//mylocation
>
> Program:
>
> // Imports (package names assume Hadoop 1.x and Mahout 0.7 APIs):
> import java.io.File;
> import java.io.FileReader;
> import java.io.IOException;
> import java.util.LinkedList;
> import java.util.List;
> import java.util.Properties;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.io.SequenceFile;
> import org.apache.hadoop.io.Text;
> import org.apache.mahout.clustering.canopy.CanopyDriver;
> import org.apache.mahout.clustering.classify.WeightedVectorWritable;
> import org.apache.mahout.clustering.kmeans.KMeansDriver;
> import org.apache.mahout.common.distance.EuclideanDistanceMeasure;
> import org.apache.mahout.math.DenseVector;
> import org.apache.mahout.math.NamedVector;
> import org.apache.mahout.math.VectorWritable;
>
> public class DataFileWriter {
>
>     private static Properties props = new Properties();
>     private static Configuration conf = new Configuration();
>
>     /**
>      * @param args
>      * @throws ClassNotFoundException
>      * @throws InterruptedException
>      * @throws IOException
>      */
>     public static void main(String[] args) throws IOException,
>             InterruptedException, ClassNotFoundException {
>
>         props.load(new FileReader(new File(
>                 "/home/cyril/workspace/Newer/src/hadoop.properties")));
>
>         FileSystem fs = null;
>         SequenceFile.Writer writer;
>         SequenceFile.Reader reader;
>
>         conf.set("fs.default.name", props.getProperty("fs.default.name"));
>
>         List<NamedVector> vectors = new LinkedList<NamedVector>();
>         NamedVector v1 = new NamedVector(new DenseVector(
>                 new double[] { 0.1, 0.2, 0.5 }), "Hello");
>         vectors.add(v1);
>         v1 = new NamedVector(new DenseVector(
>                 new double[] { 0.5, 0.1, 0.2 }), "Bored");
>         vectors.add(v1);
>         v1 = new NamedVector(new DenseVector(
>                 new double[] { 0.2, 0.5, 0.1 }), "Done");
>         vectors.add(v1);
>
>         // Write the data to a SequenceFile
>         try {
>             fs = FileSystem.get(conf);
>
>             Path path = new Path("testdata_seq/data");
>             writer = new SequenceFile.Writer(fs, conf, path, Text.class,
>                     VectorWritable.class);
>
>             VectorWritable vec = new VectorWritable();
>             for (NamedVector vector : vectors) {
>                 vec.set(vector);
>                 writer.append(new Text(vector.getName()), vec);
>             }
>             writer.close();
>         } catch (Exception e) {
>             System.out.println("ERROR: " + e);
>         }
>
>         Path input = new Path("testdata_seq/data");
>         boolean runSequential = false;
>         Path clustersOut = new Path("testdata_seq/clusters");
>         Path clustersIn = new Path("testdata_seq/clusters/clusters-0-final");
>         double convergenceDelta = 0;
>         double clusterClassificationThreshold = 0;
>         boolean runClustering = true;
>         Path output = new Path("testdata_seq/output");
>         int maxIterations = 12;
>
>         CanopyDriver.run(conf, input, clustersOut,
>                 new EuclideanDistanceMeasure(), 1, 1, 1, 1, 0,
>                 runClustering, clusterClassificationThreshold, runSequential);
>         KMeansDriver.run(conf, input, clustersIn, output,
>                 new EuclideanDistanceMeasure(), convergenceDelta,
>                 maxIterations, runClustering,
>                 clusterClassificationThreshold, runSequential);
>
>         reader = new SequenceFile.Reader(fs,
>                 new Path("testdata_seq/clusteredPoints/part-m-00000"), conf);
>
>         IntWritable key = new IntWritable();
>         WeightedVectorWritable value = new WeightedVectorWritable();
>         while (reader.next(key, value)) {
>             System.out.println(key + ": " + value);
>         }
>         reader.close();
>     }
> }
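The hadoop.properties handling in the program above boils down to standard java.util.Properties; a standalone sketch (the class and helper names below are illustrative, pure JDK, no Hadoop on the classpath):

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Properties;

public class PropsSketch {
    // Illustrative helper: read fs.default.name from a properties file,
    // as the program above does before handing the value to Configuration.
    static String loadFsDefaultName(File file) throws IOException {
        Properties props = new Properties();
        try (FileReader reader = new FileReader(file)) {
            props.load(reader);
        }
        return props.getProperty("fs.default.name");
    }

    public static void main(String[] args) throws IOException {
        // Write a throwaway properties file and read it back.
        File file = File.createTempFile("hadoop", ".properties");
        try (FileWriter writer = new FileWriter(file)) {
            writer.write("fs.default.name=hdfs://localhost:9000\n");
        }
        System.out.println(loadFsDefaultName(file)); // prints hdfs://localhost:9000
        file.delete();
    }
}
```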