Reasons for the misbehavior of the Hadoop program
I wrote a Hadoop program whose mapper receives as input the text file
`hdfs://192.168.1.8:7000/export/hadoop-1.0.1/bin/input/paths.txt`, which contains
local-filesystem paths (identical on all computers of the cluster) written by the
program `./readwritepaths` on a single line and separated by the character `|`.
First the mapper reads the number of slave nodes of the cluster from the
`/usr/countcomputers.txt` file; it equals 2 and, judging by the program output, it
was read correctly. Then the contents of the input file arrive as the value on the
mapper's input, are converted to a string, split on the separator `|`, and the
resulting paths are added to `ArrayList<String> paths`.
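
For illustration only (the real file is produced by `./readwritepaths`, so the names
below are made up), a `paths.txt` line in that format might look like this:

    /home/hadoop/file1.txt|/home/hadoop/file2.txt|/home/hadoop/file3.txt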

 

    package org.myorg;

    import java.io.*;
    import java.util.*;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.util.*;

    public class ParallelIndexation {

        public static class Map extends MapReduceBase implements
                Mapper<LongWritable, Text, Text, LongWritable> {

            private final static LongWritable zero = new LongWritable(0);
            private Text word = new Text();

            public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> output, Reporter reporter)
                    throws IOException {
                String line = value.toString();
                int CountComputers;
                // read the number of slave nodes from a local file
                FileInputStream fstream = new FileInputStream("/usr/countcomputers.txt");
                DataInputStream in = new DataInputStream(fstream);
                BufferedReader br = new BufferedReader(new InputStreamReader(in));
                String result = br.readLine();
                CountComputers = Integer.parseInt(result);
                in.close();
                fstream.close();
                System.out.println("CountComputers=" + CountComputers);
                // split the input line on '|' and collect the paths
                ArrayList<String> paths = new ArrayList<String>();
                StringTokenizer tokenizer = new StringTokenizer(line, "|");
                while (tokenizer.hasMoreTokens()) {
                    paths.add(tokenizer.nextToken());
                }

Then, as a check, I write the values of the `ArrayList<String> paths` elements to the
`/export/hadoop-1.0.1/bin/readpathsfromdatabase.txt` file; its contents are given
below and show that `ArrayList<String> paths` was filled correctly.

 

                // dump the collected paths to a file for verification
                PrintWriter zzz = null;
                try {
                    zzz = new PrintWriter(new FileOutputStream(
                            "/export/hadoop-1.0.1/bin/readpathsfromdatabase.txt"));
                } catch (FileNotFoundException e) {
                    System.out.println("Error");
                    System.exit(0);
                }
                for (int i = 0; i < paths.size(); i++) {
                    zzz.println("paths[" + i + "]=" + paths.get(i) + "\n");
                }
                zzz.close();

Then these paths are concatenated with the character `\n`, and the joined results are
stored in the array `String[] ConcatPaths = new String[CountComputers]`.

 

                String[] ConcatPaths = new String[CountComputers];
                int NumberOfElementConcatPaths = 0;
                if (paths.size() % CountComputers == 0) {
                    // the paths divide evenly among the nodes
                    for (int i = 0; i < CountComputers; i++) {
                        ConcatPaths[i] = paths.get(NumberOfElementConcatPaths);
                        NumberOfElementConcatPaths += paths.size() / CountComputers;
                        for (int j = 1; j < paths.size() / CountComputers; j++) {
                            ConcatPaths[i] += "\n"
                                    + paths.get(i * paths.size() / CountComputers + j);
                        }
                    }
                } else {
                    // the first paths.size() % CountComputers nodes get one extra path
                    NumberOfElementConcatPaths = 0;
                    for (int i = 0; i < paths.size() % CountComputers; i++) {
                        ConcatPaths[i] = paths.get(NumberOfElementConcatPaths);
                        NumberOfElementConcatPaths += paths.size() / CountComputers + 1;
                        for (int j = 1; j < paths.size() / CountComputers + 1; j++) {
                            ConcatPaths[i] += "\n"
                                    + paths.get(i * (paths.size() / CountComputers + 1) + j);
                        }
                    }
                    for (int k = paths.size() % CountComputers; k < CountComputers; k++) {
                        ConcatPaths[k] = paths.get(NumberOfElementConcatPaths);
                        NumberOfElementConcatPaths += paths.size() / CountComputers;
                        for (int j = 1; j < paths.size() / CountComputers; j++) {
                            ConcatPaths[k] += "\n"
                                    + paths.get((k - paths.size() % CountComputers)
                                            * paths.size() / CountComputers
                                            + paths.size() % CountComputers
                                            * (paths.size() / CountComputers + 1)
                                            + j);
                        }
                    }
                }
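
For reference, here is a minimal standalone sketch (separate from the job; the class
name and the example paths are hypothetical) that applies the same splitting rule,
with the first `paths.size() % CountComputers` nodes receiving one extra path, to 5
example paths on 2 nodes and prints the expected contents of `ConcatPaths`:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class ConcatPathsSketch {
        // Same rule as above, written with a running index: contiguous chunks,
        // one extra path for each of the first (paths.size() % countComputers) nodes.
        static String[] concat(List<String> paths, int countComputers) {
            String[] concatPaths = new String[countComputers];
            int next = 0;
            for (int i = 0; i < countComputers; i++) {
                int chunk = paths.size() / countComputers
                        + (i < paths.size() % countComputers ? 1 : 0);
                StringBuilder sb = new StringBuilder();
                for (int j = 0; j < chunk; j++) {
                    if (j > 0) sb.append("\n");
                    sb.append(paths.get(next++));
                }
                concatPaths[i] = sb.toString();
            }
            return concatPaths;
        }

        public static void main(String[] args) {
            List<String> paths = Arrays.asList("/p1", "/p2", "/p3", "/p4", "/p5");
            // expected: node 0 -> "/p1\n/p2\n/p3", node 1 -> "/p4\n/p5"
            System.out.println(Arrays.toString(concat(new ArrayList<String>(paths), 2)));
        }
    }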

I also write the cells of the array `String[] ConcatPaths` to the
`/export/hadoop-1.0.1/bin/concatpaths.txt` file for c