Reasons for the incorrect operation of a Hadoop program
I wrote a Hadoop program whose mapper receives as input the text file
`hdfs://192.168.1.8:7000/export/hadoop-1.0.1/bin/input/paths.txt`. The file
contains, on a single line and separated by the character `|`, local file
system paths (identical on every computer of the cluster) written by the
program `./readwritepaths`. The mapper first reads the number of slave nodes
of the cluster from the file `/usr/countcomputers.txt`; it equals 2 and,
judging by the program's execution, is read correctly. Next, the contents of
the input file arrive as the value on the mapper's input, are converted to a
string, split on the separator `|`, and the resulting paths are added to
`ArrayList<String> paths`.
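
For concreteness, the two input files might look as follows (a hypothetical
example; the real paths on my cluster differ):

    /usr/countcomputers.txt (the number of slave nodes):
    2

    paths.txt (one line of '|'-separated local paths):
    /home/hadoop/data/file1.txt|/home/hadoop/data/file2.txt|/home/hadoop/data/file3.txt|/home/hadoop/data/file4.txt|/home/hadoop/data/file5.txt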

 

    package org.myorg;

    import java.io.*;
    import java.util.*;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.util.*;

    public class ParallelIndexation {
        public static class Map extends MapReduceBase implements
                Mapper<LongWritable, Text, Text, LongWritable> {
            private final static LongWritable zero = new LongWritable(0);
            private Text word = new Text();

            public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> output, Reporter reporter)
                    throws IOException {
                String line = value.toString();
                int CountComputers;
                // Read the number of slave nodes from a local file.
                FileInputStream fstream = new FileInputStream("/usr/countcomputers.txt");
                DataInputStream in = new DataInputStream(fstream);
                BufferedReader br = new BufferedReader(new InputStreamReader(in));
                String result = br.readLine();
                CountComputers = Integer.parseInt(result);
                in.close();
                fstream.close();
                System.out.println("CountComputers=" + CountComputers);
                // Split the input line on '|' and collect the paths.
                ArrayList<String> paths = new ArrayList<String>();
                StringTokenizer tokenizer = new StringTokenizer(line, "|");
                while (tokenizer.hasMoreTokens()) {
                    paths.add(tokenizer.nextToken());
                }
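
As an aside, the same tokenization could be written more compactly with
`String.split` (a sketch of an equivalent formulation, not what the cluster
actually runs; note that `StringTokenizer` skips empty tokens, while `split`
would keep interior empty strings):

        // Equivalent splitting on '|' (regex-escaped); needs java.util.Arrays.
        ArrayList<String> paths =
                new ArrayList<String>(Arrays.asList(line.split("\\|")));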

Then, as a check, I write the values of the `ArrayList<String> paths` elements
to the file `/export/hadoop-1.0.1/bin/readpathsfromdatabase.txt`, whose
contents are given below and confirm that `ArrayList<String> paths` was filled
correctly.

 

                PrintWriter zzz = null;
                try {
                    zzz = new PrintWriter(new FileOutputStream(
                            "/export/hadoop-1.0.1/bin/readpathsfromdatabase.txt"));
                } catch (FileNotFoundException e) {
                    System.out.println("Error");
                    System.exit(0);
                }
                for (int i = 0; i < paths.size(); i++) {
                    zzz.println("paths[" + i + "]=" + paths.get(i) + "\n");
                }
                zzz.close();
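
With the hypothetical five-path input above, the verification file would look
roughly like this (each entry is followed by a blank line, since `println` is
combined with an explicit `\n`):

        paths[0]=/home/hadoop/data/file1.txt

        paths[1]=/home/hadoop/data/file2.txt

        ...

        paths[4]=/home/hadoop/data/file5.txt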

Then these paths are concatenated, joined by the character `\n`, and the
combined results are recorded in the array `String[] ConcatPaths = new
String[CountComputers]`.

 

                // Distribute the paths over CountComputers groups, joining the
                // members of each group with '\n'.
                String[] ConcatPaths = new String[CountComputers];
                int NumberOfElementConcatPaths = 0;
                if (paths.size() % CountComputers == 0) {
                    // Even split: every node gets paths.size() / CountComputers paths.
                    for (int i = 0; i < CountComputers; i++) {
                        ConcatPaths[i] = paths.get(NumberOfElementConcatPaths);
                        NumberOfElementConcatPaths += paths.size() / CountComputers;
                        for (int j = 1; j < paths.size() / CountComputers; j++) {
                            ConcatPaths[i] += "\n"
                                    + paths.get(i * paths.size() / CountComputers + j);
                        }
                    }
                } else {
                    // Uneven split: the first paths.size() % CountComputers nodes
                    // get one extra path each.
                    NumberOfElementConcatPaths = 0;
                    for (int i = 0; i < paths.size() % CountComputers; i++) {
                        ConcatPaths[i] = paths.get(NumberOfElementConcatPaths);
                        NumberOfElementConcatPaths += paths.size() / CountComputers + 1;
                        for (int j = 1; j < paths.size() / CountComputers + 1; j++) {
                            ConcatPaths[i] += "\n"
                                    + paths.get(i * (paths.size() / CountComputers + 1) + j);
                        }
                    }
                    for (int k = paths.size() % CountComputers; k < CountComputers; k++) {
                        ConcatPaths[k] = paths.get(NumberOfElementConcatPaths);
                        NumberOfElementConcatPaths += paths.size() / CountComputers;
                        for (int j = 1; j < paths.size() / CountComputers; j++) {
                            ConcatPaths[k] += "\n"
                                    + paths.get((k - paths.size() % CountComputers)
                                            * paths.size() / CountComputers
                                            + paths.size() % CountComputers
                                            * (paths.size() / CountComputers + 1)
                                            + j);
                        }
                    }
                }
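
To make the intended distribution concrete, the grouping logic can be
exercised outside Hadoop with a small standalone sketch (hypothetical class
name and sample data; it reproduces the intended result of the code above
rather than the code itself):

        // PartitionSketch.java -- standalone illustration of the grouping logic.
        import java.util.*;

        public class PartitionSketch {
            public static void main(String[] args) {
                List<String> paths = Arrays.asList("p1", "p2", "p3", "p4", "p5");
                int CountComputers = 2;
                String[] ConcatPaths = new String[CountComputers];
                int next = 0; // index of the first path of the current group
                for (int i = 0; i < CountComputers; i++) {
                    // The first paths.size() % CountComputers groups receive
                    // one extra path each.
                    int size = paths.size() / CountComputers
                            + (i < paths.size() % CountComputers ? 1 : 0);
                    StringBuilder sb = new StringBuilder(paths.get(next));
                    for (int j = 1; j < size; j++) {
                        sb.append("\n").append(paths.get(next + j));
                    }
                    ConcatPaths[i] = sb.toString();
                    next += size;
                }
                // Prints "p1\np2\np3" for node 0 and "p4\np5" for node 1.
                for (String group : ConcatPaths) {
                    System.out.println(group + "\n---");
                }
            }
        }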

For checking, I likewise write the cells of the array `String[] ConcatPaths`
to the file `/export/hadoop-1.0.1/bin/concatpaths.txt`.