Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
HDFS >> mail # user >> Want to Sort the values in one line using map reduce


Copy link to this message
-
Want to Sort the values in one line using map reduce
Hi,

*I have input file and my data looks like:*

 date  country  city pagePath visits  20120301  India Ahmedabad / 1
20120302  India Ahmedabad /gtuadmissionhelpline-team 1  20120302  India
Mumbai / 1  20120302  India Mumbai /merit-calculator 1

* I wrote the map and reduce application to convert it into page_url by
city:*
*
*
*
*
*
*
*
*
package data.ga;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
public class pharmecy
{
public static class MapClass extends MapReduceBase implements
Mapper<LongWritable,Text,Text,Text>
{
Text k = new Text();
Text v = new Text();
 public void map(LongWritable key,Text
value,OutputCollector<Text,Text>output,Reporter reporter) throws IOException
{
try
{
String[] line = value.toString().split(",",5);
 String city = String.valueOf(line[2]);
String url = String.valueOf(line[3]);
 k.set(city);
v.set(url);
 output.collect(k, v);
}
catch(Exception e)
{
System.out.println(e);
}
 }
}
 public static class ReduceClass extends MapReduceBase implements Reducer
<Text,Text,Text,Text>
{
Text v = new Text();
 public void reduce(Text key,Iterator<Text>
values,OutputCollector<Text,Text>output,Reporter reporter) throws
IOException
{

 while(values.hasNext())

{
String val=values.next().toString();
 v.set(val);
 output.collect(key,v);
 }
 }
 public static void main(String[] args) {
JobClient client = new JobClient();
JobConf conf = new JobConf(data.ga.pharmecy.class);

conf.setMapOutputKeyClass(Text.class);
conf.setMapOutputValueClass(Text.class);
// TODO: specify output types
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);

FileInputFormat.setInputPaths(conf, new
Path("hdfs://localhost:54310/user/manish/gadatainput/pharmecydata.txt"));
FileOutputFormat.setOutputPath(conf, new
Path("hdfs://localhost:54310/user/manish/gadataoutput11"));

conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
 conf.setMapperClass(MapClass.class);
conf.setReducerClass(ReduceClass.class);
 client.setConf(conf);
try {
JobClient.runJob(conf);
} catch (Exception e) {
e.printStackTrace();
}
}

}
}

*Output:*
*
*
*
*
*#city*                 * #pagepath*
"Aachen" "/medicalcollege/m-p-shah-medical-college"
"Abbottabad" "/merit-calculator"
"Abbottabad" "/merit-calculator"
"Abidjan"
"/pharmacycollege/shree-swaminarayan-pharmacy-college-kevadiya-colony"
"Abidjan"
"/pharmacycollege/amruta-college-of-pharmacy-research-institute-gandhinagar"
*My question is:*

I want to convert this output in below format::

#city                        #pagepath
city1                        url1,url2,url3
city2                        url1,url2,url3

Is it possible to convert it in this format using map and reduce ???

If yes then how??

--
MANISH DUNANI
-THANX
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB