MapReduce, mail # user - using sequencefile generated by Sqoop in Mapreduce

RE: using sequencefile generated by Sqoop in Mapreduce
Kartashov, Andy 2012-10-09, 20:58
Gents, please ignore my message below. Everything works like a charm.

conf.setInputFormat(SequenceFileInputFormat.class) indeed works well with the Sqoop-generated class.

The reason why I was getting only the last line in my output is that I failed to notice I was using fs.create() instead of fs.append(). *blush*
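For anyone hitting the same symptom: create() truncates an existing file on every open, while append() extends it, so re-opening with create() leaves only the last record. The Hadoop FileSystem API isn't reproduced here; this is a minimal plain-JDK sketch of the same overwrite-vs-append distinction (file name and records are made up):

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CreateVsAppend {
    public static void main(String[] args) throws IOException {
        String path = "records.txt"; // hypothetical output file

        // Like fs.create(): each open truncates the file, so only the last write survives.
        for (String rec : new String[] {"rec1", "rec2", "rec3"}) {
            try (FileWriter w = new FileWriter(path, false)) { // false = overwrite
                w.write(rec + "\n");
            }
        }
        System.out.println(Files.readAllLines(Paths.get(path))); // [rec3]

        // Like fs.append(): open in append mode and every write survives.
        Files.deleteIfExists(Paths.get(path));
        for (String rec : new String[] {"rec1", "rec2", "rec3"}) {
            try (FileWriter w = new FileWriter(path, true)) { // true = append
                w.write(rec + "\n");
            }
        }
        System.out.println(Files.readAllLines(Paths.get(path))); // [rec1, rec2, rec3]
    }
}
```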

Andy Kartashov
Architecture R&D, Co-op
1340 Pickering Parkway, Pickering, L1V 0C4
* Phone : (905) 837 6269
* Mobile: (416) 722 1787

From: Kartashov, Andy
Sent: Tuesday, October 09, 2012 1:19 PM
Subject: using sequencefile generated by Sqoop in Mapreduce


I am having trouble using a sequence file in MapReduce. The only output I get is the very last record.

I am creating the sequence file while importing a MySQL table into Hadoop using:
$ sqoop import ... --as-sequencefile

I am then trying to read this file in the mapper, creating keys from each object's id and using the objects themselves (one per table record, with their attributes) as values.
In the reducer I iterate over those objects and write each object's attributes to a .txt file.

My Mapreduce code:

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        conf.setInputFormat(SequenceFileInputFormat.class);
        conf.setOutputKeyClass(Text.class);          // one of the fields of the exported table, say ids
        conf.setOutputValueClass(Sqoop_class.class); // the class Sqoop generated during import
        conf.setOutputFormat(NullOutputFormat.class); // actual output goes to a .txt file
        conf.setMapperClass(MyMapper.class);
        conf.setReducerClass(MyReducer.class);
        JobClient.runJob(conf);
    } // end of main()
//===================================================================================================================
    public static class MyMapper extends MapReduceBase implements Mapper<LongWritable, Sqoop_class, Text, Sqoop_class> {
        public void map(LongWritable key, Sqoop_class value, OutputCollector<Text, Sqoop_class> output, Reporter reporter) throws IOException {
            output.collect(new Text(value.get_foo_id().toString()), value);
        } // end of map()
    } // end of static class MyMapper

//===================================================================================================================
    public static class MyReducer extends MapReduceBase implements Reducer<Text, Sqoop_class, Text, Sqoop_class> {

        public void reduce(Text key, Iterator<Sqoop_class> values, OutputCollector<Text, Sqoop_class> output, Reporter reporter) throws IOException {
            while (values.hasNext()) {
                Sqoop_class value = values.next();
                output.collect(key, value); // output is to Null...
                out.writeBytes(value.get_foo_name() + "!\n"); // out is the stream opened on the .txt file
            } // end of while loop
        } // end of reduce()
    } // end of static class MyReducer

Would the Mapper not create a key from each Sqoop_class id value, with the value being the Sqoop_class instance itself?
Then we group those instances in the reducer and iterate through them, retrieving the attribute values from each object.
Somehow only the values of the very last instance get written.
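That grouping intuition is right: the shuffle collects all values emitted for a key before each reduce() call. A plain-JDK sketch of that grouping (hypothetical ids and names, no Hadoop classes), just to show that every record reaches the reducer, not only the last one:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShuffleSketch {
    public static void main(String[] args) {
        // Each record is {id, name}, like one row of the imported table.
        String[][] records = {{"1", "foo"}, {"2", "bar"}, {"1", "baz"}};

        // Map phase: emit (id, record); the shuffle groups values by key.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] rec : records) {
            grouped.computeIfAbsent(rec[0], k -> new ArrayList<>()).add(rec[1]);
        }

        // Reduce phase: iterate each key's values; every record appears once.
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        // prints:
        // 1 -> [foo, baz]
        // 2 -> [bar]
    }
}
```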

Should not conf.setInputFormat(SequenceFileInputFormat.class); and
Sqoop_class.class work together for reading from the sequence file?

From: nagarjuna kanamarlapudi [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 09, 2012 11:03 AM
Subject: Re: Hive-Site XML changing any property.

Restarting the Hive server should solve your problem.

I am not sure if there is any way of starting the Hive server other than:
1. bin/hive --service hiveserver

2. HIVE_PORT=xxxx ./hive --service hiveserver

On Tue, Oct 9, 2012 at 8:10 PM, Uddipan Mukherjee <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
Hi hadoop, hive gurus,

    I have a requirement to change the path of Hive's scratch folder. Hence I have added the following property to hive-site.xml and changed its value as required:

  <description>Scratch space for Hive jobs</description>
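For context, a scratch-dir override in hive-site.xml normally takes this shape. The property name hive.exec.scratchdir is assumed here from the description quoted above, and the path is a made-up example:

```xml
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive-scratch</value> <!-- example path; substitute your own -->
  <description>Scratch space for Hive jobs</description>
</property>
```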

But it is still not taking effect. Do I need to restart the Hive server for it to read the updated value from the file?

Also, is there any way other than restarting the Hive server?

Any pointers will be helpful.

Thanks And Regards
Uddipan Mukherjee