Hadoop, mail # user - Map/Reduce and sequence file metadata...


Map/Reduce and sequence file metadata...
Andy Sautins 2009-10-01, 16:10

   Hi all. I'm struggling a bit to figure this out and am wondering if anyone has any pointers.

   I'm using SequenceFiles as output from a MapReduce job (using SequenceFileOutputFormat) and then reading the results in a follow-up MapReduce job using SequenceFileInputFormat. All of that seems to work fine. What I haven't figured out is how to write the SequenceFile.Metadata through SequenceFileOutputFormat and then read that metadata through SequenceFileInputFormat. Is that possible using the new mapreduce.* API?

   I have two types of files I want to process in the Mapper. Currently I'm calling context.getInputSplit() and parsing the resulting fileSplit.getPath() to determine which file I'm processing. It seems cleaner to use the SequenceFile.Metadata if I can. Does that make sense, or am I off in the weeds?
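
   For reference, the path-based approach I'm describing looks roughly like the sketch below. It is only an illustration: the directory names "orders" and "customers", the InputKind enum, and the InputKindResolver class are hypothetical stand-ins, and the code works on a plain path string so it runs without Hadoop on the classpath. In a real Mapper the string would presumably come from something like ((FileSplit) context.getInputSplit()).getPath().toString().

```java
import java.util.Optional;

public class InputKindResolver {

    // Hypothetical labels for the two kinds of input files.
    enum InputKind { ORDERS, CUSTOMERS }

    // Decide which kind of file a split belongs to by looking at the name
    // of the directory that directly contains it. The directory names used
    // here are assumptions made for this sketch.
    static Optional<InputKind> resolve(String splitPath) {
        String[] parts = splitPath.split("/");
        if (parts.length < 2) {
            // No parent directory to inspect.
            return Optional.empty();
        }
        String parent = parts[parts.length - 2];
        switch (parent) {
            case "orders":
                return Optional.of(InputKind.ORDERS);
            case "customers":
                return Optional.of(InputKind.CUSTOMERS);
            default:
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Example path strings; the host and directory layout are made up.
        System.out.println(resolve("hdfs://nn/data/orders/part-r-00000"));
        System.out.println(resolve("hdfs://nn/data/customers/part-r-00003"));
        System.out.println(resolve("hdfs://nn/data/other/part-r-00000"));
    }
}
```

   The drawback is that the Mapper's behavior is tied to the directory layout, which is why storing the file type in the file itself seems cleaner to me.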

   Thanks

   Andy
Tom White 2009-10-02, 09:25
Andy Sautins 2009-10-02, 20:48