Map/Reduce and sequence file metadata...

   Hi all. I'm struggling a bit to figure this out and am wondering if anyone has any pointers.

   I'm using SequenceFiles as the output of a MapReduce job (using SequenceFileOutputFormat) and then reading the results in a follow-up MapReduce job using SequenceFileInputFormat. That all seems to work fine. What I haven't figured out is how to write SequenceFile.Metadata through SequenceFileOutputFormat and then read that metadata back through SequenceFileInputFormat. Is that possible using the new mapreduce.* API?

   I have two types of files I want to process in the Mapper. Currently I call context.getInputSplit() and parse the resulting fileSplit.getPath() to determine which file I'm processing. It seems cleaner to use SequenceFile.Metadata if I can. Does that make sense, or am I off in the weeds?
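
   For reference, the path-based check I'm doing today looks roughly like the sketch below. This is only an illustration: the class name InputKindSketch and the directory names typeA and typeB are made up, and it assumes each of the two file types lives in its own directory. In the real Mapper the string would come from ((FileSplit) context.getInputSplit()).getPath().toString().

```java
// Hypothetical helper: decides which kind of input a split belongs to,
// using only the split's path string.
final class InputKindSketch {

    enum Kind { TYPE_A, TYPE_B, UNKNOWN }

    // Assumed layout (not from any real job): files of each type sit in
    // their own directory, e.g. ".../typeA/part-r-00000" and
    // ".../typeB/part-r-00000". The parent directory name selects the kind.
    static Kind classify(String splitPath) {
        int slash = splitPath.lastIndexOf('/');
        if (slash <= 0) {
            return Kind.UNKNOWN; // no parent directory to inspect
        }
        String parent = splitPath.substring(0, slash);
        String dir = parent.substring(parent.lastIndexOf('/') + 1);
        switch (dir) {
            case "typeA": return Kind.TYPE_A;
            case "typeB": return Kind.TYPE_B;
            default:      return Kind.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the path string obtained from the input split.
        System.out.println(classify("hdfs://nn/out/typeA/part-r-00000"));
        System.out.println(classify("hdfs://nn/out/typeB/part-r-00003"));
        System.out.println(classify("part-r-00000"));
    }
}
```

   Since a map task handles a single split, the result could be computed once in setup() and cached in a field rather than recomputed in every map() call. The drawback is that the file type is implied by the directory layout instead of being recorded in the file itself, which is why the metadata approach looks cleaner to me.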