Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hadoop >> mail # user >> Map/Reduce and sequence file metadata...


Copy link to this message
-
Map/Reduce and sequence file metadata...

   Hi all. I'm struggling a bit to figure this out and wondering if anyone had any  pointers.

   I'm using SequenceFiles as output from a MapReduce job ( using SequenceFileOutputFormat ) and then in a followup MapReduce job reading in the results using SequenceFileInputFormat.  All seems to work fine.  What I haven't figured out is how to write the SequenceFile.Metadata in the SequenceFileOutputFormat and then read the metadata in SequenceFileInputFormat.  Is that possible to do using the new mapreduce.* API?

   I have two types of files I want to process in the Mapper.  Currently I'm using the  context.getInputSplit() and parsing the resulting fileSplit.getPath() to determine what file I'm processing.  It seems cleaner to use the SequenceFile.Metadata if I can.  Does that make sense or am I off in the weeds?

   Thanks

   Andy
+
Tom White 2009-10-02, 09:25
+
Andy Sautins 2009-10-02, 20:48
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB