Re: questions regarding data storage and inputformat
You could either use a custom RecordReader or you could override the
run() method on your Mapper class to do the merging before calling the
map() method.
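
To make the second option concrete, here is a rough sketch of the "merge first, then map once" pattern. It is plain Java with no Hadoop dependency so it can run anywhere: RecordSource, sourceOf() and the class name are hypothetical stand-ins, not Hadoop classes. The three RecordSource methods are named after the calls the default Mapper.run(Context) loop makes on its context (nextKeyValue(), getCurrentKey(), getCurrentValue()). In a real job the same loop would go into your Mapper subclass's run(Context) override, and it assumes each file reaches a single mapper unsplit and fits in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergingRunSketch {

    /** Hypothetical stand-in for the part of Mapper.Context that run() reads from. */
    interface RecordSource {
        boolean nextKeyValue();
        String getCurrentKey();
        String getCurrentValue();
    }

    /** Wraps an in-memory map so the sketch runs without a cluster. */
    static RecordSource sourceOf(Map<String, String> records) {
        Iterator<Map.Entry<String, String>> it = records.entrySet().iterator();
        return new RecordSource() {
            private Map.Entry<String, String> current;

            public boolean nextKeyValue() {
                if (!it.hasNext()) {
                    return false;
                }
                current = it.next();
                return true;
            }

            public String getCurrentKey() { return current.getKey(); }

            public String getCurrentValue() { return current.getValue(); }
        };
    }

    /** Drains the whole input first, then calls map() once with everything. */
    static List<String> run(RecordSource source) {
        List<String> keys = new ArrayList<>();
        List<String> values = new ArrayList<>();
        while (source.nextKeyValue()) {
            keys.add(source.getCurrentKey());
            values.add(source.getCurrentValue());
        }
        List<String> output = new ArrayList<>();
        if (!keys.isEmpty()) {
            map(keys, values, output);
        }
        return output;
    }

    /** Placeholder merge logic: joins all keys and all values into one record. */
    static void map(List<String> keys, List<String> values, List<String> output) {
        output.add(String.join("+", keys) + "=" + String.join(",", values));
    }

    public static void main(String[] args) {
        Map<String, String> file = new LinkedHashMap<>();
        file.put("k1", "a");
        file.put("k2", "b");
        file.put("k3", "c");
        System.out.println(run(sourceOf(file))); // prints [k1+k2+k3=a,b,c]
    }
}
```

The custom RecordReader route moves the same buffering loop into the reader, so that map() receives one already-merged record and the Mapper itself stays unchanged.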


On Wed, Jul 27, 2011 at 11:09 AM, Tom Melendez <[EMAIL PROTECTED]> wrote:
>>> 3. Another idea might be to create separate seq files for each chunk of
>>> records and make them non-splittable, ensuring that they go to a
>>> single mapper.  Assuming I can get away with this, see any pros/cons
>>> with that approach?
>> Separate sequence files would require the least amount of custom code.
> Thanks for the response, Joey.
> So, if I were to do the above, I would still need a custom record
> reader to put all the keys and values together, right?
> Thanks,
> Tom
> --
> ==================> Skybox is hiring.
> http://www.skyboximaging.com/careers/jobs

Joseph Echeverria
Cloudera, Inc.