

Re: questions regarding data storage and inputformat
You could either use a custom RecordReader or you could override the
run() method on your Mapper class to do the merging before calling the
map() method.
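
As a rough illustration of the second option, here is a minimal sketch in plain Java. It does not use the Hadoop classes: SketchContext and MergingMapper are hypothetical stand-ins that only mirror the shape of Mapper.run() in the org.apache.hadoop.mapreduce API, where the default run() loops over context.nextKeyValue() and calls map() once per record. The overridden run() below buffers every record first and then calls map() a single time. It assumes all records of one file fit in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hadoop task context (NOT the Hadoop API).
// It hands out records one at a time, like nextKeyValue()/getCurrentKey().
class SketchContext {
    private final Iterator<Map.Entry<String, String>> records;
    private Map.Entry<String, String> current;
    final List<String> output = new ArrayList<>();

    SketchContext(List<Map.Entry<String, String>> input) {
        this.records = input.iterator();
    }

    // Advances to the next record; returns false when the input is exhausted.
    boolean nextKeyValue() {
        if (!records.hasNext()) {
            return false;
        }
        current = records.next();
        return true;
    }

    String getCurrentKey() { return current.getKey(); }
    String getCurrentValue() { return current.getValue(); }

    // Collects emitted output so it can be inspected afterwards.
    void write(String line) { output.add(line); }
}

// Hypothetical mapper whose run() merges all records before calling map().
class MergingMapper {
    // Called once with every key and value read from the input.
    void map(List<String> keys, List<String> values, SketchContext context) {
        context.write(String.join(",", keys) + " -> " + String.join(",", values));
    }

    // A default run() would call map() per record; this one buffers first.
    void run(SketchContext context) {
        List<String> keys = new ArrayList<>();
        List<String> values = new ArrayList<>();
        while (context.nextKeyValue()) {
            keys.add(context.getCurrentKey());
            values.add(context.getCurrentValue());
        }
        if (!keys.isEmpty()) {
            map(keys, values, context);
        }
    }
}
```

In a real job the same loop would live in a subclass of Mapper, with setup() and cleanup() called around it. Hadoop reuses the key and value objects between records, so each one would need to be copied before it is added to the buffer.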

-Joey

On Wed, Jul 27, 2011 at 11:09 AM, Tom Melendez <[EMAIL PROTECTED]> wrote:
>>
>>> 3. Another idea might be to create separate seq files for chunks of
>>> records and make them non-splittable, ensuring that they go to a
>>> single mapper.  Assuming I can get away with this, see any pros/cons
>>> with that approach?
>>
>> Separate sequence files would require the least amount of custom code.
>>
>
> Thanks for the response, Joey.
>
> So, if I were to do the above, I would still need a custom record
> reader to put all the keys and values together, right?
>
> Thanks,
>
> Tom
>
> --
> ==================> Skybox is hiring.
> http://www.skyboximaging.com/careers/jobs
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434