Re: Streaming value of (200MB) from a SequenceFile
Rahul Bhattacharjee 2013-04-01, 03:32
I am also new to Hadoop and have a question here.
A Writable does receive a DataInput so that the object can be
constructed from the byte stream.
Are you suggesting saving that stream for later use? If so, we cannot
later ascertain the state of the stream.
For a large value, I think we can extract just the useful part and emit
it from the mapper; we might also write a custom input format to do
this, so that the large value never even reaches the mapper.
Am I missing anything here?
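To make the idea concrete, here is a minimal sketch of such a Writable: during readFields() it keeps only a bounded "useful" prefix of a large value and skips the rest, so the whole value never sits in memory. The class name, the length-prefixed wire format, and the 64 KB prefix size are all assumptions for illustration; in a real job the class would implement org.apache.hadoop.io.Writable (that import is left out here so the sketch compiles standalone).

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical sketch: keep only a bounded prefix of a large value
// during deserialization and discard the rest without buffering it.
// In a real job this would implement org.apache.hadoop.io.Writable.
public class PrefixValue {
    // Assumption: the first 64 KB is the part the mapper actually needs.
    static final int PREFIX_SIZE = 64 * 1024;

    private byte[] prefix = new byte[0];

    public void readFields(DataInput in) throws IOException {
        // Assumed wire format: an int length followed by that many bytes.
        int length = in.readInt();
        int keep = Math.min(length, PREFIX_SIZE);
        prefix = new byte[keep];
        in.readFully(prefix);               // read only the useful prefix
        int toSkip = length - keep;
        while (toSkip > 0) {                // skip the remainder without buffering
            int skipped = in.skipBytes(toSkip);
            if (skipped <= 0) {
                throw new IOException("could not skip remaining value bytes");
            }
            toSkip -= skipped;
        }
    }

    public void write(DataOutput out) throws IOException {
        out.writeInt(prefix.length);
        out.write(prefix);
    }

    public byte[] getPrefix() {
        return prefix;
    }
}
```

Pushing this into a custom RecordReader/InputFormat would go one step further, as suggested above: the trimming happens before the framework ever hands the value to the mapper.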
On Sat, Mar 30, 2013 at 11:22 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> Hi everyone,
> I'm having a problem streaming individual key-value pairs of 200MB to
> 1GB in size from a MapFile.
> I need to stream the large value to an output stream instead of reading
> the entire value before processing, because reading it all potentially
> uses too much memory.
> I have read the MapFile API; next(WritableComparable key, Writable val)
> does not return an input stream.
> How can I accomplish this?