

Re: Streaming value of (200MB) from a SequenceFile
Hi Sandy:

Thank you for the advice. It sounds like a logical way to resolve this issue. I will look into the Writable interface and see how I can stream the value from HDFS in a MapFileInputFormat.
I'm a bit concerned that no one has discussed this issue before, because it might mean I'm not using HDFS the right way.

Regards,

Jerry

On 2013-03-31, at 14:10, Sandy Ryza <[EMAIL PROTECTED]> wrote:

> Hi Jerry,
>
> I assume you're providing your own Writable implementation? The Writable readFields method is given a stream.  Are you able to perform your processing while reading the value from that stream there?
>
> -Sandy
>
> On Sat, Mar 30, 2013 at 10:52 AM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> Hi everyone,
>
> I'm having a problem streaming individual key-value pairs of 200MB to 1GB from a MapFile.
> I need to stream the large value to an OutputStream instead of reading the entire value before processing, because reading it whole potentially uses too much memory.
>
> I read the API for MapFile; next(WritableComparable key, Writable val) does not return an input stream.
>
> How can I accomplish this?
>
> Thanks,
>
> Jerry
>
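A minimal sketch of the approach Sandy describes: doing the processing inside readFields, so the large value is consumed from the stream in fixed-size chunks instead of being materialized in memory. To keep it self-contained this uses plain JDK DataInput/DataOutput rather than Hadoop classes (org.apache.hadoop.io.Writable declares the same write(DataOutput)/readFields(DataInput) pair); the record layout (an int length prefix followed by raw bytes), the chunk size, and the running-checksum "processing" are illustrative assumptions, not anything from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical value type that processes its payload while deserializing.
// In Hadoop this class would implement org.apache.hadoop.io.Writable,
// whose readFields(DataInput) gives exactly this kind of stream access.
class StreamingValue {
    private static final int CHUNK = 64 * 1024; // assumed chunk size
    private long checksum; // stand-in for real per-value processing

    // Serialize as: int length, then the raw bytes.
    public void write(DataOutput out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Consume the value chunk by chunk: memory stays at CHUNK bytes
    // no matter how large the serialized value is.
    public void readFields(DataInput in) throws IOException {
        int remaining = in.readInt();
        byte[] buf = new byte[CHUNK];
        checksum = 0;
        while (remaining > 0) {
            int n = Math.min(CHUNK, remaining);
            in.readFully(buf, 0, n);
            for (int i = 0; i < n; i++) {
                checksum += buf[i] & 0xFF; // do the work here, per chunk
            }
            remaining -= n;
        }
    }

    public long getChecksum() {
        return checksum;
    }
}
```

The reason this helps with MapFile: MapFile.Reader.next(key, val) does not hand back a stream, but it does deserialize the value by calling the value's readFields, so pushing the processing into readFields lets the work happen while the bytes flow past, without the value ever being held whole in memory.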