Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
MapReduce >> mail # user >> Large static structures in M/R heap


Copy link to this message
-
Re: Large static structures in M/R heap
On 02/27/2013 01:42 PM, Adam Phelps wrote:
> We have a job that uses a large lookup structure that gets created as a
> static class during the map setup phase (and we have the JVM reused so
> this only takes place once).  However of late this structure has grown
> drastically (due to items beyond our control) and we've seen a
> substantial increase in map time due to the lower available memory.
>
> Are there any easy solutions to this sort of problem?  My first thought
> was to see if it was possible to have all tasks for a job execute in
> parallel within the same JVM, but I'm not seeing any setting that would
> allow that.  Beyond that my only ideas are to move that data into an
> external one-per-node key-value store like memcached, but I'm worried
> the additional overhead of sending a query for each value being mapped
> would also kill the job performance.
>
> - Adam
>

We use a similar solution to what you suggested to address this issue.
Though, the in-memory app we run on each datanode is a proprietary one
which allows for pipelineing of queries, and obviously helps optimize this.

Still, even using off-the-shelf memcached, and incurring the overhead of
query-per-value, speed might work out to be more acceptable on this than
you think.  Maybe give it a test in the small to benchmark first.

HTH,

DR
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB