Pig >> mail # user >> Using Distributed Cache in PIG


Re: Using Distributed Cache in PIG
You are talking about changing the way Hadoop itself works; a change
like that would be transparent to Pig.

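Note that the Hadoop Distributed Cache != "distributed memory cache".
It copies read-only files to each task node's local disk before the
tasks start, so it is not a place to keep output produced during the
map phase.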

I suppose you could change the value of fs.file.impl from
org.apache.hadoop.fs.LocalFileSystem to something else, but that might
be quite an endeavor.
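
For what it's worth, the override itself would be a single property, something like the rough sketch below. The class name com.example.fs.CachingLocalFileSystem is purely hypothetical (nothing like it ships with Hadoop); you would have to write it yourself, most likely as a subclass of LocalFileSystem, and put it on the classpath of every node. I have not tried this.

```xml
<!-- core-site.xml fragment (sketch only, untested) -->
<property>
  <name>fs.file.impl</name>
  <!-- Stock value is org.apache.hadoop.fs.LocalFileSystem. -->
  <!-- The class below is a hypothetical placeholder, not part of Hadoop. -->
  <value>com.example.fs.CachingLocalFileSystem</value>
</property>
```

The property is the easy part; the work is in writing a file system implementation that actually behaves correctly for everything Hadoop does with local files.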

D

On Sun, Aug 12, 2012 at 8:36 PM, kapil bhosale <[EMAIL PROTECTED]> wrote:
> Hello,
> Can we use the Distributed Cache to store intermediate results after the
> Map phase, so that the Reduce phase can read them from the cache and
> thereby improve the performance of the Map-Reduce job?
>
> I found a paper on the use of a cache in Map-Reduce:
> http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5395321&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F5394475%2F5394991%2F05395321.pdf%3Farnumber%3D5395321
>
> If Hadoop Map-Reduce can be improved with a cache, then ultimately Pig
> scripts running on Map-Reduce can be improved too.
>
> Thanks
> Kapil