Pig, mail # user - Using Distributed Cache in PIG


Re: Using Distributed Cache in PIG
Dmitriy Ryaboy 2012-08-13, 23:49
You are talking about changing the way Hadoop itself works; a change
like that would be transparent to Pig.

Note that Hadoop Distributed Cache != "distributed memory cache".

I suppose you could change the value of fs.file.impl from
org.apache.hadoop.fs.LocalFileSystem to something else, but that might
be quite an endeavor.
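
For illustration only, here is a rough sketch of what overriding
fs.file.impl might look like in core-site.xml. The class name
com.example.fs.CachingLocalFileSystem is hypothetical; nothing like it
ships with Hadoop, and you would have to write it yourself as a subclass
of org.apache.hadoop.fs.FileSystem and put it on the classpath of every
node.

```xml
<!-- core-site.xml (sketch; the replacement class is an assumption) -->
<configuration>
  <property>
    <!-- Implementation used for file:// URIs.
         The stock value is org.apache.hadoop.fs.LocalFileSystem. -->
    <name>fs.file.impl</name>
    <!-- Hypothetical custom class, not part of Hadoop -->
    <value>com.example.fs.CachingLocalFileSystem</value>
  </property>
</configuration>
```

The setting only tells Hadoop which class to load for local files; any
caching behavior would have to be implemented inside that class.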

D

On Sun, Aug 12, 2012 at 8:36 PM, kapil bhosale <[EMAIL PROTECTED]> wrote:
> Hello,
> Can we use the Distributed Cache to store intermediate results after the Map
> phase, so that the Reduce phase can read them from the cache and the
> Map-Reduce job performs better?
>
> I found a paper on using a cache in Map-Reduce:
> http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5395321&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F5394475%2F5394991%2F05395321.pdf%3Farnumber%3D5395321
>
> If Hadoop Map-Reduce can be improved with a cache, then Pig scripts
> running on Map-Reduce would ultimately improve as well.
>
> Thanks
> Kapil