RE: Using Distributed Cache in Hive UDF's??
Hi Edward,

 I was able to use the distributed cache by setting the
mapred.cache.files option. I could then read the files locally using the
standard Java APIs.
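
For readers wondering which local name to open, here is a minimal sketch of how an entry in a mapred.cache.files-style value could map to a file name in the task's working directory. It assumes symlinking is enabled (for example via mapred.create.symlink=yes), in which case Hadoop uses the URI fragment after `#` as the link name. The class name `CacheFileNames`, the method `localNames`, and the fallback to the last path segment when no fragment is given are assumptions of this sketch, not something stated in the thread.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class CacheFileNames {

    /**
     * Given a comma-separated list of cache URIs, returns the name under
     * which each file would be expected in the task's working directory.
     */
    public static List<String> localNames(String cacheFiles) {
        List<String> names = new ArrayList<>();
        for (String entry : cacheFiles.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            URI uri = URI.create(trimmed);
            String fragment = uri.getFragment();
            if (fragment != null && !fragment.isEmpty()) {
                // The "#name" fragment is used as the symlink name.
                names.add(fragment);
            } else {
                // Assumption of this sketch: fall back to the last path segment.
                String path = uri.getPath();
                names.add(path.substring(path.lastIndexOf('/') + 1));
            }
        }
        return names;
    }
}
```

For an input such as `hdfs://namenode:8020/data/patterns.txt#patterns.txt` (a hypothetical host and path), `localNames` returns a one-element list containing `patterns.txt`, which is the name to pass to the standard Java file APIs.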





From: Edward Capriolo [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 22, 2010 7:44 AM
Subject: Re: Using Distributed Cache in Hive UDF's??


If you put a file in the distributed cache, it ends up in the working
directory of the UDF, so you do not need any fancy Hadoop-isms to access it.
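
As a rough illustration of that point, the sketch below opens the cached file by its bare name with plain java.nio calls and loads it once, on first use. It is deliberately free of Hive and Hadoop dependencies so it can be run on its own. In a real UDF this logic would sit inside evaluate() of a class extending org.apache.hadoop.hive.ql.exec.UDF. The class name `PatternLookup`, the one-pattern-per-line file layout, exact-match semantics, and loading the whole file into memory are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

public class PatternLookup {

    // Name of the file as it appears in the task's working directory.
    private final String fileName;

    // Loaded lazily so the file is read at most once per instance.
    private Set<String> patterns;

    public PatternLookup(String fileName) {
        this.fileName = fileName;
    }

    /** Returns true if the value matches one of the lines in the file. */
    public boolean evaluate(String value) {
        if (value == null) {
            return false;
        }
        if (patterns == null) {
            patterns = load(fileName);
        }
        return patterns.contains(value);
    }

    private static Set<String> load(String fileName) {
        Set<String> result = new HashSet<>();
        // A relative path resolves against the current working directory.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(fileName), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read cached file " + fileName, e);
        }
        return result;
    }
}
```

Loading lazily rather than in the constructor keeps the read on the task side, where the file is actually present. For a file too large to hold in memory, a different structure would be needed.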

Shameless plug:
My geo-ip-udf does exactly this.


On Mon, Jun 21, 2010 at 7:03 PM, Viraj Bhat <[EMAIL PROTECTED]> wrote:

Hi all,

 I have a lookup function in Hive that checks whether a certain pattern is
present in a large text file. I have uploaded this text file to HDFS and hope
to use it in my UDF's evaluate() method.

Is there some documentation I can look up?

The Distributed Cache API relies on a call like

lookupFiles = DistributedCache.getLocalCacheFiles(job);

where job is of type JobConf.

Where do I get the JobConf object from within the UDF?