RE: Using Distributed Cache in Hive UDF's??
Viraj Bhat 2010-06-24, 17:33
I was able to use the distributed cache, using the set
mapred.cache.files option. I could read the files locally using standard
From: Edward Capriolo [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 22, 2010 7:44 AM
To: [EMAIL PROTECTED]
Subject: Re: Using Distributed Cache in Hive UDF's??
If you put a file in the distributed cache, it ends up in the working
directory of the UDF, so you do not need fancy Hadoop-isms to access it.
My geo-ip-udf does exactly this.
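To illustrate the point above, here is a minimal sketch of a lookup that opens the cached file by its plain relative name, with no JobConf involved. It is not the geo-ip-udf code. The class name PatternLookup, the file name patterns.txt, and the exact-line matching are assumptions made for the example; in Hive the class would additionally extend org.apache.hadoop.hive.ql.exec.UDF, which is left out here so the sketch compiles without the Hive jars.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Sketch only: a real Hive UDF would extend org.apache.hadoop.hive.ql.exec.UDF.
public class PatternLookup {
    // Hypothetical file name; it must match the name the file has in the
    // task's working directory once the distributed cache has localized it.
    private final String fileName;

    // Loaded lazily on the first evaluate() call, then reused for every row.
    private Set<String> patterns;

    public PatternLookup() {
        this("patterns.txt");
    }

    public PatternLookup(String fileName) {
        this.fileName = fileName;
    }

    // Returns true if the value matches a line of the lookup file exactly
    // (a simplification of "pattern is present"), null for null input.
    public Boolean evaluate(String value) {
        if (value == null) {
            return null;
        }
        if (patterns == null) {
            patterns = load(fileName);
        }
        return patterns.contains(value);
    }

    // Plain java.nio file access: a relative name resolves against the
    // current working directory, which is where the cached file is expected.
    private static Set<String> load(String name) {
        Set<String> result = new HashSet<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(name), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read lookup file " + name, e);
        }
        return result;
    }
}
```

Loading the file once and keeping it in a set avoids re-reading a large file for every row. This assumes the set of patterns fits in the task's memory.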
On Mon, Jun 21, 2010 at 7:03 PM, Viraj Bhat <[EMAIL PROTECTED]> wrote:
I have a lookup function in Hive that checks whether a certain pattern is
present in a large text file. I have uploaded this text file to HDFS and
hope to use it in my UDF's evaluate() method.
Is there some documentation I can look up?
The distributed cache relies on:
    lookupFiles = DistributedCache.getLocalCacheFiles(job);
where job is of type JobConf.
Where do I get the JobConf object from within the UDF?