Hi Edward,
I was able to use the distributed cache by setting the
mapred.cache.files option with "set". I could then read the files locally
using standard Java APIs.
Thanks
Viraj
________________________________
From: Edward Capriolo [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 22, 2010 7:44 AM
To: [EMAIL PROTECTED]
Subject: Re: Using Distributed Cache in Hive UDF's??
If you put a file in the distributed cache, it is in the working
directory of the UDF, so you do not need fancy Hadoop-isms to access it.
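A rough sketch of the idea, in plain Java so it stands alone. The class name LookupSketch, the load() helper, and the one-entry-per-line file format are made up for illustration; a real Hive UDF would extend org.apache.hadoop.hive.ql.exec.UDF and would probably work with Text values rather than String. The only point is that the cached file is opened by its bare name, relative to the working directory, without touching JobConf.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a UDF class; not tied to the Hive API.
class LookupSketch {
    private final Path file;
    private Set<String> entries; // loaded lazily on the first evaluate() call

    // localName is the name of the cached file, resolved against the
    // working directory when it is a relative path (assumption: the file
    // was shipped through the distributed cache under that name).
    LookupSketch(String localName) {
        this.file = Paths.get(localName);
    }

    // Mirrors the shape of a UDF evaluate() method: returns true when the
    // value matches a line of the lookup file exactly.
    boolean evaluate(String value) {
        if (value == null) {
            return false;
        }
        if (entries == null) {
            entries = load(file);
        }
        return entries.contains(value);
    }

    // Reads the file once with standard java.nio calls, one entry per line,
    // skipping blank lines.
    static Set<String> load(Path path) {
        Set<String> result = new HashSet<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Loading lazily inside evaluate() means the file is read once per task rather than once per row. Exact line matching is only a placeholder here; a substring or regex check would slot into evaluate() the same way.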
Shameless plug:
My geo-ip-udf does exactly this.
http://www.jointhegrid.com/hive-udf-geo-ip-jtg/index.jsp
http://www.jointhegrid.com/svn/hive-udf-geo-ip-jtg/
Edward
On Mon, Jun 21, 2010 at 7:03 PM, Viraj Bhat <[EMAIL PROTECTED]> wrote:
Hi all,
I have a lookup function in Hive that checks whether a certain pattern is
present in a large text file. I upload this text file to HDFS and hope to
use it in my UDF's evaluate() method.
Is there some documentation I can look up?
The distributed cache relies on a call such as
lookupFiles = DistributedCache.getLocalCacheFiles(job);
where job is of type JobConf.
Where do I get the JobConf object from within the UDF?
Thanks
Viraj