Hive >> mail # user >> Using Distributed Cache in Hive UDF's??


RE: Using Distributed Cache in Hive UDF's??
Hi Edward,

 I was able to use the distributed cache by setting the
mapred.cache.files option. I could then read the files locally using the
standard Java APIs.

Thanks

Viraj

 

________________________________

From: Edward Capriolo [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 22, 2010 7:44 AM
To: [EMAIL PROTECTED]
Subject: Re: Using Distributed Cache in Hive UDF's??

 

If you put a file in the distributed cache, it is in the working
directory of the UDF, so you do not need fancy Hadoop-isms to access it.

Shameless plug:
My geo-ip-udf does exactly this.
http://www.jointhegrid.com/hive-udf-geo-ip-jtg/index.jsp
http://www.jointhegrid.com/svn/hive-udf-geo-ip-jtg/
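
To illustrate the point about the working directory, here is a minimal sketch of a lookup that opens the cached file by its relative name using only standard Java I/O. It is not taken from the geo-ip UDF. The class name PatternLookup, the file name, and the one-pattern-per-line exact-match format are all assumptions made for illustration. To keep the sketch self-contained it does not depend on Hive; in a real UDF the same logic would presumably sit in a class extending org.apache.hadoop.hive.ql.exec.UDF.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of Hive or Hadoop). Assumes the cached file
// holds one pattern per line and that a lookup is an exact match on a line.
class PatternLookup {
    private final Path file;       // relative path, resolved against the working directory
    private Set<String> patterns;  // loaded lazily on the first evaluate() call

    PatternLookup(String localFileName) {
        // A relative name resolves against the current working directory,
        // which is where the thread says the cached file ends up.
        this.file = Paths.get(localFileName);
    }

    // Reads the whole file into memory once, skipping blank lines.
    private void load() throws IOException {
        Set<String> loaded = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    loaded.add(trimmed);
                }
            }
        }
        patterns = loaded;
    }

    // Mirrors the shape of a UDF evaluate() method: called once per row.
    boolean evaluate(String value) throws IOException {
        if (value == null) {
            return false;
        }
        if (patterns == null) {
            load();
        }
        return patterns.contains(value.trim());
    }
}
```

Loading lazily inside evaluate(), rather than in the constructor, means the file is only touched once the code is running on a task node where the cached copy is expected to exist. No JobConf is needed anywhere in this sketch.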

Edward

On Mon, Jun 21, 2010 at 7:03 PM, Viraj Bhat <[EMAIL PROTECTED]> wrote:

Hi all,

 I have a lookup function in Hive which checks whether a certain pattern is
present in a large text file. I upload this text file to HDFS. I hope to
use this text file in my UDF's evaluate() method.

Is there some documentation I can look up?

Distributed Cache relies on

lookupFiles = DistributedCache.getLocalCacheFiles(job);

job is of type JobConf.

Where do I get the JobConf object from within the UDF?

 

Thanks

Viraj

 
