Hive >> mail # user >> Options for Loading Side Data / small files in UDF

Re: Options for Loading Side Data / small files in UDF

You can use the distributed cache through Hive's ADD FILE command.

See here for example syntax
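
As a rough sketch of that suggestion (not taken from the linked example): running `ADD FILE /local/path/udf.properties;` in the Hive session ships the file with each job, and the task can then usually open it by its base name from the working directory. The file name `udf.properties`, the class `SideDataLoader`, and the key `threshold` are hypothetical, and the working-directory lookup is an assumption about how the cache localizes files on your cluster.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper a UDF could call on first use.
// Assumes the Hive session ran: ADD FILE /local/path/udf.properties;
public class SideDataLoader {

    // Assumption: files added with ADD FILE are localized into the
    // task's working directory under their base name.
    static final String DEFAULT_NAME = "udf.properties";

    // Reads a properties file from the given path.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    // Convenience overload: resolve the default name against the
    // current working directory of the task.
    static Properties loadFromWorkingDir() throws IOException {
        return load(Paths.get(DEFAULT_NAME));
    }

    public static void main(String[] args) throws IOException {
        // Local smoke test: write a sample file, then read it back.
        Path sample = Files.createTempFile("udf", ".properties");
        Files.write(sample, "threshold=5\n".getBytes("UTF-8"));
        System.out.println(load(sample).getProperty("threshold"));
        Files.delete(sample);
    }
}
```

Because the file travels with each job, an updated copy should be picked up once it is added again in a new session; no HDFS Configuration is needed inside the UDF for this approach.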



On Sat, Sep 14, 2013 at 9:57 AM, Stephen Boesch <[EMAIL PROTECTED]> wrote:

> We have a UDF that is configured via a small properties file. What are
> the options for distributing the file to the task nodes? We also want to
> be able to update the file frequently.
> We are not running on AWS so S3 is not an option - and we do not have
> access to NFS/other shared disk from the Mappers.
> If the Hive classes can access HDFS, that would probably be ideal, and it
> seems it should be possible. I am not clear how to do that, since the
> standard HDFS API requires a Configuration to be supplied, which is not
> available.
> Pointers appreciated.
> stephenb