Re: Options for Loading Side Data / small files in UDF
Jagat Singh 2013-09-14, 00:06
You can use the distributed cache through Hive's ADD FILE command.
See here for example syntax
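
As a rough sketch of the idea (not tested against any particular Hive
version): a file registered in the session with something like
ADD FILE /local/path/udf.properties; is shipped to the task nodes and is
normally readable there by its base name from the task's working directory.
The path, the file name udf.properties, and the helper class SideDataLoader
below are all hypothetical names chosen for illustration. The helper uses
only the JDK, so the UDF does not need an HDFS Configuration to read the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper a UDF could call; not part of Hive or Hadoop.
final class SideDataLoader {
    private SideDataLoader() {}

    // Reads a properties file from a local path. For a file shipped with
    // ADD FILE, the assumption is that the path is "./<base name>",
    // relative to the task's working directory.
    static Properties load(String localPath) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(Paths.get(localPath))) {
            props.load(in);
        }
        return props;
    }
}
```

Inside the UDF you would probably call SideDataLoader.load("./udf.properties")
lazily on the first evaluate() call rather than in the constructor, since the
UDF may also be instantiated on the client, where the file is not in the
working directory. Because ADD FILE is scoped to the session, running it again
in each new session should pick up the latest copy of the file.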
On Sat, Sep 14, 2013 at 9:57 AM, Stephen Boesch <[EMAIL PROTECTED]> wrote:
> We have a UDF that is configured via a small properties file. What are
> the options for distributing the file to the task nodes? We also want to
> be able to update the file frequently.
> We are not running on AWS so S3 is not an option - and we do not have
> access to NFS/other shared disk from the Mappers.
> If the Hive classes can access HDFS, that would likely be ideal - and
> it seems it should be possible. I am not clear how to do that, since
> the standard HDFS API requires a Configuration to be supplied, which is
> not available.
> Pointers appreciated.