Hive, mail # user - Using Reflect: A thread for ideas

Re: Using Reflect: A thread for ideas
Edward Capriolo 2013-02-19, 15:09
We could very easily write Hive so that a UDF is a piece of Groovy
loaded dynamically. This is my go-to system for making things pluggable.

On Tue, Feb 19, 2013 at 10:03 AM, John Meagher <[EMAIL PROTECTED]> wrote:
> Another option for this functionality would be to use the Java scripting
> API.  The basic structure of the call would be...
>
> select script( scriptLanguage, scriptToRun, args... )
>
> I haven't seen that in Hive, but something similar is available for Pig.
> Documentation for that is available on
> http://pig.apache.org/docs/r0.9.2/udf.html#js-udfs.  There's also a
> variation in Jira https://issues.apache.org/jira/browse/PIG-1777.
>
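To make the proposed script( scriptLanguage, scriptToRun, args... ) call a bit more concrete, here is a minimal sketch of how such a function might delegate to the Java Scripting API (javax.script). It is not an existing Hive UDF: the class name ScriptCallSketch, the method name script, and the choice to expose the arguments to the script under the variable name "args" are all assumptions made for illustration. Which engines are installed depends on the JVM, so the sketch returns null when the requested language is unavailable.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class ScriptCallSketch {
    // Hypothetical helper mirroring script(scriptLanguage, scriptToRun, args...).
    // Returns null when no engine is registered under the given language name.
    static Object script(String language, String source, Object... args) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(language);
        if (engine == null) {
            return null; // engine availability varies by JVM and classpath
        }
        // Assumption: the script reads its inputs from a variable named "args".
        engine.put("args", args);
        try {
            return engine.eval(source);
        } catch (ScriptException e) {
            throw new IllegalStateException("Script failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] argv) {
        // "JavaScript" is only present if the JVM ships or bundles a JS engine.
        Object result = script("JavaScript", "args[0] + args[1]", 2, 3);
        System.out.println(result == null ? "no JavaScript engine available" : result);
    }
}
```

A real UDF would also have to decide how to map script return values back onto Hive column types, which this sketch leaves out.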
>
>
> On Wed, Feb 13, 2013 at 11:38 PM, John Omernik <[EMAIL PROTECTED]> wrote:
>>
>> I stumbled across the little-documented reflect function today. I've
>> always known about it, but Java scares me if it's not in a cup, so I didn't
>> dig.  Well, today I dug, found an awesome use case for reflect (for me),
>> and wanted to share.  I also thought it would be nice to validate some
>> thoughts I had on reflect, and to see how we could share ideas on it
>> so that folks could get more use out of this great feature of Hive.
>>
>> Here's my example: A simple URL decode function:
>>
>> select url, reflect('java.net.URLDecoder', 'decode', url, 'utf-8') as decoded_url
>> from logs
>>
>> Basically I am using the decode function of the java.net.URLDecoder class.
>> Pretty awesome, works great, no files to distribute either.  Even works
>> through JDBC!
>>
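For readers who have not used Java reflection directly, the following is a rough sketch of the kind of lookup a call like reflect('java.net.URLDecoder', 'decode', url, 'utf-8') implies: find a class by name, find a method whose parameter types match the arguments, and invoke it. This is not Hive's actual implementation. The class ReflectSketch and the helper callStatic are hypothetical, and the sketch only handles public static methods whose parameters are all Strings.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class ReflectSketch {
    // Hypothetical helper: resolve a public static method that takes only
    // String parameters, invoke it, and return the result as a String.
    static String callStatic(String className, String methodName, String... args) {
        try {
            Class<?> clazz = Class.forName(className);
            Class<?>[] paramTypes = new Class<?>[args.length];
            Arrays.fill(paramTypes, String.class);
            Method method = clazz.getMethod(methodName, paramTypes);
            // The receiver is null because the method is static.
            Object result = method.invoke(null, (Object[]) args);
            return String.valueOf(result);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call failed", e);
        }
    }

    public static void main(String[] argv) {
        // Same class, method, and charset argument as the query above.
        System.out.println(callStatic("java.net.URLDecoder", "decode", "a%20b%2Fc", "utf-8"));
    }
}
```

The same helper works for any other static method with String arguments and a simple return value, for example java.net.URLEncoder.encode.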
>> OK, that being said, I now realize that the function I am trying to call
>> has to return data in a simple data type.  For example, I struggle to come
>> up with a simple reflect() for making a hex MD5 out of a string because the
>> built-in function returns an object, which has methods that can return what
>> I am looking for. That is great, but then I have to compile Java code,
>> distribute a jar, and then run the code. I am looking for something as
>> simple as the URL decoding example.
>>
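The MD5 difficulty is easier to see in plain Java. With only the JDK, the digest takes a chain of calls: a static factory returns a MessageDigest object, an instance method on that object returns a byte[], and the bytes still have to be hex-encoded. None of those steps is a single static call from simple arguments to a String. The sketch below shows that chain; the class Md5HexSketch and the method md5Hex are hypothetical names, and UTF-8 is an assumed input encoding.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5HexSketch {
    // Hypothetical wrapper collapsing the multi-step JDK digest API
    // into one static String -> String method.
    static String md5Hex(String input) {
        try {
            // Step 1: static factory returns an object, not a simple value.
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Step 2: instance method returns raw bytes (UTF-8 input is an assumption).
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // Step 3: hex-encode each byte as two lowercase hex digits.
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] argv) {
        System.out.println(md5Hex("hello")); // prints 5d41402abc4b2a76b9719d911017c592
    }
}
```

A library class that already exposes this as one static method fits the reflect pattern without custom code. Apache Commons Codec's DigestUtils.md5Hex(String) has that shape; whether that jar is on the classpath of a given Hive installation is an assumption that would need checking.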
>> I love this reflect feature, but I think it's probably underutilized due
>> to the perceived usability issues for beginners.  So that leads me to my
>> next thought. What if we brainstorm here handy Java functions that are
>> not included in the standard Hive language but carry over to Hive well
>> through the reflect function, and show an example of each one's use? I went
>> first with my URLDecoder example, and obviously will be looking for more, but
>> have you seen some examples that were neat and worked well for you? Can you share?
>>
>> Perhaps if we get enough examples we could roll some of these into a
>> page on the Hive wiki that folks can use to get over the "perceived"
>> complexity of using Java reflect?
>>
>> Thanks to those who have worked hard to implement features like this; it
>> is truly awesome.
>
>