Hive >> mail # user >> Using Reflect: A thread for ideas


John Omernik 2013-02-14, 04:38
John Meagher 2013-02-19, 15:03
Re: Using Reflect: A thread for ideas
We could very easily write Hive so that a UDF is a piece of Groovy
loaded dynamically. This is my go-to system to make things pluggable.

On Tue, Feb 19, 2013 at 10:03 AM, John Meagher <[EMAIL PROTECTED]> wrote:
> Another option for this functionality would be to use the Java scripting
> API.  The basic structure of the call would be...
>
> select script( scriptLanguage, scriptToRun, args... )
>
> I haven't seen that in Hive, but something similar is available for Pig.
> Documentation for that is available at
> http://pig.apache.org/docs/r0.9.2/udf.html#js-udfs.  There's also a
> variation in Jira: https://issues.apache.org/jira/browse/PIG-1777.
>
>
>
> On Wed, Feb 13, 2013 at 11:38 PM, John Omernik <[EMAIL PROTECTED]> wrote:
>>
>> I stumbled across the little-documented reflect function today. I've
>> always known about it, but Java scares me if it's not in a cup, so I didn't
>> dig.  Well, today I dug, and found an awesome use case for reflect (for me)
>> and wanted to share.  I also thought it would be nice to validate some
>> thoughts I had on reflect, and how we could possibly share ideas on reflect
>> so that folks could get more use out of this great feature of Hive.
>>
>> Here's my example: A simple URL decode function:
>>
>> select url, reflect('java.net.URLDecoder', 'decode', url, 'utf-8') as
>> decoded_url from logs
>> Basically I am using the decode function of the java.net.URLDecoder class.
>> Pretty awesome, works great, no files to distribute either.  Even works
>> through JDBC!
>>
>> Ok, that being said, I now realize that the function I am trying to call
>> has to return data in a simple data type.  For example, I struggle to come
>> up with a simple reflect() for making a hex MD5 out of a string because the
>> built-in functions return an object, which has methods that can return what
>> I am looking for. Which is great, but then I have to compile Java code,
>> distribute a jar, and then run the code. I am looking for something simple
>> like the URL decoding function.
>>
>> I love this reflect feature, but I think it's probably underutilized due
>> to the perceived usability issues for beginners.  So that leads me to my
>> next thought. What if we brainstorm here handy functions in Java that are
>> not included in the standard Hive language, that make the transition to Hive
>> well using the reflect function, and then show an example of their use? I went
>> first with my URLDecode, and obviously will be looking for more, but have
>> you seen some examples that were neat and worked well for you? Can you share?
>>
>> Perhaps if we get enough examples we could roll some of these into a wiki
>> page on the Hive wiki that folks can use to get over the "perceived"
>> complexity of using Java reflect?
>>
>> Thanks to those who have worked hard to implement features like this; it
>> is truly awesome.
>
>
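
For anyone following along, here is a rough Java sketch of what the quoted URL decode query and the MD5 complaint come down to. It is not Hive's actual implementation of reflect. The class ReflectSketch and the helpers callStatic and md5Hex are hypothetical names made up for illustration. The sketch assumes that a reflect-style call resolves a public static method from the runtime classes of its arguments, which is why java.net.URLDecoder.decode(String, String) fits in one call while the JDK's MessageDigest needs several chained steps.

```java
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of Hive.
class ReflectSketch {

    // Look up a public static method by class name and method name, then invoke it.
    // Assumption: parameter types are taken from the runtime classes of the arguments.
    static Object callStatic(String className, String methodName, Object... args) {
        try {
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) {
                types[i] = args[i].getClass();
            }
            Method m = Class.forName(className).getMethod(methodName, types);
            return m.invoke(null, args); // null receiver because the method is static
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // MD5 with the plain JDK takes several steps: get a MessageDigest object,
    // call digest() on it, then hex-encode the bytes. No single static
    // String-to-String call exists here, which is the difficulty described above.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same call shape as reflect('java.net.URLDecoder', 'decode', url, 'utf-8')
        System.out.println(callStatic("java.net.URLDecoder", "decode", "a%20b%2Fc", "utf-8"));
        System.out.println(md5Hex("hello"));
    }
}
```

If Apache Commons Codec happens to be on the Hive classpath (an assumption worth checking for your install), its static DigestUtils.md5Hex(String) method takes a string and returns a string, so it has the same one-call shape as the URL decode example.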