Re: Custom UserDefinedFunction in Hive
I tested the function from a main method by printing its output, and it
works fine. I am trying to get yesterday's date.

Today's date is Aug 6th, so the query should be for Aug 5th. Written with a
literal date like this, it works fine for me:

SELECT * FROM REALTIME where dt= '20120805' LIMIT 10;

Instead of hard-coding the date, I wanted to write the query as below, which
should give the same result as the query above. But when I run it this way,
I get zero results back.

SELECT * FROM REALTIME where dt= yesterdaydate('yyyyMMdd') LIMIT 10;

So is something wrong with the way I am doing it?

*Raihan Jamal*

On Mon, Aug 6, 2012 at 10:56 PM, Jan Dolinár <[EMAIL PROTECTED]> wrote:

> Hi Jamal,
>
> Check that the function really returns what it should and that your data
> is really in yyyyMMdd format. You can do this with a simple query like this:
>
> SELECT dt, yesterdaydate('yyyyMMdd') FROM REALTIME LIMIT 1;
>
> I don't see anything wrong with the function itself, it works well for me
> (although I tested it in hive 0.7.1). The only thing I would change about
> it would be to optimize it by calling 'new' only at the time of
> construction and reusing the object when the function is called, but that
> should not affect the functionality at all.
>
> Best regards,
> Jan
>
>
>
>
> On Tue, Aug 7, 2012 at 3:39 AM, Raihan Jamal <[EMAIL PROTECTED]> wrote:
>
>> *Problem*
>>
>> I created the UserDefinedFunction below to get yesterday's date in a
>> format that I pass into the method from the query.
>>
>>
>>
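>> package com.example.hive.udf;
>>
>> import java.text.DateFormat;
>> import java.text.SimpleDateFormat;
>> import java.util.Calendar;
>>
>> import org.apache.hadoop.hive.ql.exec.UDF;
>>
>> public final class YesterdayDate extends UDF {
>>
>>     // Returns yesterday's date formatted with the given SimpleDateFormat pattern.
>>     public String evaluate(final String format) {
>>         DateFormat dateFormat = new SimpleDateFormat(format);
>>         Calendar cal = Calendar.getInstance();
>>         cal.add(Calendar.DATE, -1);
>>         return dateFormat.format(cal.getTime());
>>     }
>> }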
>>
>>
>>
>>
>>
>> After adding the jar to the classpath and creating the temporary function
>> yesterdaydate, whenever I run the query shown further below I always get
>> zero results back.
>>
>>
>>
>> hive> create temporary function yesterdaydate as 'com.example.hive.udf.YesterdayDate';
>>
>> OK
>>
>> Time taken: 0.512 seconds
>>
>>
>>
>> Below is the query I am running:
>>
>>
>>
>> hive> SELECT * FROM REALTIME where dt= yesterdaydate('yyyyMMdd') LIMIT 10;
>>
>> OK
>>
>> I always get zero results back, even though the table does contain data
>> for Aug 5th.
>>
>>
>>
>> What am I doing wrong? Any suggestions will be appreciated.
>>
>>
>>
>>
>>
>> NOTE: I am working with Hive 0.6, which doesn't support variable
>> substitution, so I cannot use hiveconf here. The table above is
>> partitioned on the dt (date) column.
>>
>
>
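
Jan's suggestion above, to call 'new' only at construction time and reuse the objects on each call, could look roughly like the sketch below. It is only an illustration under a few assumptions: the class name YesterdayDateSketch is hypothetical, and the class deliberately does not extend UDF so that it compiles without the Hive jars (in Hive it would extend org.apache.hadoop.hive.ql.exec.UDF like the original). Because the pattern arrives as an argument, the formatter is cached per pattern rather than created strictly once. As Jan notes, this should not change what the function returns, so it is not a fix for the zero-result query.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;

// Hypothetical plain-Java sketch; inside Hive this class would extend
// org.apache.hadoop.hive.ql.exec.UDF as the original YesterdayDate does.
final class YesterdayDateSketch {

    // Allocated once per instance and reused on every call.
    private final Calendar cal = Calendar.getInstance();

    // Formatter cached for the most recently seen pattern.
    private SimpleDateFormat cachedFormat;
    private String cachedPattern;

    public String evaluate(final String format) {
        if (format == null) {
            return null;
        }
        // Rebuild the formatter only when the pattern changes.
        if (!format.equals(cachedPattern)) {
            cachedFormat = new SimpleDateFormat(format);
            cachedPattern = format;
        }
        // Reset the reused calendar to "now" before stepping back one day;
        // otherwise repeated calls would keep subtracting days.
        cal.setTimeInMillis(System.currentTimeMillis());
        cal.add(Calendar.DATE, -1);
        return cachedFormat.format(cal.getTime());
    }
}
```

Calling new YesterdayDateSketch().evaluate("yyyyMMdd") returns an eight-digit string for yesterday's date in the JVM's default time zone. Note that SimpleDateFormat and Calendar are not thread-safe, so an instance of this sketch should not be shared across threads.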