Hive >> mail # dev >> Custom UserDefinedFunction in Hive


Re: Custom UserDefinedFunction in Hive
Hi Jamal,

Check that the function really returns what it should and that your data
are really in yyyyMMdd format. You can do this with a simple query like this:

SELECT dt, yesterdaydate('yyyyMMdd') FROM REALTIME LIMIT 1;

I don't see anything wrong with the function itself; it works well for me
(although I tested it in Hive 0.7.1). The only change I would make is an
optimization: call 'new' only once, at construction time, and reuse the
objects each time the function is called. That should not affect the
functionality at all.
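
A minimal sketch of that optimization, under some assumptions: the class
name YesterdayDateSketch and the helper yesterday(format, nowMillis) are
hypothetical, and "extends UDF" is left out so that the snippet compiles
with only the JDK. In Hive the class would extend
org.apache.hadoop.hive.ql.exec.UDF as in the original. Because the format
arrives as an argument, the formatter is cached per pattern rather than
built in the constructor.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;

// Hypothetical stand-alone variant of the UDF. In Hive it would extend
// org.apache.hadoop.hive.ql.exec.UDF (omitted so only the JDK is needed).
// Assumes one instance is used by one thread at a time, because
// SimpleDateFormat and Calendar are not thread-safe.
final class YesterdayDateSketch {

    // Allocated once per instance and reused on every call.
    private final Calendar cal = Calendar.getInstance();

    // Formatter for the most recently seen pattern; rebuilt only when
    // the pattern changes.
    private String cachedPattern;
    private SimpleDateFormat cachedFormat;

    public String evaluate(final String format) {
        return yesterday(format, System.currentTimeMillis());
    }

    // Hypothetical helper that takes the clock value explicitly.
    String yesterday(final String format, final long nowMillis) {
        if (!format.equals(cachedPattern)) {
            cachedFormat = new SimpleDateFormat(format);
            cachedPattern = format;
        }
        cal.setTimeInMillis(nowMillis); // reset the reused calendar to "now"
        cal.add(Calendar.DATE, -1);     // step back one day
        return cachedFormat.format(cal.getTime());
    }

    public static void main(String[] args) {
        YesterdayDateSketch udf = new YesterdayDateSketch();
        System.out.println(udf.evaluate("yyyyMMdd"));
    }
}
```

The calendar is reset from the clock on every call, so a long-lived
instance still returns the correct day after midnight.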

Best regards,
Jan
On Tue, Aug 7, 2012 at 3:39 AM, Raihan Jamal <[EMAIL PROTECTED]> wrote:

> *Problem*
>
> I created the UserDefinedFunction below to get yesterday's date in the
> format I want; the format is passed into the method from the query.
>
> public final class YesterdayDate extends UDF {
>
>     public String evaluate(final String format) {
>         DateFormat dateFormat = new SimpleDateFormat(format);
>         Calendar cal = Calendar.getInstance();
>         cal.add(Calendar.DATE, -1);
>         return dateFormat.format(cal.getTime()).toString();
>     }
> }
>
> Whenever I add the jar to the classpath, create the temporary function
> yesterdaydate, and run the query below, I always get zero results back:
>
> hive> create temporary function yesterdaydate as
> 'com.example.hive.udf.YesterdayDate';
> OK
> Time taken: 0.512 seconds
>
> This is the query I am running:
>
> hive> SELECT * FROM REALTIME where dt= yesterdaydate('yyyyMMdd') LIMIT 10;
> OK
>
> I always get zero results back, even though the table contains data for
> Aug 5th.
>
> What am I doing wrong? Any suggestions would be appreciated.
>
> NOTE: I am working with Hive 0.6, which does not support variable
> substitution, so I cannot use hiveconf here. The table above is
> partitioned on the dt (date) column.
>