|
|
-
Date format - any easier way
Zoltán Tóth-Czifra 2012-05-15, 13:11
Hi guys,
Thanks you very much in advance for your help.
My problem in short is getting the date for yesterday in a YYYYMMDD format. As I use this format for partitions, I need this format in quite some queries.
So far I have this:
concat( year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), CASE WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END, CASE WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END ); ...but it seems to be a bit crazy, especially if you have to repeat it in hundreds of queries. Is there any other (better) way to get this format from yesterday? - there has to be. As I can't use local user variables nor macros whatsoever, I need to repeat myself a lot here. If there is no other way, probably I need to change my partitions.
Any ideas are appreciated. Thank you!
Zoltan
-
Re: Date format - any easier way
Nitin Pawar 2012-05-15, 13:24
you may want to have a look at this function
date_sub(string startdate, int days)Subtract a number of days to startdate: date_sub('2008-12-31', 1) = '2008-12-30' On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra < [EMAIL PROTECTED]> wrote:
> Hi guys, > > Thanks you very much in advance for your help. > > My problem in short is getting the date for yesterday in a YYYYMMDD > format. As I use this format for partitions, I need this format in quite > some queries. > > So far I have this: > > concat( > year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), > CASE > WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) > < 10 > THEN concat( '0', month( date_sub( to_date( from_unixtime( > unix_timestamp() ) ), 1 ) ) ) > ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), > 1 ) ) ) > END, > CASE > WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < > 10 > THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() > ) ), 1 ) ) ) > ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) > ) ) > END > ); > > > ...but it seems to be a bit crazy, especially if you have to repeat it > in hundreds of queries. Is there any other (better) way to get this format > from yesterday? - there has to be. As I can't use local user variables nor > macros whatsoever, I need to repeat myself a lot here. If there is no other > way, probably I need to change my partitions. > > Any ideas are appreciated. Thank you! > > Zoltan >
-- Nitin Pawar
-
RE: Date format - any easier way
Zoltán Tóth-Czifra 2012-05-15, 13:26
Nitin,
Thank you. As you see below I know and use this function. My problem is that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and formatting is not trivial as you can see it too.
________________________________ From: Nitin Pawar [[EMAIL PROTECTED]] Sent: Tuesday, May 15, 2012 3:24 PM To: [EMAIL PROTECTED] Subject: Re: Date format - any easier way
you may want to have a look at this function
date_sub(string startdate, int days) Subtract a number of days to startdate: date_sub('2008-12-31', 1) = '2008-12-30'
On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote: Hi guys,
Thanks you very much in advance for your help.
My problem in short is getting the date for yesterday in a YYYYMMDD format. As I use this format for partitions, I need this format in quite some queries.
So far I have this:
concat( year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), CASE WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END, CASE WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END ); ...but it seems to be a bit crazy, especially if you have to repeat it in hundreds of queries. Is there any other (better) way to get this format from yesterday? - there has to be. As I can't use local user variables nor macros whatsoever, I need to repeat myself a lot here. If there is no other way, probably I need to change my partitions.
Any ideas are appreciated. Thank you!
Zoltan
-- Nitin Pawar
-
Re: Date format - any easier way
Nitin Pawar 2012-05-15, 13:32
may be something like this will work
can you try using concat(split(date_sub(),"-")))
split returns the array and then you can concat them as you want
if this does not work for you, writing a simple UDF is easy as well
Thanks, nitin On Tue, May 15, 2012 at 6:56 PM, Zoltán Tóth-Czifra < [EMAIL PROTECTED]> wrote:
> Nitin, > > Thank you. As you see below I know and use this function. My problem is > that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and > formatting is not trivial as you can see it too. > > ------------------------------ > *From:* Nitin Pawar [[EMAIL PROTECTED]] > *Sent:* Tuesday, May 15, 2012 3:24 PM > *To:* [EMAIL PROTECTED] > *Subject:* Re: Date format - any easier way > > you may want to have a look at this function > > date_sub(string startdate, int days) Subtract a number of days to > startdate: date_sub('2008-12-31', 1) = '2008-12-30' > On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra < > [EMAIL PROTECTED]> wrote: > >> Hi guys, >> >> Thanks you very much in advance for your help. >> >> My problem in short is getting the date for yesterday in a YYYYMMDD >> format. As I use this format for partitions, I need this format in quite >> some queries. >> >> So far I have this: >> >> concat( >> year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), >> CASE >> WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) >> < 10 >> THEN concat( '0', month( date_sub( to_date( from_unixtime( >> unix_timestamp() ) ), 1 ) ) ) >> ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), >> 1 ) ) ) >> END, >> CASE >> WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < >> 10 >> THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() >> ) ), 1 ) ) ) >> ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 >> ) ) ) >> END >> ); >> >> >> ...but it seems to be a bit crazy, especially if you have to repeat it >> in hundreds of queries. Is there any other (better) way to get this format >> from yesterday? - there has to be. As I can't use local user variables nor >> macros whatsoever, I need to repeat myself a lot here. If there is no other >> way, probably I need to change my partitions. >> >> Any ideas are appreciated. Thank you! >> >> Zoltan >> > > > > -- > Nitin Pawar > > -- Nitin Pawar
-
RE: Date format - any easier way
Tucker, Matt 2012-05-15, 13:33
What about wrapping it in regexp_replace(..., "[-]", "") ? It may not be the cleanest, but I'd recommend passing variables from the shell :)
Matt Tucker
From: Zoltán Tóth-Czifra [mailto:[EMAIL PROTECTED]] Sent: Tuesday, May 15, 2012 9:27 AM To: [EMAIL PROTECTED] Subject: RE: Date format - any easier way
Nitin,
Thank you. As you see below I know and use this function. My problem is that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and formatting is not trivial as you can see it too.
________________________________ From: Nitin Pawar [[EMAIL PROTECTED]] Sent: Tuesday, May 15, 2012 3:24 PM To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]> Subject: Re: Date format - any easier way you may want to have a look at this function
date_sub(string startdate, int days)
Subtract a number of days to startdate: date_sub('2008-12-31', 1) = '2008-12-30' On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote: Hi guys,
Thanks you very much in advance for your help.
My problem in short is getting the date for yesterday in a YYYYMMDD format. As I use this format for partitions, I need this format in quite some queries.
So far I have this:
concat( year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), CASE WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END, CASE WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END ); ...but it seems to be a bit crazy, especially if you have to repeat it in hundreds of queries. Is there any other (better) way to get this format from yesterday? - there has to be. As I can't use local user variables nor macros whatsoever, I need to repeat myself a lot here. If there is no other way, probably I need to change my partitions.
Any ideas are appreciated. Thank you!
Zoltan
-- Nitin Pawar
-
RE: Date format - any easier way
Zoltán Tóth-Czifra 2012-05-15, 13:54
Matt, thanks!
Luckily the order of the parts of the date is correct (reordering them would bet he same craziness).
Finally it is:
regexp_replace( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ), "[-]", "" )
Nitin, concat apparently doesn't take arrays, and I did not find any other way to join arrays in HQL. However, it would be very handy.
Thanks guys!
________________________________ From: Tucker, Matt [[EMAIL PROTECTED]] Sent: Tuesday, May 15, 2012 3:33 PM To: [EMAIL PROTECTED] Subject: RE: Date format - any easier way
What about wrapping it in regexp_replace(…, “[-]”, “”) ? It may not be the cleanest, but I’d recommend passing variables from the shell :)
Matt Tucker
From: Zoltán Tóth-Czifra [mailto:[EMAIL PROTECTED]] Sent: Tuesday, May 15, 2012 9:27 AM To: [EMAIL PROTECTED] Subject: RE: Date format - any easier way
Nitin,
Thank you. As you see below I know and use this function. My problem is that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and formatting is not trivial as you can see it too.
________________________________ From: Nitin Pawar [[EMAIL PROTECTED]] Sent: Tuesday, May 15, 2012 3:24 PM To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]> Subject: Re: Date format - any easier way you may want to have a look at this function
date_sub(string startdate, int days)
Subtract a number of days to startdate: date_sub('2008-12-31', 1) = '2008-12-30' On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote: Hi guys,
Thanks you very much in advance for your help.
My problem in short is getting the date for yesterday in a YYYYMMDD format. As I use this format for partitions, I need this format in quite some queries.
So far I have this:
concat( year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), CASE WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END, CASE WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10 THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) ) END ); ...but it seems to be a bit crazy, especially if you have to repeat it in hundreds of queries. Is there any other (better) way to get this format from yesterday? - there has to be. As I can't use local user variables nor macros whatsoever, I need to repeat myself a lot here. If there is no other way, probably I need to change my partitions.
Any ideas are appreciated. Thank you!
Zoltan
-- Nitin Pawar
-
Re: Date format - any easier way
Nitin Pawar 2012-05-15, 14:33
I will write an UDF for array concatenation and upload on GIT if anyone does not have it already
On Tue, May 15, 2012 at 7:24 PM, Zoltán Tóth-Czifra < [EMAIL PROTECTED]> wrote:
> Matt, thanks! > > Luckily the order of the parts of the date is correct (reordering them > would bet he same craziness). > > Finally it is: > > regexp_replace( > date_sub( > to_date( > from_unixtime( > unix_timestamp() > ) > ), 1 > ), "[-]", "" > ) > > Nitin, concat apparently doesn't take arrays, and I did not find any > other way to join arrays in HQL. However, it would be very handy. > > Thanks guys! > > ------------------------------ > *From:* Tucker, Matt [[EMAIL PROTECTED]] > *Sent:* Tuesday, May 15, 2012 3:33 PM > > *To:* [EMAIL PROTECTED] > *Subject:* RE: Date format - any easier way > > What about wrapping it in regexp_replace(…, “[-]”, “”) ? It may not be > the cleanest, but I’d recommend passing variables from the shell :) > > > > Matt Tucker > > > > *From:* Zoltán Tóth-Czifra [mailto:[EMAIL PROTECTED]] > *Sent:* Tuesday, May 15, 2012 9:27 AM > *To:* [EMAIL PROTECTED] > *Subject:* RE: Date format - any easier way > > > > Nitin, > > > > Thank you. As you see below I know and use this function. My problem is > that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and > formatting is not trivial as you can see it too. > > > ------------------------------ > > *From:* Nitin Pawar [[EMAIL PROTECTED]] > *Sent:* Tuesday, May 15, 2012 3:24 PM > *To:* [EMAIL PROTECTED] > *Subject:* Re: Date format - any easier way > > you may want to have a look at this function > > > > date_sub(string startdate, int days) > > Subtract a number of days to startdate: date_sub('2008-12-31', 1) > '2008-12-30' > > > > On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra < > [EMAIL PROTECTED]> wrote: > > Hi guys, > > > > Thanks you very much in advance for your help. > > > > My problem in short is getting the date for yesterday in a YYYYMMDD > format. As I use this format for partitions, I need this format in quite > some queries. > > > > So far I have this: > > > > concat( > > year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ), > > CASE > > WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) > < 10 > > THEN concat( '0', month( date_sub( to_date( from_unixtime( > unix_timestamp() ) ), 1 ) ) ) > > ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), > 1 ) ) ) > > END, > > CASE > > WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < > 10 > > THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() > ) ), 1 ) ) ) > > ELSE trim(day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) > ) ) > > END > > ); > > > > > > ...but it seems to be a bit crazy, especially if you have to repeat it in > hundreds of queries. Is there any other (better) way to get this format > from yesterday? - there has to be. As I can't use local user variables nor > macros whatsoever, I need to repeat myself a lot here. If there is no other > way, probably I need to change my partitions. > > > > Any ideas are appreciated. Thank you! > > > > Zoltan > > > > > > -- > Nitin Pawar >
-- Nitin Pawar
-
Re: Date format - any easier way
Philip Tromans 2012-05-15, 15:23
I knocked up the following when we were experimenting with Hive. I've been meaning to go and tidy it up for a while, but using it with a separator of "" (empty string) should have the desired effect. (Obviously the UDF throws an exception if the array is empty, been meaning to fix that for a while...)
Cheers,
Phil.
import java.util.List;
import org.apache.hadoop.hive.ql.exec.Description; import org.apache.hadoop.hive.ql.exec.UDFArgumentException; import org.apache.hadoop.hive.ql.metadata.HiveException; import org.apache.hadoop.hive.ql.udf.generic.GenericUDF; import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category; import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
@Description(name = "implode", value = "_FUNC_(list,separator) - joins the elements of list together, separated by the given string." + " Returns a string. [Add an example here]") public class GenericUDFImplode extends GenericUDF { private ListObjectInspector listOI = null; private PrimitiveObjectInspector glueOI = null;
@Override public Object evaluate(DeferredObject[] params) throws HiveException { return join(listOI.getList(params[0].get()), glueOI.getPrimitiveJavaObject(params[1].get()).toString()); }
private String join(List<?> list, String separator) { if (list == null) { return null; } if (list.size() == 0) { return ""; }
StringBuffer buf = new StringBuffer();
for (Object o : list) { buf.append(o); buf.append(separator); }
return buf.substring(0, buf.length() - separator.length()); }
@Override public String getDisplayString(String[] args) { return "implode(" + args[0] + "," + args[1] + ")"; }
@Override public ObjectInspector initialize(ObjectInspector[] params) throws UDFArgumentException { if (params[0].getCategory() != Category.LIST) { throw new UDFArgumentException("Expected: List as argument 1 to implode()"); } if (params[1].getCategory() != Category.PRIMITIVE) { throw new UDFArgumentException("Expected: Primitive as argument 2 to implode()"); }
listOI = (ListObjectInspector) params[0]; glueOI = (PrimitiveObjectInspector) params[1];
return PrimitiveObjectInspectorFactory.javaStringObjectInspector; }
} On 15 May 2012 15:33, Nitin Pawar <[EMAIL PROTECTED]> wrote:
> I will write an UDF for array concatenation and upload on GIT if anyone > does not have it already > > > On Tue, May 15, 2012 at 7:24 PM, Zoltán Tóth-Czifra < > [EMAIL PROTECTED]> wrote: > >> Matt, thanks! >> >> Luckily the order of the parts of the date is correct (reordering them >> would bet he same craziness). >> >> Finally it is: >> >> regexp_replace( >> date_sub( >> to_date( >> from_unixtime( >> unix_timestamp() >> ) >> ), 1 >> ), "[-]", "" >> ) >> >> Nitin, concat apparently doesn't take arrays, and I did not find any >> other way to join arrays in HQL. However, it would be very handy. >> >> Thanks guys! >> >> ------------------------------ >> *From:* Tucker, Matt [[EMAIL PROTECTED]] >> *Sent:* Tuesday, May 15, 2012 3:33 PM >> >> *To:* [EMAIL PROTECTED] >> *Subject:* RE: Date format - any easier way >> >> What about wrapping it in regexp_replace(…, “[-]”, “”) ? It may not >> be the cleanest, but I’d recommend passing variables from the shell :) >> >> >> >> Matt Tucker >> >> >> >> *From:* Zoltán Tóth-Czifra [mailto:[EMAIL PROTECTED]] >> *Sent:* Tuesday, May 15, 2012 9:27 AM >> *To:* [EMAIL PROTECTED] >> *Subject:* RE: Date format - any easier way >> >> >> >> Nitin, >> >> >> >> Thank you. As you see below I know and use this function. My problem is >> that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and >> formatting is not trivial as you can see it too. >> >> >> ------------------------------ >> >> *From:* Nitin Pawar [[EMAIL PROTECTED]]
|
|