Raihan Jamal 2012-10-04, 00:20
I suggest you store unix timestamp in hive, and so you can compare it
as BIGINT without worrying about STRING comparison.
And if your data is to be queried on daily bases, you can split one
big file into small files, say, one file per day, then add them as
partitions of soj_session_container. This way can optimize hive a
little since your queries won't have to read all records in
CREATE TABLE soj_session_container (
events MAP<STRING, STRING>
PARTITIONED BY (date STRING);
ALTER TABLE soj_session_container ADD PARTITION (date = '20120918')
SELECT * FROM soj_session_container LATERAL VIEW explode(a.events) t
AS event WHERE date = '20120918' AND event.event_timestamp >unix_timestamp('2012-09-18 00:00:00') AND event.event_timestamp <unix_timestamp('2012-09-18 02:00:00');
On Thu, Oct 4, 2012 at 8:20 AM, Raihan Jamal <[EMAIL PROTECTED]> wrote:
> I have this below query from which I am trying to find out those records
> that fall between midnight and 2 A.M on 18th September.
> And SojTimestampToDate function will give me date in this format yyyy/MM/dd
> I am not sure whether the date comparison I did is right or not. And it will
> give me all those records between midnight and 2 AM.
> SELECT event.app_payload ['n'] AS changed_cguid
> FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
> WHERE a.dt = '20120918'
> AND SojTimestampToDate(event.event_timestamp) >= '2012/09/18 00:00:00'
> AND SojTimestampToDate(event.event_timestamp) <= '2012/09/18 02:00:00'
> Can anyone shed some light on this whether I am doing right or not?
> Raihan Jamal