NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Hive >> mail # user >> Apache Log Date Format


bichonfrise74 2011-05-06, 22:47
Re: Apache Log Date Format
On 2011-5-7 at 6:48 AM, "bichonfrise74" <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I am using this to load the apache log into Hadoop via Hive (my version is 0.4.1).
>
> CREATE TABLE apache_log (
>   ...
>   logdate STRING,
>   ...
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES (
>   "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) \\[(\\w+\/\\w+\/\\w+)\:(\\d+:\\d+:\\d+) ...
> ...
>
> The date is coming in this format: dd/mmm/yyyy.
> I would like to be able to load the data using this date format: yyyy-mmm-dd.
>
> 1. Has anyone loaded the date in a different format before?
> 2. Also, how do you specify in the create table statement above that the partition is the logdate?
> 3. And when I tried to convert the old date into unixtime format via this sql, hive complains.
>
> hive> select from_unixtime( unix_timestamp( logdate, 'dd/MMM/yyyy')) from apache_log;
> FAILED: Error in semantic analysis: line 1:7 Function Argument Type Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters [class org.apache.hadoop.io.LongWritable]

The unix_timestamp function returns a bigint, while the from_unixtime function only accepts an int as its parameter, so you should use a cast:
from_unixtime(cast( unix_timestamp( logdate, 'dd/MMM/yyyy') as int))
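
To make the round trip concrete, here is a minimal Python sketch of what the two-step conversion does: parse the log date into epoch seconds, then format those seconds back into a string. The Python functions below only borrow the Hive names for readability; they are illustrative stand-ins, not Hive's implementation. The sketch assumes UTC and an English locale for the month abbreviation, and the default output pattern is my own choice.

```python
import calendar
import time


def unix_timestamp(text, fmt="%d/%b/%Y"):
    """Parse a date string into integer epoch seconds (UTC assumed)."""
    return calendar.timegm(time.strptime(text, fmt))


def from_unixtime(seconds, fmt="%Y-%m-%d"):
    """Format integer epoch seconds as a date string (UTC assumed)."""
    return time.strftime(fmt, time.gmtime(seconds))


def reformat_logdate(logdate, out_fmt="%Y-%m-%d"):
    """Convert an Apache-style dd/Mon/yyyy date into another layout."""
    return from_unixtime(unix_timestamp(logdate), out_fmt)


if __name__ == "__main__":
    print(reformat_logdate("06/May/2011"))
    print(reformat_logdate("06/May/2011", "%Y-%b-%d"))
```

Passing "%Y-%b-%d" as the output pattern keeps the abbreviated month name, which may be closer to the yyyy-mmm-dd layout asked about in the original question.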

> Has anyone encountered these issues before?
>
> Thanks.