Re: Newbie - Hive Tutorial question, what format is the sample data file in?
Edward Capriolo 2012-09-01, 15:33
It is up to the user to decide what that INT means in this case. This
tutorial was created very early on. Since then, Hive has added support for
a timestamp type, which has a clear meaning.

On Sat, Sep 1, 2012 at 10:33 AM, David Swearingen <[EMAIL PROTECTED]> wrote:
> Thanks. Still not clear to me what a time field is as an INT:
> milliseconds since the epoch? That was my question.
>
> Sent from my iPhone
>
> On Sep 1, 2012, at 9:37 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>
>> I do not think there is a sample file. You can tell the format from
>> the create table statement.
>>
>> COMMENT 'This is the staging page view table'
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '44' LINES TERMINATED BY '12'
>> STORED AS TEXTFILE
>> LOCATION '/user/data/staging/page_view';
>>
>> http://www.asciitable.com/
>>
>> 12 is a '\n' and 44 is a ','. Since the format is TEXTFILE, integers
>> are serialized into strings.
>>
>> On Sat, Sep 1, 2012 at 7:52 AM, David Swearingen <[EMAIL PROTECTED]> wrote:
>>> I'm going through the tutorial at
>>> https://cwiki.apache.org/Hive/tutorial.html . It's not clear to me what the
>>> exact format of the log file would be for the sample queries described, e.g. at
>>> https://cwiki.apache.org/Hive/tutorial.html#Tutorial-LoadingData . I can't
>>> find a link to download such a file, and while I'd be happy to construct one
>>> myself, it's not clear to me what a viewTime of type INT would look like
>>> exactly. Perhaps the file conforms to standard web server logfile formats;
>>> however, there are, I believe, a couple of variants on that format.
>>>
>>> Am I missing something? Thanks.
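
To make the format discussed in this thread concrete, here is a minimal Python sketch of how one line of such a staging file could be produced: fields joined by ',' and each record ended by '\n', with the integer written as plain text. Everything except viewTime is an assumption for illustration: the function name format_row, the other column names, and the sample values are hypothetical. The sketch also assumes viewTime holds whole seconds since the Unix epoch. Since Hive's INT is a 4-byte signed integer, a present-day count of milliseconds since the epoch would not fit in it, whereas seconds do.

```python
FIELD_SEP = ","   # the field delimiter named in the thread (ASCII 44)
LINE_SEP = "\n"   # the line terminator the thread describes

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a 4-byte signed INT


def format_row(view_time, userid, page_url, referrer_url, ip):
    """Serialize one hypothetical page-view record as a delimited text line."""
    view_time = int(view_time)
    if not INT_MIN <= view_time <= INT_MAX:
        raise ValueError("viewTime does not fit in a 4-byte INT")
    # TEXTFILE storage means every value, integers included, is written as a string.
    fields = [str(view_time), str(userid), page_url, referrer_url, ip]
    if any(FIELD_SEP in f or LINE_SEP in f for f in fields):
        raise ValueError("a field contains a delimiter character")
    return FIELD_SEP.join(fields) + LINE_SEP


# Assumption: viewTime is seconds since the Unix epoch; all values are made up.
row = format_row(1346513580, 1001, "/index.html", "http://example.com/", "10.0.0.1")
print(row, end="")
```

A file built from lines like this is only one plausible reading of the tutorial's table definition. As noted at the top of the thread, the meaning of the INT is left to the user.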