-
Re: create a hive table: always a tab space before each line
Nitin Pawar 2013-01-09, 09:07
you may want to look at the sequencefile format
http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432

that tab is to separate key from values in the record (I may be wrong but this is how I interpreted it)

On Wed, Jan 9, 2013 at 12:49 AM, Richard <[EMAIL PROTECTED]> wrote:
> more information:
>
> if I set the format as textfile, there is no tab space.
> if I set the format as sequencefile and view the content via hadoop fs -text, I saw a tab space in the head of each line.
>
> At 2013-01-09 15:44:00, Richard <[EMAIL PROTECTED]> wrote:
>
> hi there
>
> I have a problem with creating a hive table.
> no matter what field delimiter I used, I always got a tab space in the head of each line (a line is a record).
> something like this:
>
> \t f1 \001 f2 \001 f3 ...
>
> where f1 , f2 , f3 denotes the field value and \001 is the field separator.
>
> here is the clause I used
>
> create external table if not exists ${HIVETBL_my_table}
> (
>   nid string,
>   userid string,
>   spv bigint,
>   sipv bigint,
>   pay bigint,
>   spay bigint,
>   ipv bigint,
>   sellerid string,
>   cate string
> )
> partitioned by(ds string)
> row format delimited fields terminated by '\001' lines terminated by '\n'
> stored as sequencefile
> location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';
>
> thanks for help.
>
> Richard

--
Nitin Pawar
-
Re: create a hive table: always a tab space before each line
Anurag Tangri 2013-01-09, 09:17
Hi Richard,
You should set the storage format in the create external table command to match the format of your data on HDFS.
Is your data stored as text files or as sequence files on HDFS?
Thanks,
Anurag Tangri
-
Re:Re: create a hive table: always a tab space before each line
Richard 2013-01-09, 09:22
I am trying to create a table and then fill it with insert overwrite, so the data is supposed to be generated by Hive.
-
Re: create a hive table: always a tab space before each line
Dean Wampler 2013-01-09, 14:38
To add to what Nitin said: Hive does not output a key in front of the tab, which is why each line starts with the tab itself.

--
Dean Wampler, Ph.D.
thinkbiganalytics.com
+1-312-339-1330
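A minimal sketch of why the tab shows up, assuming that `hadoop fs -text` prints each SequenceFile record as key, tab, value, and that the key Hive writes is empty (as noted above, no key is output). This is plain Python that only imitates the rendering; it does not call any Hadoop API, and `render_record` and the sample field values are hypothetical.

```python
FIELD_SEP = "\x01"  # the '\001' field delimiter from the create table clause


def render_record(key: str, value: str) -> str:
    """Imitate a text dump of one key/value record: key, tab, value."""
    return key + "\t" + value


# A row as the table's row format would serialize it (illustrative values only).
row = FIELD_SEP.join(["n1", "u1", "10"])

# With an empty key, the dumped line begins with the tab separator.
line = render_record("", row)
print(repr(line))
```

With a textfile table there is no key/value pairing to render, which would match the observation that the tab only appears with sequencefile.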
-
Re:Re: create a hive table: always a tab space before each line
Richard 2013-01-14, 14:58
Thanks.
It seems that as long as I use sequencefile as the storage format, there will be a \t before the first column. If this output is only ever read by Hive, that is fine. The problem is that I may use a custom MapReduce job to read these files. Does that mean I have to handle this \t myself?

Is there any option to disable this \t in Hive?
-
Re: Re: create a hive table: always a tab space before each line
Dean Wampler 2013-01-14, 15:56
Hadoop supports SequenceFiles natively. "Hadoop: The Definitive Guide" discusses the details.

dean
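To illustrate what this means for a custom job: a SequenceFile stores the key and the value of each record as separate fields, so a job that reads the records directly (for example through Hadoop's SequenceFileInputFormat) receives the value on its own, without a leading tab. The tab only appears when records are rendered as text. The sketch below is plain Python, not Hadoop code; `parse_value` and `parse_dumped_line` are hypothetical helpers, and the '\001' delimiter is taken from the table definition in this thread.

```python
from typing import List

FIELD_SEP = "\x01"  # the '\001' field delimiter from the create table clause


def parse_value(value: str) -> List[str]:
    """Split a record value, as a SequenceFile reader would hand it over, into fields."""
    return value.split(FIELD_SEP)


def parse_dumped_line(line: str) -> List[str]:
    """Split a line taken from a text dump (key, tab, value) into fields.

    The first tab is treated as the key/value separator; if the line has no
    tab at all, the whole line is treated as the value.
    """
    _key, sep, value = line.partition("\t")
    if not sep:
        value = line
    return parse_value(value)


# Value received directly from a reader: no tab to deal with.
print(parse_value("n1\x01u1\x0110"))

# Line saved from a text dump: the empty key and its tab are dropped first.
print(parse_dumped_line("\tn1\x01u1\x0110"))
```

Only the second helper is needed when the input is a saved text dump rather than the SequenceFile itself.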