Hive, mail # user - Re: create a hive table: always a tab space before each line


Re:Re: create a hive table: always a tab space before each line
Richard 2013-01-14, 14:58
Thanks.
It seems that as long as I use sequencefile as the storage format, there
will be a \t before the first column. If the output is only ever read by
Hive, that is fine. The problem is that I may use a custom MapReduce
job to read these files. Does that mean I have to handle
this \t myself?
Is there any option to disable this \t in Hive?
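
For the custom-job case, here is a minimal sketch in Python (usable, for example, in a Hadoop Streaming mapper fed with text dumps) that drops the leading tab and splits the rest on \001. It assumes the input lines look like the hadoop fs -text output described in this thread; the name parse_record is hypothetical. A job that reads the sequence files directly, e.g. through SequenceFileInputFormat, receives key and value separately, so the tab may not appear there at all; that is worth checking before adding any stripping logic.

```python
FIELD_SEP = "\x01"  # the '\001' field delimiter from the DDL in this thread


def parse_record(line):
    """Split one text-dump line into fields, ignoring an empty key before the tab."""
    line = line.rstrip("\n")
    # Assumption: the dump prints "<key>\t<value>" and the key is empty,
    # so a single leading tab is dropped before splitting.
    if line.startswith("\t"):
        line = line[1:]
    return line.split(FIELD_SEP)


if __name__ == "__main__":
    sample = "\tn1\x01u1\x0110\n"
    print(parse_record(sample))
```

Lines without a leading tab (such as textfile output) pass through unchanged, so the same function can read both formats.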
At 2013-01-09 22:38:11,"Dean Wampler" <[EMAIL PROTECTED]> wrote:
To add to what Nitin said, there is no key output by Hive in front of the tab.
On Wed, Jan 9, 2013 at 3:07 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:

you may want to look at the sequencefile format
http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432

that tab is to separate key from values in the record (I may be wrong but this is how I interpreted it)

On Wed, Jan 9, 2013 at 12:49 AM, Richard <[EMAIL PROTECTED]> wrote:

More information:
If I set the format to textfile, there is no tab.
If I set the format to sequencefile and view the content via hadoop fs -text, I see a tab at the start of each line.
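
One possible reading of these two observations, sketched in Python: assuming that hadoop fs -text prints each sequence-file record as key, tab, value, and that Hive writes an empty key (both are assumptions consistent with the replies in this thread, not confirmed here), every dumped line would start with a tab even though the stored value contains none. The helper render_dump_line is hypothetical and only mimics that rendering.

```python
def render_dump_line(key, value):
    """Mimic a key<TAB>value text rendering of one sequence-file record."""
    return f"{key}\t{value}"


# Assumption: the key is empty, so the rendered line begins with the tab.
line = render_dump_line("", "f1\x01f2\x01f3")

# Splitting on the first tab recovers the empty key and the untouched value.
key, _, value = line.partition("\t")
```

Under these assumptions the tab belongs to the display format rather than to the record's fields.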

At 2013-01-09 15:44:00,Richard <[EMAIL PROTECTED]> wrote:

Hi there,
I have a problem creating a Hive table.
No matter what field delimiter I use, I always get a tab at the start of each line (a line is a record).
Something like this:
\t f1 \001 f2 \001 f3 ...
where f1, f2, f3 denote the field values and \001 is the field separator.
Here is the statement I used:
create external table if not exists ${HIVETBL_my_table}
(
  nid string,
  userid string,
  spv bigint,
  sipv bigint,
  pay bigint,
  spay bigint,
  ipv bigint,
  sellerid string,
  cate string
)
partitioned by (ds string)
row format delimited fields terminated by '\001' lines terminated by '\n'
stored as sequencefile
location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';
Thanks for your help.
Richard

--
Nitin Pawar
--
Dean Wampler, Ph.D.
thinkbiganalytics.com
+1-312-339-1330