Re: Re: create a hive table: always a tab space before each line
Hadoop supports SequenceFiles natively. "Hadoop: The Definitive Guide"
discusses the details.
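
For a script that consumes the output of `hadoop fs -text` (a streaming job, for example), a minimal sketch of handling the leading tab might look like the following. It assumes that `-text` prints each record as key, tab, value, and that Hive leaves the key empty, which is why every line starts with \t. A job that reads the SequenceFile values directly would not normally see that tab at all. The names `render_text_line` and `parse_text_line` are hypothetical, chosen only for illustration.

```python
FIELD_SEP = "\x01"  # the '\001' field delimiter from the DDL in this thread


def render_text_line(key, value):
    """Mimic the assumed `hadoop fs -text` layout: key, tab, value."""
    return key + "\t" + value


def parse_text_line(line):
    """Drop the key (everything up to the first tab) and split the value."""
    key, tab, value = line.rstrip("\n").partition("\t")
    if not tab:
        # No tab at all (e.g. a textfile line): treat the whole line as the value.
        value = key
    return value.split(FIELD_SEP)


if __name__ == "__main__":
    row = FIELD_SEP.join(["n1", "u42", "7"])  # made-up sample values
    line = render_text_line("", row)          # empty key, as assumed for Hive
    print(repr(line))
    print(parse_text_line(line))
```

Splitting on the first tab only means that a tab inside a field value is left alone, as long as the key itself is always present (even if empty).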

dean

On Mon, Jan 14, 2013 at 8:58 AM, Richard <[EMAIL PROTECTED]> wrote:

> thanks.
> It seems that as long as I use sequencefile as the storage format, there
> will be a \t before the first column. If this output is only ever read by
> Hive, that is fine. The problem is that I may use a self-defined MapReduce
> job to read these files. Does that mean I have to take care of
> this \t myself?
>
> Is there any option to disable this \t in Hive?
>
>
>
> At 2013-01-09 22:38:11,"Dean Wampler" <[EMAIL PROTECTED]>
> wrote:
>
> To add to what Nitin said, Hive does not write a key, so nothing appears in
> front of the tab.
>
> On Wed, Jan 9, 2013 at 3:07 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
>> You may want to look at the SequenceFile format:
>>
>> http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432
>>
>> That tab separates the key from the value in each record (I may be wrong,
>> but this is how I interpreted it).
>>
>>
>> On Wed, Jan 9, 2013 at 12:49 AM, Richard <[EMAIL PROTECTED]> wrote:
>>
>>> More information:
>>>
>>> If I set the format to textfile, there is no tab.
>>> If I set the format to sequencefile and view the content via hadoop fs
>>> -text, I see a tab at the start of each line.
>>>
>>>
>>> At 2013-01-09 15:44:00,Richard <[EMAIL PROTECTED]> wrote:
>>>
>>> Hi there,
>>>
>>>
>>> I have a problem with creating a Hive table.
>>>
>>> No matter what field delimiter I use, I always get a tab at the start of each line (a line is a record).
>>>
>>> Something like this:
>>>
>>> \t f1 \001 f2 \001 f3 ...
>>>
>>> where f1, f2, f3 denote the field values and \001 is the field separator.
>>>
>>>
>>>
>>> Here is the statement I used:
>>>
>>> create external table if not exists ${HIVETBL_my_table}
>>> (
>>>   nid string,
>>>   userid string,
>>>   spv bigint,
>>>   sipv bigint,
>>>   pay bigint,
>>>   spay bigint,
>>>   ipv bigint,
>>>   sellerid string,
>>>   cate string
>>> )
>>> partitioned by (ds string)
>>> row format delimited fields terminated by '\001' lines terminated by '\n'
>>> stored as sequencefile
>>> location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';
>>>
>>>
>>> Thanks for your help.
>>>
>>>
>>> Richard
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>
>
> --
> *Dean Wampler, Ph.D.*
> thinkbiganalytics.com
> +1-312-339-1330
>
>
>
>
--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330