Hive >> mail # user >> Re: create a hive table: always a tab space before each line


Re: Re: create a hive table: always a tab space before each line
Hadoop supports Sequence Files natively. "Hadoop: The Definitive Guide"
discusses the details.

dean

On Mon, Jan 14, 2013 at 8:58 AM, Richard <[EMAIL PROTECTED]> wrote:

> thanks.
> it seems that as long as I use sequencefile as the storage format, there
> will be a \t before the first column. If this output is only consumed by
> hive, that is fine. The problem is that I may use a self-defined map-reduce
> job to read these files.  Does that mean I have to take care of
> this \t by myself?
>
> is there any option to disable this \t in hive?
>
>
>
> At 2013-01-09 22:38:11,"Dean Wampler" <[EMAIL PROTECTED]>
> wrote:
>
> To add to what Nitin said, there is no key output by Hive in front of the
> tab.
>
> On Wed, Jan 9, 2013 at 3:07 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
>> you may want to look at the sequencefile format
>>
>> http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432
>>
>> that tab is to separate key from values in the record (I may be wrong but
>> this is how I interpreted it)
>>
>>
>> On Wed, Jan 9, 2013 at 12:49 AM, Richard <[EMAIL PROTECTED]> wrote:
>>
>>> more information:
>>>
>>> if I set the format as textfile, there is no tab space.
>>> if I set the format as sequencefile and view the content via hadoop fs
>>> -text, I saw a tab space in the head of each line.
>>>
>>>
>>> At 2013-01-09 15:44:00,Richard <[EMAIL PROTECTED]> wrote:
>>>
>>> hi there
>>>
>>>
>>> I have a problem with creating a hive table.
>>>
>>> no matter what field delimiter I used, I always got a tab space in the head of each line (a line is a record).
>>>
>>> something like this:
>>>
>>> \t f1 \001 f2 \001 f3 ...
>>>
>>> where f1 , f2 , f3 denotes the field value and \001 is the field separator.
>>>
>>>
>>>
>>> here is the clause I used
>>>
>>> create external table if not exists ${HIVETBL_my_table}
>>> (
>>>   nid string,
>>>   userid string,
>>>   spv bigint,
>>>   sipv bigint,
>>>   pay bigint,
>>>   spay bigint,
>>>   ipv bigint,
>>>   sellerid string,
>>>   cate string
>>> )
>>> partitioned by(ds string)
>>> row format delimited fields terminated by '\001' lines terminated by '\n'
>>> stored as sequencefile
>>> location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';
>>>
>>>
>>> thanks for help.
>>>
>>>
>>> Richard
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>
>
> --
> *Dean Wampler, Ph.D.*
> thinkbiganalytics.com
> +1-312-339-1330
>
>
>
>
--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330
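
A note on the thread above: as Nitin and Dean describe it, the leading tab seen with hadoop fs -text appears to be the separator between the sequence file's key and value, with Hive writing nothing in the key position. On that reading, the tab is an artifact of the text dump, not part of the stored row. If a script does have to consume the text dump, the leading tab can be split off before the fields are parsed. The following is a minimal Python sketch under that assumption; the function name parse_text_dump_line and the sample row are hypothetical and not taken from the thread.

```python
# Sketch only: assumes each dumped line looks like "<key>\t<value>",
# where the key is empty for Hive-written sequence files and the value
# holds the fields joined by '\001' (the delimiter in the CREATE TABLE).

FIELD_SEP = "\x01"  # Hive's '\001' field delimiter


def parse_text_dump_line(line):
    """Split one dumped line into (key, list_of_fields)."""
    line = line.rstrip("\n")
    # Everything before the first tab is the key; the rest is the row.
    key, tab, value = line.partition("\t")
    if not tab:
        # No tab at all: treat the whole line as the row, with no key.
        key, value = "", line
    return key, value.split(FIELD_SEP)


if __name__ == "__main__":
    # Hypothetical row with three fields and an empty key.
    sample = "\tn1\x01u1\x0110\n"
    key, fields = parse_text_dump_line(sample)
    print(repr(key), fields)
```

A custom map-reduce job that reads the files through Hadoop's sequence file input format should, by the same reasoning, receive the key and the value as separate objects, so no tab stripping would be needed there; that is worth confirming against a real file before relying on it.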