Hive, mail # user - Data file and table def different number of columns


Re: Data file and table def different number of columns
Stephen Sprague 2013-10-24, 00:58
Yeah, that works as expected. The table schema, not the HDFS file, drives the
column list in the select statement, so any extra fields on each line of the
file are simply ignored.

You'd get NULLs only if your schema had *more* columns than the HDFS file had
fields.
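
To make the rule concrete, here is a minimal Python sketch of how a delimited text row gets mapped onto a table schema. It is a toy model of the behavior described above, not Hive's actual SerDe code; the function name `read_row` and the `(name, type)` schema format are made up for illustration, and only INT and STRING columns are handled.

```python
def read_row(line, schema, delimiter=",", null_format=r"\N"):
    """Map one delimited text line onto a table schema, column by column.

    Hypothetical helper: a toy model of schema-driven reads, not Hive code.
    `schema` is a list of (column_name, column_type) pairs.
    """
    fields = line.rstrip("\n").split(delimiter)
    row = []
    for i, (_name, col_type) in enumerate(schema):
        if i >= len(fields):
            # The schema has more columns than the line has fields: NULL.
            row.append(None)
            continue
        raw = fields[i]
        if raw == null_format:
            # The configured null marker reads back as NULL.
            row.append(None)
        elif col_type == "INT":
            try:
                row.append(int(raw))
            except ValueError:
                # A value that does not parse as an integer reads as NULL.
                row.append(None)
        else:
            row.append(raw)
    # Fields beyond the last schema column are never looked at.
    return row


line = "1,ryan,d'souza,it,20000"

# Two-column schema, five fields in the file: the extra fields are dropped.
narrow = read_row(line, [("a", "INT"), ("b", "STRING")])

# Six-column schema, five fields in the file: the last column is NULL.
wide = read_row(line, [("a", "INT"), ("b", "STRING"), ("c", "STRING"),
                       ("d", "STRING"), ("e", "INT"), ("f", "STRING")])
```

With the two-column schema from the question, `narrow` holds only the id and first name; with the six-column schema, `wide` ends in `None` for the column the file has no field for.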

you dig?
On Wed, Oct 23, 2013 at 4:53 PM, Xiu Guo <[EMAIL PROTECTED]> wrote:

> We have a file called employee.dat with the contents below:
>
> 1,ryan,d'souza,it,20000
> 2,michael,fernandes,admin,25000
>
> Then, in Hive, we run:
>
> create table myTbl (a INT, b STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> LINES TERMINATED BY '\n'
> TBLPROPERTIES ("serialization.null.format"="\\N");
>
> LOAD DATA LOCAL INPATH "/.../employee.dat" overwrite into table myTbl;
>
> When we run:
> select * from myTbl;
>
> the result is:
>
> 1 ryan
> 2 michael
>
> Is this correct? One of my teammates says that if the dat file and the table
> definition have different numbers of columns, there should be NULL values in
> the table.
>
> Can someone please confirm which one is the expected behavior?
>
> Thanks,
>