Hive >> mail # user >> Data file and table def different number of columns


Xiu Guo 2013-10-23, 23:53
Re: Data file and table def different number of columns
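yeah, that works as expected. the table schema drives the column list in the
select statement, not the hdfs file. with two columns declared, hive reads the
first two fields of each line and ignores the rest.

you'd get nulls only if your schema had *more* columns than the hdfs file has
fields.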
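
to see the null case for yourself, here's a rough sketch. the table name
myTblWide and the columns c through g are made up for illustration, and the
path is left elided exactly as in your mail. it assumes the default delimited
text serde, same as your table:

```sql
-- Hypothetical table: seven columns declared, but each line of
-- employee.dat has only five comma-separated fields.
CREATE TABLE myTblWide (a INT, b STRING, c STRING, d STRING, e INT, f STRING, g STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- Same data file as before (path elided as in the original mail).
LOAD DATA LOCAL INPATH "/.../employee.dat" OVERWRITE INTO TABLE myTblWide;

-- Columns a through e are filled from the file; f and g have no
-- matching field on any line, so they should come back as NULL.
SELECT * FROM myTblWide;
```

so the same file should give you two columns with myTbl and seven with
myTblWide, the last two being null.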

you dig?
On Wed, Oct 23, 2013 at 4:53 PM, Xiu Guo <[EMAIL PROTECTED]> wrote:

> We have a file called employee.dat with the contents below:
>
> 1,ryan,d'souza,it,20000
> 2,michael,fernandes,admin,25000
>
> Then in Hive, we run:
>
> create table myTbl (a INT, b STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> LINES TERMINATED BY '\n'
> TBLPROPERTIES ("serialization.null.format"="\\N");
>
> LOAD DATA LOCAL INPATH "/.../employee.dat" overwrite into table myTbl;
>
> When we run:
> select * from myTbl;
>
> the result is:
>
> 1 ryan
> 2 michael
>
> Is this correct? One of my teammates says that if the dat file and the table
> def have a different number of columns, there should be NULL values in the table.
>
> Can someone please confirm which one is the expected behavior?
>
> Thanks,
>
Xiu Guo 2013-10-24, 21:53