Hive, mail # user - How to handle for new columns?


Re: How to handle for new columns?
Aniket Mokashi 2012-03-01, 22:02
If you add the column at the end of the table, the new field will be NULL
for the old files. Is that not what you observe?
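
For illustration, a minimal sketch against the Employee table from the
original mail. The BIGINT type for salary is an assumption, since the thread
does not say what type the new column holds:

```sql
-- Append the new column after the existing ones (empid, empname, deptno).
-- ADD COLUMNS only changes table metadata; the files in HDFS are not rewritten.
ALTER TABLE Employee ADD COLUMNS (salary BIGINT);

-- Rows read from the old three-field files have no fourth field,
-- so salary should come back as NULL for them.
SELECT empid, empname, deptno, salary
FROM Employee;
```

This relies on the new field being the last one in each tab-delimited row of
the new files, matching the order empid, empname, deptno, salary.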

Thanks,
Aniket

On Thu, Mar 1, 2012 at 12:06 PM, Anson Abraham <[EMAIL PROTECTED]> wrote:

> If I have a Hive table that is an external table, with my "log
> files" being read into it, and a new file with a new column is imported
> into HDFS, how can I get Hive to handle the old files without the
> new column if I alter the Hive table to add that column?
> For example, I have a few files with these fields:
>
> empid, empname, deptno
>
> and so my Hive table is:
> CREATE EXTERNAL TABLE IF NOT EXISTS Employee (
> empid BIGINT
> ,empname string
> ,deptno BIGINT
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE LOCATION 'hdfs://namenode1/employee/';
>
>
>
> But if I have a new file imported into the HDFS directory with a new column,
> empid, empname, deptno, salary
>
> I can't alter the Employee table to add salary because of the
> historical files. I used an external table because I wanted the table to
> pick up all the log files dynamically whenever a new file is
> generated.
>
> I know the long way is to add the field to all the old files,
> but I would prefer a more scalable way to do this. Does anyone know of one?
> Thanks
>

--
"...:::Aniket:::... Quetzalco@tl"