If you add a column at the end of the table, the new field will be NULL for
the old files. Is that not what you observe?
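A minimal sketch of that suggestion, using the Employee table from the question below (column name salary and type BIGINT taken from the message; adjust to your actual data):

```sql
-- Append the new column to the end of the schema.
-- Rows coming from old files that lack this field will read as NULL.
ALTER TABLE Employee ADD COLUMNS (salary BIGINT);

-- Old files: salary is NULL; new files: salary is populated.
SELECT empid, empname, deptno, salary FROM Employee;
```

This only works cleanly when the new column is appended at the end, since Hive's delimited text SerDe maps columns to fields positionally.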
On Thu, Mar 1, 2012 at 12:06 PM, Anson Abraham <[EMAIL PROTECTED]>wrote:
> If I have a Hive external table with my "log files" being read into it,
> and a new file imported into HDFS has a new column, how can I get Hive to
> handle the old files w/o the new column if I do an alter adding the
> column to the hive table?
> So example, i have a few files w/ these fields:
> empid, empname, deptno
> and so my hive table
> CREATE EXTERNAL TABLE IF NOT EXISTS Employee (
>   empid BIGINT,
>   empname STRING,
>   deptno BIGINT
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE LOCATION 'hdfs://namenode1/employee/';
> but if I have a new file imported into the hdfs directory w/ a new column
> empid, empname, deptno, salary
> I can't do an alter of the employee table adding salary b/c of the
> historical files. I used external tables b/c I wanted the table to
> dynamically get all the log files into the hive table when a new file is
> added.
> I know the long way is basically adding fields through all the old files,
> but I'd prefer a more scalable way to do this. Anyone know of any?