Hive, mail # user - removing hdfs table data directory does not throw error in hive


Re: removing hdfs table data directory does not throw error in hive
Nitin Pawar 2012-04-24, 03:28
Hive table metadata is stored in a metastore, which retains the table
structure and other metadata even if you delete the table's HDFS directory,
because that information lives in the metastore database, not in HDFS.

When you run select * from table; Hive does the following:
1) It checks that the table exists in the metastore.
2) If the table exists, it looks up the location of the table's data.
3) If data is available at that location, it processes the data; otherwise
it returns OK without doing anything.

This is not treated as an error case because the Hive job did not fail.
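
To make the steps above concrete, here is a minimal Python sketch of the lookup. It is a toy model under stated assumptions, not Hive's actual implementation: `metastore` and `hdfs` are hypothetical in-memory dictionaries, `select_all` is a made-up helper, the `LookupError` only stands in for whatever error Hive reports for an unknown table, and the rows are just the four sample rows shown in the original message.

```python
# Toy model of the lookup described in this thread; not Hive's actual code.
# `metastore` and `hdfs` are hypothetical in-memory stand-ins.

def select_all(table, metastore, hdfs):
    """Return (status, rows) for `select * from <table>` in this toy model."""
    # 1) The table must exist in the metastore, otherwise the query fails.
    if table not in metastore:
        raise LookupError(f"Table not found: {table}")
    # 2) Look up where the table's data is supposed to live.
    location = metastore[table]["location"]
    # 3) Read whatever files are there; a missing directory yields no rows.
    files = hdfs.get(location, {})
    rows = [row for name in sorted(files) for row in files[name]]
    return "OK", rows


# The metastore keeps the schema and location independently of the data.
metastore = {
    "tab3": {
        "columns": [("c1", "int"), ("c2", "int")],
        "location": "/user/hive/warehouse/tab3",
    }
}
# Sample data only: the four rows visible in the original message.
hdfs = {
    "/user/hive/warehouse/tab3": {
        "orhc466fb981": [(4, 2), (4, 10), (7, 4), (7, 22)],
    }
}

print(select_all("tab3", metastore, hdfs))  # OK plus the sample rows
del hdfs["/user/hive/warehouse/tab3"]       # stands in for `hadoop fs -rmr`
print(select_all("tab3", metastore, hdfs))  # still OK, but no rows
```

In this model the second call succeeds with an empty result because only step 1 can fail, which mirrors the behaviour reported in the original message: the schema survives in the metastore, so the query has nothing to read but nothing to complain about either.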

On Tue, Apr 24, 2012 at 6:25 AM, Sukhendu Chakraborty <[EMAIL PROTECTED]> wrote:

> I have a hive table tab3 with two columns (c1 int, c2 int)
>
> hive> load data local inpath '/tmp/orhc466fb981' into table tab3;
> Copying data from file:/tmp/orhc466fb981
> Copying file: file:/tmp/orhc466fb981
> Loading data to table default.tab3
> OK
> Time taken: 3.907 seconds
> hive> select * from tab3;
> OK
> 4       2
> 4       10
> 7       4
> 7       22
> .....
> //remove the tab3 directory from hdfs
> [schakrab@diy-1-2 orch]$ hadoop fs -rmr /user/hive/warehouse/tab3;
> Deleted hdfs://localhost:9000/user/hive/warehouse/tab3
> [schakrab@diy-1-2 orch]$ hive
> Hive history
> file=/tmp/schakrab/hive_job_log_schakrab_201204231748_1985146177.txt
> //no error thrown!
> hive> select * from tab3;
> OK
> Time taken: 3.68 seconds
> // of course. metadata still exists.
> hive> desc tab3;
> OK
> c1      int
> c2      int
> Time taken: 0.127 seconds
>
> // doing another load recreates the directory tab3
>
> Shouldn't the select * query return an error when the underlying table
> file is removed ?
>
> -Sukhendu
>

--
Nitin Pawar