Hive, mail # user - msck repair table not adding partitions which contains data.


Krishnappa, Suresh 2013-02-07, 19:46
Hi All,
I have created a partitioned Hive external table as follows:

create external table test_part (key int, val int) partitioned by (part int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/test/';

I have the following folders and files under /test:

/test/part=1
/test/part=1/1.txt
/test/part=2
/test/part=2/1.txt
/test/part=3
/test/part=4

Now I try to add the partitions to the table automatically using the 'msck repair' command:
hive> msck repair table test_part;
OK
Partitions not in metastore:    test_part:part=3        test_part:part=4
Repair: Added partition to metastore test_part:part=3
Repair: Added partition to metastore test_part:part=4
Time taken: 0.685 seconds

As you can see, only the partitions that do not contain any data (part=3 and part=4) have been added. The part=1 and part=2 folders, which each contain a file, have been ignored.
Is this by design? Am I using this correctly?

The only alternative I have found is to add the partitions explicitly, running 'alter table add partition' once for every subfolder.
Is there a simpler way to achieve this?
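
To illustrate the explicit workaround, the statements could be generated from a directory listing rather than typed by hand. This is only a minimal sketch under some assumptions: `add_partition_statements` is a hypothetical helper (not part of Hive), the list of paths is supplied by the caller (for example copied from a listing of /test), partition values are integers as in the table above, and the Hive version in use accepts `IF NOT EXISTS` and `LOCATION` in `ALTER TABLE ... ADD PARTITION`.

```python
import re

# Hypothetical helper, not part of Hive: builds one ALTER TABLE statement
# per partition directory found directly under the table location.
# Assumes integer partition values, as in "partitioned by (part int)".
PARTITION_DIR = re.compile(r"^(?P<col>\w+)=(?P<val>\d+)$")


def add_partition_statements(table, location, paths):
    """Return ALTER TABLE ... ADD PARTITION statements for partition dirs."""
    root = location.rstrip("/")
    seen = set()
    statements = []
    for path in paths:
        # Ignore anything that is not below the table location.
        if not path.startswith(root + "/"):
            continue
        # First path component below the location, e.g. "part=1".
        first = path[len(root) + 1:].split("/")[0]
        match = PARTITION_DIR.match(first)
        if match is None or first in seen:
            continue
        seen.add(first)
        statements.append(
            "ALTER TABLE {t} ADD IF NOT EXISTS PARTITION ({c}={v}) "
            "LOCATION '{r}/{d}';".format(
                t=table,
                c=match.group("col"),
                v=match.group("val"),
                r=root,
                d=first,
            )
        )
    return statements


if __name__ == "__main__":
    # Paths copied from the listing in this message.
    listing = [
        "/test/part=1",
        "/test/part=1/1.txt",
        "/test/part=2",
        "/test/part=2/1.txt",
        "/test/part=3",
        "/test/part=4",
    ]
    for statement in add_partition_statements("test_part", "/test/", listing):
        print(statement)
```

The printed statements could then be pasted into the Hive shell or saved to a script file. This still adds each partition explicitly, so it only automates the workaround and does not explain the msck behaviour.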

Thanks in advance
Suri
Mark Grover 2013-02-07, 19:53
Krishnappa, Suresh 2013-02-07, 21:17