Hive, mail # user - msck repair table not adding partitions which contains data.


Krishnappa, Suresh 2013-02-07, 19:46
Mark Grover 2013-02-07, 19:53
RE: msck repair table not adding partitions which contains data.
Krishnappa, Suresh 2013-02-07, 21:17
Thanks Mark

From: Mark Grover [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 07, 2013 2:54 PM
To: [EMAIL PROTECTED]
Subject: Re: msck repair table not adding partitions which contains data.

Suresh,
Take a look at this:
https://issues.apache.org/jira/browse/HIVE-3231
On Thu, Feb 7, 2013 at 11:46 AM, Krishnappa, Suresh <[EMAIL PROTECTED]> wrote:
Hi All,
I have created a partitioned Hive external table as follows:

create external table test_part (key int, val int) partitioned by (part int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/test/';

I have the following folders and files in /test

/test/part=1
/test/part=1/1.txt
/test/part=2
/test/part=2/1.txt
/test/part=3
/test/part=4

Now I try to add the partitions to the table automatically using the 'msck repair' command:
hive> msck repair table test_part;
OK
Partitions not in metastore:    test_part:part=3        test_part:part=4
Repair: Added partition to metastore test_part:part=3
Repair: Added partition to metastore test_part:part=4
Time taken: 0.685 seconds

As you can see, only the partitions that contain no data were added. The part=1 and part=2 folders, which do contain data, were ignored.
Is this by design? Am I using this correctly?

The only alternative I found is to add the partitions explicitly with 'alter table add partition', once for every subfolder.
Is there a simpler way to achieve this?

Thanks in advance
Suri
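
For the explicit 'alter table add partition' workaround described above, here is a minimal sketch of how the per-subfolder statements might be generated instead of typed by hand. It is written in Python and only prints HiveQL; it does not talk to Hive or HDFS. The function name add_partition_statements is hypothetical, the directory names are passed in as a plain list (how you list them is up to you), and partition values are written unquoted on the assumption that the partition column is an int, as in test_part.

```python
def add_partition_statements(table, base_location, partition_dirs):
    """Return one ALTER TABLE ... ADD PARTITION statement per partition directory.

    Hypothetical helper: assumes single-level key=value directory names
    and an integer partition column (values are emitted unquoted).
    """
    statements = []
    for name in partition_dirs:
        # Directory names follow Hive's key=value layout, e.g. "part=1".
        key, sep, value = name.partition("=")
        if not sep or not key or not value:
            continue  # skip anything that is not a key=value directory
        location = base_location.rstrip("/") + "/" + name
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({key}={value}) LOCATION '{location}';"
        )
    return statements


if __name__ == "__main__":
    # The two data-bearing folders from the example in this thread.
    for stmt in add_partition_statements("test_part", "/test/", ["part=1", "part=2"]):
        print(stmt)
```

The printed statements could be saved to a file and run with hive -f. IF NOT EXISTS is included so that rerunning the file does not fail on partitions that are already in the metastore.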