Hive >> mail # user >> msck repair table not adding partitions which contains data.


RE: msck repair table not adding partitions which contains data.
Thanks Mark

From: Mark Grover [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 07, 2013 2:54 PM
To: [EMAIL PROTECTED]
Subject: Re: msck repair table not adding partitions which contains data.

Suresh,
Take a look at this:
https://issues.apache.org/jira/browse/HIVE-3231

On Thu, Feb 7, 2013 at 11:46 AM, Krishnappa, Suresh <[EMAIL PROTECTED]> wrote:
Hi All,
I have created a partitioned Hive external table as follows:

create external table test_part (key int, val int) partitioned by (part int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/test/';

I have the following folders and files in /test

/test/part=1
/test/part=1/1.txt
/test/part=2
/test/part=2/1.txt
/test/part=3
/test/part=4

Now I try to automatically add the partitions into the table using the 'msck repair' command
hive> msck repair table test_part;
OK
Partitions not in metastore:    test_part:part=3        test_part:part=4
Repair: Added partition to metastore test_part:part=3
Repair: Added partition to metastore test_part:part=4
Time taken: 0.685 seconds

As you can see, only the partitions that do not contain any data have been added. The part=1 and part=2 folders, which do contain data, have been ignored.
Is this by design? Am I using this correctly?

The only alternative I found is to explicitly add the partitions using 'alter table add partition', once for every subfolder.
Is there a simpler way to achieve this?
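
For reference, the explicit alternative mentioned above might look roughly like this for the two folders that msck skipped. It is only a sketch based on the table name and paths given in this message. The IF NOT EXISTS and LOCATION clauses are assumed to be available in the Hive version in use, and the final SHOW PARTITIONS is there only to check the result.

```sql
-- Register the two data-bearing folders that msck repair skipped.
-- One statement per partition, so each can be rerun independently.
ALTER TABLE test_part ADD IF NOT EXISTS PARTITION (part=1) LOCATION '/test/part=1';
ALTER TABLE test_part ADD IF NOT EXISTS PARTITION (part=2) LOCATION '/test/part=2';

-- List the partitions the metastore now knows about.
SHOW PARTITIONS test_part;
```

With IF NOT EXISTS, rerunning the statements after a later msck repair should not fail on partitions that are already registered.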

Thanks in advance
Suri
