Hive >> mail # user >> question about Hive 'recover partitions' on AWS S3


question about Hive 'recover partitions' on AWS S3
Hi,

Is it ever possible to omit the partition variable name when discovering partitions? I'm sure I've seen this demonstrated, but of course now that I need it, I can't find it. Can anyone clarify?

I have a number of date-named directories in Amazon S3, containing data stored in SequenceFile format:

s3://mybucket/path/to/data/20120220
s3://mybucket/path/to/data/20120221
...
s3://mybucket/path/to/data/20120226

Here's my Hive session:

hive> CREATE EXTERNAL TABLE myData (text STRING) PARTITIONED BY (d STRING) STORED AS SEQUENCEFILE LOCATION 's3n://mybucket/path/to/data/Fixture';
hive> ALTER TABLE myData ADD IF NOT EXISTS PARTITION (d='20120220') LOCATION 's3n://mybucket/path/to/data/20120220';
hive> DESCRIBE myData;
text string
d string
hive> SHOW PARTITIONS myData;
d=20120220
hive> ALTER TABLE myData RECOVER PARTITIONS;
Time taken: 0.488 seconds
hive> SHOW PARTITIONS myData;
d=20120220

I'd like to see all my directories discovered as partitions:

hive> SHOW PARTITIONS myData;
d=20120220
d=20120221
...
d=20120226

Is this only possible using prefixes on my directory names, e.g.

s3://mybucket/path/to/data/d=20120220
s3://mybucket/path/to/data/d=20120221
...
s3://mybucket/path/to/data/d=20120226

Or is there another way to auto-recover partitions?

Thanks in advance,

Tony
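
For reference, here is a hedged sketch of one workaround that does not require renaming the directories. It assumes that automatic discovery only picks up directories named in the key=value form, so each date-named directory is instead registered explicitly with the same ALTER TABLE ... ADD PARTITION statement used in the session above. The helper name add_partition_statements is hypothetical, the directory names are hard-coded rather than listed from S3, and the 8-digit filter is an assumption about how the date directories are named.

```python
import re


def add_partition_statements(table, base_uri, dir_names, key="d"):
    """Build one ALTER TABLE ... ADD PARTITION statement per date-named directory.

    Names that are not exactly 8 digits are skipped (assumed not to be date directories).
    """
    base = base_uri.rstrip("/")
    statements = []
    for name in sorted(dir_names):
        if not re.fullmatch(r"\d{8}", name):
            continue  # ignore anything that does not look like YYYYMMDD
        statements.append(
            "ALTER TABLE {t} ADD IF NOT EXISTS PARTITION ({k}='{v}') "
            "LOCATION '{b}/{v}';".format(t=table, k=key, v=name, b=base)
        )
    return statements


if __name__ == "__main__":
    # Hard-coded for illustration; in practice these would come from a bucket listing.
    dirs = ["20120220", "20120221", "20120226"]
    for stmt in add_partition_statements("myData", "s3n://mybucket/path/to/data", dirs):
        print(stmt)
```

The printed statements could be saved to a file and run with the Hive CLI; IF NOT EXISTS makes re-running the script harmless for partitions that are already registered.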
