Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive, mail # user - HIVE issues when using large number of partitions


+
Suresh Krishnappa 2013-03-07, 14:31
Copy link to this message
-
Re: HIVE issues when using large number of partitions
Ramki Palle 2013-03-09, 16:49
Check this for your first question:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions

Please post if you find any solution for your 2nd and 3rd questions.

Regards,
Ramki.
On Thu, Mar 7, 2013 at 8:01 PM, Suresh Krishnappa <
[EMAIL PROTECTED]> wrote:

> Hi All,
> I have a hadoop cluster with data present in large number of directories (
> > 10,000)
> To run HIVE queries over this data I created an external partitioned table
> and pointed each directory as a partition to the external table using
> 'alter table add partition' command.
> Is there a better way to create a HIVE external table over large number of
> directories?
>
> Also I am facing the following issues due to the large number of partitions
> 1) The DDL operations of creating the table and adding partitions to the
> table takes a very long time. Takes about an hour to add around 10,000
> partitions
> 2) Getting 'out of memory' java exception while adding partitions > 50000
> 3) Sometimes getting 'out of memory' java exception for select queries for
> partitions > 10000
>
> What is the recommended limit to the number of partitions that we can
> create with an HIVE table?
> Are there any configuration settings in hive/hadoop to support large
> number of partitions?
>
> I am using HIVE 0.10.0. I re-ran the tests by replacing derby with
> postgresql as metastore and still faced similar issues.
>
> Would appreciate any inputs on this
>
> Thanks
> Suresh
>
>
+
Edward Capriolo 2013-03-09, 17:53