Hive >> mail # user >> Maximum Number of Hive Partitions = 256?


Re: Maximum Number of Hive Partitions = 256?
Same here ... we have way more than 256 partitions in multiple tables. I am
sure the issue has something to do with an empty string being passed to the
substr function. Can you validate that the table has no null/empty string for
user_name, or try running the query with length(user_name) > 1 (not sure about
the query syntax)?
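
A minimal sketch of that check, reusing the table and column names from the original query (u_s_h_b, user_name, dtpartition) and Hive's length() function:

```sql
-- Count rows in the failing partition whose user_name is NULL or empty;
-- a non-zero count would support the empty-string theory.
SELECT count(*)
FROM u_s_h_b
WHERE dtpartition = '2010-10-24'
  AND (user_name IS NULL OR length(user_name) = 0);

-- Or re-run the original aggregation with those rows filtered out and
-- see whether the job still fails:
SELECT substr(user_name, 1, 1), count(*)
FROM u_s_h_b
WHERE dtpartition = '2010-10-24'
  AND user_name IS NOT NULL
  AND length(user_name) > 0
GROUP BY substr(user_name, 1, 1);
```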

On Tue, May 3, 2011 at 7:02 PM, Steven Wong <[EMAIL PROTECTED]> wrote:

> I have way more than 256 partitions per table. AFAIK, there is no partition
> limit.
>
>
>
> From your stack trace, you have some host name issue somewhere.
>
>
>
>
>
> *From:* Time Less [mailto:[EMAIL PROTECTED]]
> *Sent:* Tuesday, May 03, 2011 6:52 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* Maximum Number of Hive Partitions = 256?
>
>
>
> I created a partitioned table, partitioned daily. If I query the earlier
> partitions, everything works. The later ones fail with error:
>
> hive> select substr(user_name,1,1),count(*) from u_s_h_b where
> dtpartition='2010-10-24' group by substr(user_name,1,1) ;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> java.lang.ArrayIndexOutOfBoundsException: 0
>     at
> org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:556)
>     at
> org.apache.hadoop.mapred.FileInputFormat.getSplitHosts(FileInputFormat.java:524)
>     at
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:235)
> ......snip.......
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Job Submission failed with exception
> 'java.lang.ArrayIndexOutOfBoundsException(0)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
>
> It turns out that 2010-10-24 is 257 days from the very first partition in
> my dataset (2010-01-09):
>
> +-----------------------------------------+
> | date_sub('2010-10-24',interval 257 day) |
> +-----------------------------------------+
> | 2010-02-09                              |
> +-----------------------------------------+
>
> That seems like an interesting coincidence. But try as I might, the Great
> Googles will not show me a way to tune this, or even whether it is tunable or
> expected. Has anyone else run into a 256-partition limit in Hive? How do you
> work around it? Why is that even the limit?! Shouldn't it be more like
> 32-bit maxint??!!
>
> Thanks!
>
> --
> Tim Ellis
> Riot Games
>
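For what it's worth, the date arithmetic in the quoted message can be double-checked outside of MySQL. A quick sketch in Python, using only the dates cited above:

```python
from datetime import date, timedelta

# 257 days before the failing partition 2010-10-24, matching the
# MySQL date_sub output in the quoted message.
failing = date(2010, 10, 24)
print(failing - timedelta(days=257))  # 2010-02-09

# Note this lands on 2010-02-09, not the 2010-01-09 the message cites
# as the first partition; 2010-01-09 is actually 288 days earlier.
print((failing - date(2010, 1, 9)).days)  # 288
```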