Hive >> mail # user >> Optimizing ORC Sorting - Replace two level Partitions with one?


John Omernik 2013-08-10, 15:56
Re: Optimizing ORC Sorting - Replace two level Partitions with one?
Will bucketing help, if you know there is a finite number of partitions?
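
A minimal sketch of what the bucketing idea could look like in HiveQL, keeping the day partition and turning source into a regular column that is bucketed and sorted. The table names (events_orc, events_rc), the non-partition columns, and the bucket count of 16 are assumptions for illustration only; nothing in the thread specifies them.

```sql
-- Hypothetical target table: one partition level (day), with source as a
-- plain column. Rows are hashed into buckets by source and sorted by source
-- inside each bucket file.
CREATE TABLE events_orc (
  event_time STRING,   -- assumed column
  source     STRING,   -- formerly the second-level partition
  payload    STRING    -- assumed column
)
PARTITIONED BY (day STRING)
CLUSTERED BY (source) SORTED BY (source) INTO 16 BUCKETS  -- 16 is arbitrary
STORED AS ORC;

-- On Hive releases before 2.0 these settings make INSERT honor the
-- declared bucketing and sort order.
SET hive.enforce.bucketing = true;
SET hive.enforce.sorting = true;

-- Allow the day partition to be taken from the data.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Copy from the existing two-level RC table (hypothetical name events_rc).
-- The dynamic partition column (day) must come last in the SELECT list.
INSERT OVERWRITE TABLE events_orc PARTITION (day)
SELECT event_time, source, payload, day
FROM events_rc;
```

With only 6-15 distinct sources, several sources may hash into the same bucket; sorting within each bucket keeps their rows contiguous, which is what would let ORC's min/max indexes skip data for a filter on source. Whether this comes close to the two-level partition performance would need to be measured.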
On Sat, Aug 10, 2013 at 9:26 PM, John Omernik <[EMAIL PROTECTED]> wrote:

> I have a table that currently uses RC files and has two levels of
> partitions.  day and source.  The table is first partitioned by day, then
> within each day there are 6-15 source partitions.  This makes for a lot of
> crazy partitions and was wondering if there'd be a way to optimize this
> with ORC files and some sorting.
>
> Specifically, would there be a way in a new table to make source a field
> (removing the partition) and somehow, as I am inserting into this new setup
> sort by source in such a way that will help separate the files/indexes in a
> way that gives me almost the same performance as ORC with the two level
> partitions?  Just trying to optimize here and curious what people think.
>
> John
>

--
Nitin Pawar
Edward Capriolo 2013-08-10, 16:39
Nitin Pawar 2013-08-10, 16:46
John Omernik 2013-08-10, 16:58
Edward Capriolo 2013-08-10, 17:19
John Omernik 2013-08-10, 17:35
Nitin Pawar 2013-08-10, 18:33