Hive >> mail # user >> Optimizing ORC Sorting - Replace two level Partitions with one?


John Omernik 2013-08-10, 15:56
Re: Optimizing ORC Sorting - Replace two level Partitions with one?
Will bucketing help, if you know there is a finite number of partitions?
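
To make the bucketing suggestion concrete: bucketing assigns each row to one of a fixed number of files by hashing the bucketing column. The following is only a minimal Python sketch of that idea. The Java-style string hash and the source names are assumptions for illustration, and Hive's actual hash function may differ.

```python
def java_style_hash(s: str) -> int:
    """Java-style 32-bit string hash (an assumption, used only for illustration)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def bucket_for(source: str, num_buckets: int) -> int:
    """Map a source value to a bucket index in the range [0, num_buckets)."""
    return (java_style_hash(source) & 0x7FFFFFFF) % num_buckets


# Hypothetical source names; the thread only says there are 6-15 per day.
sources = ["web", "mobile", "api", "batch", "partner", "internal"]
assignment = {s: bucket_for(s, 8) for s in sources}
```

Because the bucket is a hash modulo the bucket count, two different sources can land in the same bucket, so bucketing alone does not guarantee one file per source the way a source partition does.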
On Sat, Aug 10, 2013 at 9:26 PM, John Omernik <[EMAIL PROTECTED]> wrote:

> I have a table that currently uses RC files and has two levels of
> partitions: day and source.  The table is first partitioned by day, then
> within each day there are 6-15 source partitions.  This makes for a lot of
> crazy partitions, and I was wondering if there'd be a way to optimize this
> with ORC files and some sorting.
>
> Specifically, would there be a way in a new table to make source a field
> (removing the partition) and somehow, as I am inserting into this new setup,
> sort by source in a way that separates the files/indexes and
> gives me almost the same performance as ORC with the two-level
> partitions?  Just trying to optimize here and curious what people think.
>
> John
>
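
The sorting idea in the quoted question relies on ORC keeping min/max statistics for chunks of rows, so a reader filtering on source can skip chunks whose range excludes the value. Below is a rough Python sketch of that effect, not ORC's actual reader logic. The chunk size, the source names `s1` to `s6`, and the helper functions are all hypothetical.

```python
def stripes(rows, stripe_size):
    """Split rows into fixed-size chunks, standing in for ORC stripes."""
    return [rows[i:i + stripe_size] for i in range(0, len(rows), stripe_size)]


def min_max_index(chunks):
    """Record the (min, max) source value of each chunk."""
    return [(min(c), max(c)) for c in chunks]


def stripes_to_read(index, value):
    """Return the chunks whose min/max range could contain the value."""
    return [i for i, (lo, hi) in enumerate(index) if lo <= value <= hi]


# Hypothetical data: 6 sources arriving interleaved within one day.
sources = ["s1", "s2", "s3", "s4", "s5", "s6"]
unsorted_rows = [sources[i % 6] for i in range(600)]
sorted_rows = sorted(unsorted_rows)

unsorted_hits = stripes_to_read(min_max_index(stripes(unsorted_rows, 100)), "s3")
sorted_hits = stripes_to_read(min_max_index(stripes(sorted_rows, 100)), "s3")

print(unsorted_hits)  # [0, 1, 2, 3, 4, 5]: every chunk spans s1..s6
print(sorted_hits)    # [2]: only one chunk can contain s3
```

With interleaved rows every chunk covers the full range of sources and nothing can be skipped. After sorting by source, a filter on one source touches a single chunk in this toy setup, which is the behaviour the question hopes will stand in for the second partition level.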

--
Nitin Pawar
Edward Capriolo 2013-08-10, 16:39
Nitin Pawar 2013-08-10, 16:46
John Omernik 2013-08-10, 16:58
Edward Capriolo 2013-08-10, 17:19
John Omernik 2013-08-10, 17:35
Nitin Pawar 2013-08-10, 18:33