Re: Looking at the columns table
Hey Ed,

Your thinking is correct and has been implemented in

Time to upgrade to 0.8 :)


On Wed, Apr 11, 2012 at 07:53, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> Hey all. Our metastore in MySQL is fairly large, over 12GB. All of
> that storage is in the columns table. It seems that each column is
> stored once per partition/storage descriptor, as a one-to-many
> relationship. In our case all the partitions have the same column
> definitions. My thinking: should the relationship from columns to
> partition/storage descriptor be many-to-many instead? That way we
> store each column only once, and the current columns table can
> reference the primary key of that column. This should bring the size
> of this table down drastically.
> Since every other table in the metastore is so small, this huge
> columns table looks like the only scalability choke point we have.
> Edward
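
To make the size argument above concrete, here is a minimal sketch that compares the two layouts in an in-memory SQLite database. Every table and column name in it (columns_per_sd, column_sets, columns_shared, storage_descs) is hypothetical and chosen for illustration only; this is not the actual Hive metastore schema, and sharing a whole column set between storage descriptors is just one possible way to get the many-to-many effect described in the quoted message.

```python
import sqlite3


def compare_layouts(num_partitions, columns):
    """Return (rows in per-descriptor layout, rows in shared layout).

    `columns` is a list of (name, type) pairs that every partition shares.
    All table names are hypothetical, not the real metastore tables.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Layout A: every storage descriptor owns a full copy of its columns.
        CREATE TABLE columns_per_sd (sd_id INTEGER, idx INTEGER, name TEXT, type TEXT);
        -- Layout B: a column set is stored once and referenced by descriptors.
        CREATE TABLE column_sets (cs_id INTEGER PRIMARY KEY, signature TEXT UNIQUE);
        CREATE TABLE columns_shared (cs_id INTEGER, idx INTEGER, name TEXT, type TEXT);
        CREATE TABLE storage_descs (sd_id INTEGER PRIMARY KEY, cs_id INTEGER);
    """)
    # A simple textual key used to detect an identical column list.
    signature = ",".join(f"{name}:{ctype}" for name, ctype in columns)

    for sd_id in range(num_partitions):
        # Layout A: repeat the column rows for this descriptor.
        db.executemany(
            "INSERT INTO columns_per_sd VALUES (?, ?, ?, ?)",
            [(sd_id, i, name, ctype) for i, (name, ctype) in enumerate(columns)],
        )
        # Layout B: reuse the column set if it has been stored already.
        row = db.execute(
            "SELECT cs_id FROM column_sets WHERE signature = ?", (signature,)
        ).fetchone()
        if row is None:
            cs_id = db.execute(
                "INSERT INTO column_sets (signature) VALUES (?)", (signature,)
            ).lastrowid
            db.executemany(
                "INSERT INTO columns_shared VALUES (?, ?, ?, ?)",
                [(cs_id, i, name, ctype) for i, (name, ctype) in enumerate(columns)],
            )
        else:
            cs_id = row[0]
        db.execute("INSERT INTO storage_descs VALUES (?, ?)", (sd_id, cs_id))

    per_sd = db.execute("SELECT COUNT(*) FROM columns_per_sd").fetchone()[0]
    shared = db.execute("SELECT COUNT(*) FROM columns_shared").fetchone()[0]
    db.close()
    return per_sd, shared


if __name__ == "__main__":
    cols = [("id", "bigint"), ("name", "string"), ("ts", "string")]
    print(compare_layouts(1000, cols))
```

With 1000 partitions that share the same three columns, the per-descriptor layout holds 3000 column rows while the shared layout holds 3, so the column storage grows with the number of distinct column lists rather than with the number of partitions.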