Hive >> mail # user >> Bucketing external tables


Re: Bucketing external tables
Hi Sadu

For a bucketed map join to be used, you need to ensure that:
1) Both tables are bucketed on the join columns.
2) The number of buckets in one table is the same as, or a multiple of, the number of buckets in the other.
Only then will the bucketed map join work. Otherwise Hive falls back to a normal reduce-side join.
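
To make the two conditions concrete, here is a minimal Python sketch that restates them as a check. It is not Hive's planner logic, only the rule as described in this message. The `BucketedTable` class, the `can_bucket_map_join` function, and the table and column names are hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketedTable:
    """Hypothetical description of a bucketed table's layout."""
    name: str
    bucket_cols: tuple  # columns named in CLUSTERED BY (...)
    num_buckets: int    # the n in INTO n BUCKETS


def can_bucket_map_join(left, right, left_join_cols, right_join_cols):
    """Return True if both conditions from the message above hold."""
    # Condition 1: each table is bucketed on its side of the join columns.
    on_join_cols = (
        tuple(left_join_cols) == left.bucket_cols
        and tuple(right_join_cols) == right.bucket_cols
    )
    # Condition 2: bucket counts are equal, or one is a multiple of the other.
    small, large = sorted((left.num_buckets, right.num_buckets))
    counts_ok = small > 0 and large % small == 0
    return on_join_cols and counts_ok


# Illustrative tables only; names and bucket counts are assumptions.
orders = BucketedTable("orders", ("customer_id",), 360)
customers = BucketedTable("customers", ("customer_id",), 90)
events = BucketedTable("events", ("customer_id",), 100)

print(can_bucket_map_join(orders, customers, ["customer_id"], ["customer_id"]))
print(can_bucket_map_join(orders, events, ["customer_id"], ["customer_id"]))
```

The first call satisfies both conditions (360 is a multiple of 90). The second fails condition 2, since neither of 360 and 100 is a multiple of the other, so by the rule above the query would run as a normal reduce-side join.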

If you have already checked these, can you please post your CLI logs here so that we can help you better?

 
Regards,
Bejoy KS
________________________________
 From: Sadananda Hegde <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Thursday, April 11, 2013 11:16 PM
Subject: Re: Bucketing external tables
 
I was able to load data into the bucketed tables. I verified that the number of files created in each of the partition folders matches the number of buckets specified in my CREATE statement. But I don't see any improvement in query speed. I tried with 90, 360 and 720 buckets. I have set hive.enforce.bucketing=true and hive.optimize.bucketmapjoin=true. Do I need to set any other parameters? Do I need to use the MAPJOIN hint in my join for it to use the bucket join? I am not sure where to look to verify that it's using map-side bucket joins. I would greatly appreciate any help or insight.
 
Regards,
Sadu