Re: Map side join
Hello Everybody,

I need help with a HIVE join. As we were talking about the map side join, I
tried that.
I set the flag: set hive.auto.convert.join=true;

I saw that Hive converts the query to a map join while launching the job. But the
problem is that none of the map tasks make progress in my case. I made the
dataset smaller. Now it's only a 512 MB table joined with a 25 MB table. I was
expecting it to be done very quickly.
No luck with any change of settings.
Since it failed to progress with the default settings, I changed these settings:
set hive.mapred.local.mem=1024; // Initially it was 216 I guess
set hive.join.cache.size=100000; // Initially it was 25000

Also, on the Hadoop side I made this change:

mapred.child.java.opts -Xmx1073741824

But I don't see any progress. After running for more than 40 minutes, I am
still at 0% map completion.
Can you please shed some light on this?

Thanks a lot once again.

Regards,
Souvik.
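
One way to check what Hive planned for the join described above is to run EXPLAIN on the query before launching it. The following is a minimal sketch, not a diagnosis of the stall. The table names A and B and the columns id1, id2, id3 come from the original question quoted below; the SELECT list is a placeholder, and the value 50000000 is only an illustrative assumption. hive.mapjoin.smalltable.filesize is the size threshold in bytes (default 25000000) under which a table is treated as small for automatic conversion, so a 25 MB table sits close to that limit.

```sql
-- Sketch only: names A, B, id1..id3 are taken from the quoted question;
-- the SELECT list and the threshold value are assumptions for illustration.
SET hive.auto.convert.join=true;

-- Threshold in bytes for the "small" table (default 25000000).
-- 50000000 is an arbitrary example value, not a recommendation.
SET hive.mapjoin.smalltable.filesize=50000000;

-- Print the plan without running the job.
EXPLAIN
SELECT A.*, B.*
FROM A JOIN B
  ON (A.id1 = B.id1 AND A.id2 = B.id2 AND A.id3 = B.id3);
```

If the conversion happened, the plan typically contains a map join operator and a local work stage that loads the small table. Running the same query with hive.auto.convert.join=false gives a common (reduce side) join to compare against.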

On Fri, Dec 7, 2012 at 2:32 PM, Souvik Banerjee <[EMAIL PROTECTED]> wrote:

> Hi Bejoy,
>
> That's wonderful. Thanks for your reply.
> What I was wondering is whether HIVE can do a map side join with more than one
> condition in the JOIN clause.
> I'll simply try it out and post the result.
>
> Thanks once again.
>
> Regards,
> Souvik.
>
>  On Fri, Dec 7, 2012 at 2:10 PM, <[EMAIL PROTECTED]> wrote:
>
>> Hi Souvik
>>
>> In earlier versions of hive you had to give the map join hint. But in
>> later versions just set hive.auto.convert.join = true;
>> Hive automatically selects the smaller table. It is better to give the
>> smaller table as the first one in join.
>>
>> You can use a map join if you are joining a small table with a large one,
>> in terms of data size. By small, I mean it is better for the smaller table
>> to be in the range of MBs.
>> Regards
>> Bejoy KS
>>
>> Sent from remote device, Please excuse typos
>> ------------------------------
>> From: Souvik Banerjee <[EMAIL PROTECTED]>
>> Date: Fri, 7 Dec 2012 13:58:25 -0600
>> To: <[EMAIL PROTECTED]>
>> Reply-To: [EMAIL PROTECTED]
>> Subject: Map side join
>>
>> Hello everybody,
>>
>> I have a question. I didn't come across any post which says something
>> about this.
>> I have two tables, let's say A and B.
>> I want to join A & B in HIVE. I am currently using HIVE version 0.9.
>> The join would be on a few columns, like on (A.id1 = B.id1) AND (A.id2 = B.id2) AND (A.id3 = B.id3)
>>
>> Can I ask HIVE to use a map side join in this scenario? Should I give a
>> hint to HIVE by saying /*+mapjoin(B)*/?
>>
>> Get back to me if you want any more information in this regard.
>>
>> Thanks and regards,
>> Souvik.
>>
>
>
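
For the question in the original message, a minimal sketch of the hinted form with a multi-column condition might look like the following. The table and column names are those from the question; the SELECT list is a placeholder assumption. The sketch assumes all three conditions are equalities, since Hive 0.9 accepts only equality conditions in the JOIN clause.

```sql
-- Sketch only: A is assumed to be the large table and B the small one,
-- as implied by the /*+mapjoin(B)*/ hint in the question.
-- The hint asks Hive to load B into memory on each mapper.
SELECT /*+ MAPJOIN(B) */ A.*, B.*
FROM A JOIN B
  ON (A.id1 = B.id1 AND A.id2 = B.id2 AND A.id3 = B.id3);
```

With hive.auto.convert.join=true, as suggested in the reply, the same query without the hint leaves the choice of the small table to Hive.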