Cam Bazz 2012-06-13, 16:46
hello,
For all my log files, I log the session id and user cookie. Now I need to separate certain items of certain users, so I need to join all my data to a global cookie table.
What are some common practices for doing this? Just put it in a table and join? Or maybe keep it in some sort of in-memory cache?
Any ideas / recommendations are greatly appreciated.
best regards,
-
Re: joining user sessions
Bejoy KS 2012-06-13, 16:52
Hi,
If one of your tables is small enough, you can use a map-side join. It distributes the smaller table's contents through the distributed cache and performs the join in the map phase, which is much faster than a normal reduce-side join.
To enable map-side joins, set the following property before executing the join query:
hive> set hive.auto.convert.join=true;
Regards
Bejoy KS
Sent from handheld, please excuse typos.
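[Editor's sketch] To make the idea of a map-side join concrete, here is a minimal conceptual illustration in Python. It is not Hive's implementation; it only shows the general technique of loading the small table into an in-memory lookup and streaming the large table past it. The function name, the row layouts, and the sample data are hypothetical, and the sketch assumes each cookie appears at most once in the small table.

```python
from typing import Dict, Iterable, Iterator, Tuple


def map_side_join(
    log_rows: Iterable[Tuple[str, str]],     # (session_id, cookie): the large table
    cookie_rows: Iterable[Tuple[str, str]],  # (cookie, user): the small table
) -> Iterator[Tuple[str, str, str]]:
    """Inner-join log rows to cookie rows on the cookie column.

    The small table is loaded fully into memory, which is the part a
    framework would ship to every mapper. The large table is then
    streamed row by row, so no shuffle or reduce step is needed.
    """
    # Build the in-memory lookup from the small table.
    # Assumption: cookie values are unique in the small table.
    lookup: Dict[str, str] = {}
    for cookie, user in cookie_rows:
        lookup[cookie] = user

    # Stream the large table and probe the lookup for each row.
    for session_id, cookie in log_rows:
        user = lookup.get(cookie)
        if user is not None:
            yield (session_id, cookie, user)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    cookies = [("c1", "alice"), ("c2", "bob")]
    logs = [("s1", "c1"), ("s2", "c3"), ("s3", "c2"), ("s4", "c1")]
    for row in map_side_join(logs, cookies):
        print(row)
```

The sketch only works while the small table fits in memory, which is why this kind of join is limited to cases where one side is small.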
-
Re: joining user sessions
Cam Bazz 2012-06-13, 17:09
Thank you. But how do I put the smaller table into the distributed cache? Is it done automatically when we set hive.auto.convert.join=true?
-
Re: joining user sessions
Bejoy KS 2012-06-13, 17:10
Yes, the framework takes care of this.
Regards Bejoy KS
Sent from handheld, please excuse typos.