Vikas Jadhav 2013-01-24, 13:21
Harsh J 2013-01-24, 15:09
Praveen Sripati 2013-01-24, 16:52
Re: Modifying Hadoop For join Operation
Vikas Jadhav 2013-01-24, 19:11
Hi, thanks @Harsh for replying.
I am attaching a paper called Map-Join-Reduce. I want to implement a similar kind of architecture.

Currently MapReduce processes a join job using either a map-side or a reduce-side join. The reduce-side join has a drawback: for large datasets there is a lot of traffic (data movement) from the mappers to the reducers. One option is to filter out records using a Bloom-filter-like technique.

For this, I want to process all joins in a single MapReduce job:

1) Map phase: processes all datasets and filters out records.
2) Reduce phase: divided into a join step and a reducer step.
   join: joins all datasets
   reducer: does the aggregation

For R join S join T:

        Reduce
mapR
mapS -----> mapR join mapS => RS => RST --> Reducer (aggregation)
mapT --------------------------------------> mapT

If you have any ideas, please share them. Any other suggestions are also welcome if they reduce the completion time for joining large datasets.

Thank you.

On Thu, Jan 24, 2013 at 8:39 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Can you also define 'efficient way' and the idea you have in mind to
> implement that isn't already doable today?
>
> On Thu, Jan 24, 2013 at 6:51 PM, Vikas Jadhav <[EMAIL PROTECTED]>
> wrote:
> > Anyone has idea about how should i modify Hadoop Code for
> > Performing Join operation in efficient Way.
> > Thanks.
> >
> > --
> >
> >
> > Thanx and Regards
> > Vikas Jadhav
>
> --
> Harsh J

--
Thanx and Regards
Vikas Jadhav
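
To make the proposed flow concrete, here is a minimal sketch of it, simulated in plain Python rather than on Hadoop. It is not the Map-Join-Reduce paper's implementation and it does not use any Hadoop API. All names (`BloomFilter`, `map_phase`, `join_and_reduce`, `run_job`) are hypothetical, as are the choices to build the filter from the smallest dataset's keys and to sum the joined values as the aggregate. The map phase drops records whose key cannot have a join partner, and the reduce phase first joins R with S, then RS with T, and only then aggregates.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def map_phase(tag, records, key_filter):
    """Emit (key, (tag, value)) only for keys that may have a join partner."""
    for key, value in records:
        if key in key_filter:
            yield key, (tag, value)


def join_and_reduce(shuffled, tags):
    """Per key: join tag by tag (R with S, then RS with T), then aggregate."""
    totals = {}
    for key, tagged_values in shuffled.items():
        by_tag = defaultdict(list)
        for tag, value in tagged_values:
            by_tag[tag].append(value)
        joined = [()]
        for tag in tags:
            # A key missing from any dataset leaves `joined` empty.
            joined = [row + (v,) for row in joined for v in by_tag[tag]]
        if joined:
            # Assumed aggregate for illustration: sum of all joined values.
            totals[key] = sum(sum(row) for row in joined)
    return totals


def run_job(datasets):
    """Simulate one job: filtered map phase, shuffle, join, aggregation."""
    tags = list(datasets)
    # Assumption: the filter is built from the keys of the smallest dataset.
    smallest = min(tags, key=lambda t: len(datasets[t]))
    key_filter = BloomFilter()
    for key, _ in datasets[smallest]:
        key_filter.add(key)
    shuffled = defaultdict(list)  # stands in for the mapper-to-reducer shuffle
    for tag in tags:
        for key, tagged in map_phase(tag, datasets[tag], key_filter):
            shuffled[key].append(tagged)
    return join_and_reduce(shuffled, tags)


R = [(1, 10), (2, 20), (3, 30)]
S = [(1, 1), (2, 2), (4, 4)]
T = [(1, 100), (5, 500)]
print(run_job({"R": R, "S": S, "T": T}))  # {1: 111}
```

Only key 1 appears in all three datasets, so it is the only key in the result. A Bloom filter false positive can let an extra record through the map phase, but the join step still discards it, so the filter affects only the amount of shuffled data, not the answer.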