HDFS >> mail # user >> Modifying Hadoop For join Operation


Vikas Jadhav 2013-01-24, 13:21
Harsh J 2013-01-24, 15:09
Praveen Sripati 2013-01-24, 16:52
Re: Modifying Hadoop For join Operation
Hi, thanks @Harsh for replying.

I am attaching a paper called Map-Join-Reduce.

I want to implement a similar kind of architecture.

Currently, MapReduce processes a join job using either a map-side or a
reduce-side join.

The reduce-side join has a drawback:

 --> for large datasets there is a lot of traffic (data movement) from
      the mappers to the reducers (one option: we can filter out records
      using a Bloom-filter-like technique)
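
To make the Bloom filter option concrete, here is a minimal single-process sketch in Python. It is not Hadoop code: the class BloomFilter, the functions map_with_filter and reduce_side_join, and the toy datasets R and S are all hypothetical names made up for illustration. The idea is to build the filter from the join keys of the smaller dataset and let the mappers drop records whose key cannot match, so fewer records reach the shuffle.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def map_with_filter(records, tag, bloom):
    """Map side: emit (key, (tag, value)) only for keys that may have a match."""
    for key, value in records:
        if bloom.might_contain(key):
            yield key, (tag, value)


def reduce_side_join(shuffled):
    """Reduce side: group by key, then pair up R and S values per key."""
    groups = {}
    for key, tagged in shuffled:
        groups.setdefault(key, []).append(tagged)
    joined = []
    for key in sorted(groups):
        r_vals = [v for t, v in groups[key] if t == "R"]
        s_vals = [v for t, v in groups[key] if t == "S"]
        joined.extend((key, r, s) for r in r_vals for s in s_vals)
    return joined


# Hypothetical toy inputs: R is small, S is large and mostly non-matching.
R = [(1, "r1"), (2, "r2")]
S = [(k, f"s{k}") for k in range(1, 101)]

# Build the filter from the join keys of the small side.
bloom = BloomFilter()
for key, _ in R:
    bloom.add(key)

# "Shuffle" is simulated by concatenating the filtered map outputs.
shuffled = list(map_with_filter(R, "R", bloom)) + list(map_with_filter(S, "S", bloom))
result = reduce_side_join(shuffled)
```

Any false positives that slip through the filter are harmless here: they reach the reducer but find no partner record and are dropped by the join itself.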

 For this, I want to process all joins in a single MapReduce job:
1) Map phase - processes all datasets and filters out records
2) Reduce phase -
   the reduce phase is divided into a join step and a reducer step

   join - joins all datasets
   reducer - does the aggregation

   For R join S join T:

   Map              Reduce
   mapR --+
          +--> mapR join mapS => RS --+
   mapS --+                           +--> RST --> Reducer (aggregation)
   mapT ------------------------------+
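
The flow for R join S join T can be sketched as a single-process simulation in Python. This is only an illustration of the phases described here, not the algorithm from the Map-Join-Reduce paper and not Hadoop API; the function names (map_phase, join_step, reduce_phase), the keep_all filter, and the toy datasets are all hypothetical.

```python
from collections import defaultdict


def map_phase(records, keep):
    """Map: scan one dataset and drop records that fail the filter."""
    return [(key, value) for key, value in records if keep(key, value)]


def join_step(left, right):
    """Join: hash join on key; values are tuples that get concatenated."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv + rv) for key, lv in left for rv in index.get(key, ())]


def reduce_phase(joined, aggregate):
    """Reducer: group joined rows by key and aggregate each group."""
    groups = defaultdict(list)
    for key, row in joined:
        groups[key].append(row)
    return {key: aggregate(rows) for key, rows in groups.items()}


# Hypothetical toy datasets: (join key, tuple of column values).
R = [(1, ("r1",)), (2, ("r2",)), (3, ("r3",))]
S = [(1, ("s1",)), (2, ("s2",)), (2, ("s2b",))]
T = [(2, ("t2",)), (3, ("t3",))]


def keep_all(key, value):
    """Placeholder filter; a real job would drop non-matching records here."""
    return True


# 1) Map phase: one filtered scan per dataset.
map_r, map_s, map_t = (map_phase(d, keep_all) for d in (R, S, T))

# 2) Reduce phase, join step: R join S => RS, then RS join T => RST.
rs = join_step(map_r, map_s)
rst = join_step(rs, map_t)

# 2) Reduce phase, reducer step: count joined rows per key.
counts = reduce_phase(rst, len)
```

Only key 2 appears in all three datasets, so it is the only key that survives both joins and reaches the aggregation.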

If you have any ideas, please share them.

Any other suggestions are also welcome if they reduce the completion
time for joining large datasets.
Thank you.

On Thu, Jan 24, 2013 at 8:39 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Can you also define 'efficient way' and the idea you have in mind to
> implement that isn't already doable today?
>
> On Thu, Jan 24, 2013 at 6:51 PM, Vikas Jadhav <[EMAIL PROTECTED]>
> wrote:
> > Does anyone have an idea about how I should modify the Hadoop code
> > to perform join operations in an efficient way?
> > Thanks.
> >
> > --
> >
> >
> > Thanx and Regards
> >  Vikas Jadhav
>
>
>
> --
> Harsh J
>

--
Thanx and Regards
Vikas Jadhav