Hadoop >> mail # user >> From X to Hadoop MapReduce


Re: From X to Hadoop MapReduce
Cool, James. I am very interested in contributing to this.
I think group by, join and order by can be added to the examples.
On Thu, Jul 22, 2010 at 4:59 AM, James Seigel <[EMAIL PROTECTED]> wrote:

> Oh yeah, it would help if I put the url:
>
> http://github.com/seigel/MRPatterns
>
> James
>
> On 2010-07-21, at 2:55 PM, James Seigel wrote:
>
> > Here is a skeleton project I stuffed up on GitHub (feel free to offer other suggestions/alternatives).  There is a wiki, a place to commit code, a place to fork around, etc.
> >
> > Over the next couple of days I’ll try to put up some samples for people to poke around with.  Feel free to attack the wiki, contribute code, etc.
> >
> > If anyone can derive some cool pseudocode for writing MapReduce-type algorithms, that’d be great.
> >
> > Cheers
> > James.
> >
> >
> > On 2010-07-21, at 10:51 AM, James Seigel wrote:
> >
> >> Jeff, I agree that Cascading looks cool and might/should have a place in everyone’s toolbox. However, at some corps it takes a while to get those kinds of changes in place, and therefore they might have to hand-craft some Java code before moving (if they ever can) to a different technology.
> >>
> >> I will get something up and going and post a link back for whoever is interested.
> >>
> >> To answer Himanshu’s question, I am thinking something like this (with some code):
> >>
> >> Hadoop M/R Patterns, and ones that match Pig Structures
> >>
> >> 1. COUNT: [Mapper] Spit out one key and the value of 1. [Combiner] Same as reducer. [Reducer] count = count + next.value.  [Emit] Single result.
> >> 2. FREQ COUNT: [Mapper] Item, 1.  [Combiner] Same as reducer. [Reducer] count = count + next.value.  [Emit] List of key, count.
> >> 3. UNIQUE: [Mapper] Item, 1.  [Combiner] None.  [Reducer + Emit] Spit out list of keys and no value.
> >>
> >> I think adding a description of why the technique works would be helpful for people learning as well.  I see some questions from people not understanding what happens to the data between mappers and reducers, or what data they will see when it gets to the reducer, etc.
> >>
> >> Cheers
> >> James.
> >>
> >
>
>
--
Best Regards

Jeff Zhang
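
For readers following the thread, here is a rough sketch of the three patterns James lists in the quoted message (COUNT, FREQ COUNT, UNIQUE). It is not taken from the MRPatterns repository and does not use the Hadoop API. It is a plain-Java, in-memory imitation of map -> shuffle -> reduce, meant only to show what a reducer sees after the data is grouped by key. The class name MRPatternsSketch, the helper mapAndShuffle, and the fixed key "count" are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// In-memory sketch of map -> shuffle -> reduce. No Hadoop classes are used;
// every name in this file is hypothetical and chosen for illustration only.
class MRPatternsSketch {

    // Map phase plus shuffle: each item becomes one (key, 1) pair, and the pairs
    // are grouped by key, so a "reducer" sees one key with all of its values.
    static Map<String, List<Long>> mapAndShuffle(List<String> items,
                                                 Function<String, String> keyOf) {
        Map<String, List<Long>> grouped = new TreeMap<>(); // keys arrive sorted
        for (String item : items) {
            grouped.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(1L);
        }
        return grouped;
    }

    // Reducer shared by COUNT and FREQ COUNT: count = count + next.value.
    // Because addition is associative, the same logic could serve as a combiner.
    static long sum(List<Long> values) {
        long count = 0;
        for (long value : values) {
            count += value;
        }
        return count;
    }

    // 1. COUNT: every item maps to the same fixed key, so one reducer call
    // receives all the 1s and emits a single result.
    static long count(List<String> items) {
        Map<String, List<Long>> grouped = mapAndShuffle(items, item -> "count");
        return sum(grouped.getOrDefault("count", new ArrayList<>()));
    }

    // 2. FREQ COUNT: the item itself is the key, so each reducer call sums the
    // 1s for one distinct item and emits (key, count).
    static Map<String, Long> freqCount(List<String> items) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : mapAndShuffle(items, item -> item).entrySet()) {
            result.put(e.getKey(), sum(e.getValue()));
        }
        return result;
    }

    // 3. UNIQUE: same mapping as FREQ COUNT, but the reducer ignores the values
    // and emits only the keys.
    static List<String> unique(List<String> items) {
        Map<String, List<Long>> grouped = mapAndShuffle(items, item -> item);
        return new ArrayList<>(grouped.keySet());
    }

    public static void main(String[] args) {
        List<String> items = List.of("b", "a", "b", "c", "a", "b");
        System.out.println(count(items));     // 6
        System.out.println(freqCount(items)); // {a=2, b=3, c=1}
        System.out.println(unique(items));    // [a, b, c]
    }
}
```

The grouping step is the part people tend to miss: the reducer never sees individual records, only one key at a time together with every value the mappers emitted for that key. In a real job the framework does that grouping across machines; here a TreeMap stands in for it.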