

Re: From X to Hadoop MapReduce
Cool, James. I would be very interested in contributing to this.
I think group by, join, and order by could be added to the examples.
On Thu, Jul 22, 2010 at 4:59 AM, James Seigel <[EMAIL PROTECTED]> wrote:

> Oh yeah, it would help if I put the url:
>
> http://github.com/seigel/MRPatterns
>
> James
>
> On 2010-07-21, at 2:55 PM, James Seigel wrote:
>
> > Here is a skeleton project I stuffed up on GitHub (feel free to offer other suggestions/alternatives). There is a wiki, a place to commit code, a place to fork around, etc.
> >
> > Over the next couple of days I’ll try to put up some samples for people to poke around with. Feel free to attack the wiki, contribute code, etc.
> >
> > If anyone can come up with some cool pseudocode for writing MapReduce-type algorithms, that’d be great.
> >
> > Cheers
> > James.
> >
> >
> > On 2010-07-21, at 10:51 AM, James Seigel wrote:
> >
> >> Jeff, I agree that Cascading looks cool and might/should have a place in everyone’s toolbox. However, at some corps it takes a while to get those kinds of changes in place, so they might have to handcraft some Java code before moving (if they ever can) to a different technology.
> >>
> >> I will get something up and going and post a link back for whoever is interested.
> >>
> >> To answer Himanshu’s question, I am thinking something like this (with some code):
> >>
> >> Hadoop M/R Patterns, and ones that match Pig Structures
> >>
> >> 1. COUNT: [Mapper] Emit one constant key with the value 1. [Combiner] Same as reducer. [Reducer] count = count + next.value. [Emit] Single result.
> >> 2. FREQ COUNT: [Mapper] Emit (item, 1). [Combiner] Same as reducer. [Reducer] count = count + next.value. [Emit] List of (key, count).
> >> 3. UNIQUE: [Mapper] Emit (item, 1). [Combiner] None. [Reducer + Emit] Emit each key once, with no value.
> >>
> >> I think adding a description of why each technique works would be helpful for people learning as well. I see some questions from people who don’t understand what happens to the data between the mappers and reducers, or what data they will see when it gets to the reducer, etc.
> >>
> >> Cheers
> >> James.
> >>
> >
>
>
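For anyone following along, here is a rough sketch of the three quoted patterns (COUNT, FREQ COUNT, UNIQUE), plus the step between mappers and reducers. It is plain Python running in memory, not Hadoop's API and not code from the MRPatterns repository. run_mapreduce, group_by_key, the mapper/reducer function names, the two-way split, and the sample words are all made up for illustration.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect all values per key and return the groups in key order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def run_mapreduce(records, mapper, reducer, combiner=None, splits=2):
    """Hypothetical in-memory stand-in for a MapReduce job."""
    # Divide the input into chunks, one per simulated map task.
    chunks = [records[i::splits] for i in range(splits)]
    map_outputs = []
    for chunk in chunks:
        pairs = [kv for record in chunk for kv in mapper(record)]
        if combiner is not None:
            # A combiner only sees the output of its own map task.
            pairs = [out for key, values in group_by_key(pairs)
                     for out in combiner(key, values)]
        map_outputs.extend(pairs)
    # "Shuffle": every value for a given key reaches one reduce call.
    return [out for key, values in group_by_key(map_outputs)
            for out in reducer(key, values)]


def count_mapper(record):
    # COUNT: the same key for every record, value 1.
    yield "count", 1


def item_mapper(record):
    # FREQ COUNT and UNIQUE: the item itself is the key.
    yield record, 1


def sum_reducer(key, values):
    # count = count + next.value; also usable as the combiner.
    yield key, sum(values)


def unique_reducer(key, values):
    # UNIQUE: emit each key once and ignore the values.
    yield key, None


words = ["pig", "hive", "pig", "hbase", "pig", "hive"]
count = run_mapreduce(words, count_mapper, sum_reducer, combiner=sum_reducer)
freq = run_mapreduce(words, item_mapper, sum_reducer, combiner=sum_reducer)
unique = [key for key, _ in run_mapreduce(words, item_mapper, unique_reducer)]
```

The sum reducer can double as a combiner because addition gives the same total whether partial sums are taken per map task or all at once. UNIQUE works purely because of the grouping step: duplicates collapse into one key before the reducer is called.
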
--
Best Regards

Jeff Zhang