Hadoop, mail # user - Determine the key of Map function


Re: Determine the key of Map function
Jay Vyas 2012-04-24, 03:52
Ahh... Well, then the key will be the teacher, and the value will simply be

<-1 * # students, class_id>.

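Then, once the values are sorted (Hadoop only sorts keys for you, so this
takes a sort in the reducer or a secondary sort), the first 3 entries in the
reducer will always be the ones you wanted.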
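
A rough sketch of that idea in plain Python, with no Hadoop involved. The
function names (map_record, reduce_top_n, top_classes) and the in-memory
"shuffle" are made up for illustration, and the input is assumed to be
comma-separated lines like the ones in your example. It sorts the values
inside the reducer rather than relying on the order in which they arrive.

```python
from collections import defaultdict


def map_record(line):
    """Mapper: turn 'teacher, class, students' into (teacher, (-students, class))."""
    teacher_id, class_id, num_students = [field.strip() for field in line.split(",")]
    # Negating the count makes an ascending sort put the largest classes first.
    return teacher_id, (-int(num_students), class_id)


def reduce_top_n(values, n):
    """Reducer: sort one teacher's values and keep the n largest classes."""
    return [(class_id, -neg_students) for neg_students, class_id in sorted(values)[:n]]


def top_classes(lines, n=3):
    """Run map, group by key (a stand-in for the shuffle), then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        key, value = map_record(line)
        grouped[key].append(value)
    return {teacher: reduce_top_n(values, n) for teacher, values in grouped.items()}


# Sample records taken from the example in this thread, deliberately unordered.
records = [
    "Teacher1, Class14, 27",
    "Teacher2, Class22, 44",
    "Teacher1, Class11, 30",
    "Teacher2, Class24, 42",
    "Teacher1, Class13, 28",
    "Teacher2, Class21, 45",
    "Teacher1, Class12, 29",
    "Teacher2, Class23, 43",
]

if __name__ == "__main__":
    for teacher, classes in sorted(top_classes(records, n=3).items()):
        print(teacher, classes)
```

For your real job you would use n=30. Sorting in the reducer is fine as long
as one teacher's classes fit in memory.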

On Mon, Apr 23, 2012 at 10:17 PM, Lac Trung <[EMAIL PROTECTED]> wrote:

> Hi Jay !
> I think it's a bit different here. I want to get the 30 classIds with the
> most students for each teacherId.
> For example, to get 3 classIds:
> (File1)
> 1) Teacher1, Class11, 30
> 2) Teacher1, Class12, 29
> 3) Teacher1, Class13, 28
> 4) Teacher1, Class14, 27
> ... n ...
>
> n+1) Teacher2, Class21, 45
> n+2) Teacher2, Class22, 44
> n+3) Teacher2, Class23, 43
> n+4) Teacher2, Class24, 42
> ... n+m ...
>
> => return the 3 lines 1, 2, 3 for Teacher1 and lines n+1, n+2, n+3 for Teacher2
>
>
> On April 24, 2012 at 09:52, Jay Vyas <[EMAIL PROTECTED]> wrote:
>
> > It's somewhat tricky to understand exactly what you need from your
> > explanation, but I believe you want the teachers who have the most
> > students in a given class. So for English, I have 10 teachers teaching
> > the class, and I want the ones with the highest # of students.
> >
> > You can output key=<classid>, value=<-1*#ofstudents,teacherid>.
> >
> > The values will then be sorted by # of students. You can thus pick the
> > teacher in the first value of your reducer, and that will be the
> > teacher for class id = xyz with the highest number of students.
> >
> > You can also be smart in your mapper by running a combiner to remove the
> > teacherids who are clearly not maximal.
> >
> > On Mon, Apr 23, 2012 at 9:38 PM, Lac Trung <[EMAIL PROTECTED]> wrote:
> >
> > > Hello everyone !
> > >
> > > I have a problem with MapReduce :(
> > > I have 4 input files with 3 fields: teacherId, classId, numberOfStudent
> > > (numberOfStudent is in descending order for each teacher).
> > > The output should be the top 30 classIds by numberOfStudent for each
> > > teacher.
> > > My approach is to use MapReduce as in the WordCount example, but I
> > > don't know how to determine the key for the map function.
> > > I have run the WordCount example and understand its code, but I have
> > > no experience programming MapReduce.
> > >
> > > Can anyone help me resolve this problem?
> > > Thanks so much !
> > >
> > >
> > > --
> > > Lạc Trung
> > > 20083535
> > >
> >
> >
> >
> > --
> > Jay Vyas
> > MMSB/UCHC
> >
>
>
>
> --
> Lạc Trung
> 20083535
>

--
Jay Vyas
MMSB/UCHC