Hadoop >> mail # user >> Determine the key of Map function


Lac Trung 2012-04-24, 02:38
Jay Vyas 2012-04-24, 02:52
Lac Trung 2012-04-24, 03:17
Re: Determine the key of Map function
Ahh... Well then the key will be the teacher, and the value will simply be

<-1 * # students, class_id> .

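Then, once the values for each teacher are sorted (Hadoop only sorts by key
on its own, so sort them in the reducer or use a secondary sort), the first
3 entries will always be the ones you wanted.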
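
As a rough illustration of the idea above, here is a minimal sketch in plain Python that imitates the map, shuffle and reduce steps in memory. It is not Hadoop API code: the function names (map_record, reduce_top_n, top_classes) are made up for this example, and the comma-separated input layout is assumed from the sample lines quoted below. The values are sorted explicitly inside the reduce step, since the order of values within a key is not guaranteed unless a secondary sort is set up.

```python
from collections import defaultdict


def map_record(line):
    """Emit (teacher_id, (-num_students, class_id)) for one input line."""
    teacher_id, class_id, num_students = [f.strip() for f in line.split(",")]
    # Negating the count makes an ascending sort put the largest class first.
    return teacher_id, (-int(num_students), class_id)


def reduce_top_n(values, n):
    """Sort one teacher's values and keep the first n as (class_id, count)."""
    return [(class_id, -neg_count) for neg_count, class_id in sorted(values)[:n]]


def top_classes(lines, n=3):
    """Group mapped records by teacher (the shuffle), then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        key, value = map_record(line)
        groups[key].append(value)
    return {teacher: reduce_top_n(vals, n) for teacher, vals in groups.items()}


# Sample records taken from the example in the quoted message.
records = [
    "Teacher1, Class11, 30",
    "Teacher1, Class12, 29",
    "Teacher1, Class13, 28",
    "Teacher1, Class14, 27",
    "Teacher2, Class21, 45",
    "Teacher2, Class22, 44",
    "Teacher2, Class23, 43",
    "Teacher2, Class24, 42",
]

result = top_classes(records, n=3)
```

With the sample records, result maps Teacher1 to Class11, Class12, Class13 and Teacher2 to Class21, Class22, Class23. Classes with equal counts are ordered by class id, which is just a side effect of sorting the tuples.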

On Mon, Apr 23, 2012 at 10:17 PM, Lac Trung <[EMAIL PROTECTED]> wrote:

> Hi Jay!
> I think it's a bit different here. For each teacherId, I want to get the
> 30 classIds that have the most students.
> For example, to get 3 classIds:
> (File1)
> 1) Teacher1, Class11, 30
> 2) Teacher1, Class12, 29
> 3) Teacher1, Class13, 28
> 4) Teacher1, Class14, 27
> ... n ...
>
> n+1) Teacher2, Class21, 45
> n+2) Teacher2, Class22, 44
> n+3) Teacher2, Class23, 43
> n+4) Teacher2, Class24, 42
> ... n+m ...
>
> => return the 3 lines 1, 2, 3 for Teacher1 and lines n+1, n+2, n+3 for Teacher2
>
>
> On April 24, 2012 at 09:52, Jay Vyas <[EMAIL PROTECTED]> wrote:
>
> > It's somewhat tricky to understand exactly what you need from your
> > explanation, but I believe you want the teachers who have the most
> > students in a given class. So for English, I have 10 teachers teaching
> > the class, and I want the ones with the highest # of students.
> >
> > You can output key=<classid>, value=<-1 * # of students, teacherid>.
> >
> > The values will then be sorted by # of students. You can thus pick the
> > teacher in the first value of your reducer, and that will be the
> > teacher for class id = xyz with the highest number of students.
> >
> > You can also be smart in your mapper by running a combiner to remove the
> > teacherids who are clearly not maximal.
> >
> > On Mon, Apr 23, 2012 at 9:38 PM, Lac Trung <[EMAIL PROTECTED]> wrote:
> >
> > > Hello everyone!
> > >
> > > I have a problem with MapReduce :( like this:
> > > I have 4 input files with 3 fields: teacherId, classId, numberOfStudent
> > > (numberOfStudent is sorted in descending order for each teacher).
> > > The output should be the top 30 classIds by numberOfStudent for each
> > > teacher.
> > > My approach is to use MapReduce as in the Wordcount example, but I don't
> > > know how to determine the key for the map function.
> > > I have run the Wordcount example and understand its code, but I have no
> > > experience programming MapReduce.
> > >
> > > Can anyone help me resolve this problem?
> > > Thanks so much!
> > >
> > >
> > > --
> > > Lạc Trung
> > > 20083535
> > >
> >
> >
> >
> > --
> > Jay Vyas
> > MMSB/UCHC
> >
>
>
>
> --
> Lạc Trung
> 20083535
>

--
Jay Vyas
MMSB/UCHC
Lac Trung 2012-04-24, 03:54
Lac Trung 2012-04-24, 04:40
Lac Trung 2012-04-24, 04:41
Devaraj k 2012-04-24, 05:21
Lac Trung 2012-04-24, 12:37