Re: Is it possible to share a key across maps?
Jeff Zhang 2010-01-09, 04:15
Actually, you can think of the mapper task as an instance of the template
method design pattern. Here's the pseudocode:

    Mapper.configure(JobConf)
    for each record in InputSplit:
        Mapper.map(key, value, output, reporter)
    Mapper.close()

Any subclass of Mapper can override the three methods configure(), map(),
and close() to customize its behavior.

2010/1/8 Gang Luo <[EMAIL PROTECTED]>

> I don't do that in the map method, but in the configure(JobConf) method,
> which runs before any map method call in that map task.
> JobConf.get("map.input.file") tells you which file this map task is
> processing. Use this path to read the first line of the corresponding
> file. All of this is done in the configure method, that is, before any
> map method is called.
>
> -Gang
>
> ----- Original Message ----
> From: Raymond Jennings III <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: 2010/1/8 (Fri) 7:54:30 PM
> Subject: Re: Is it possible to share a key across maps?
>
> Hi, you do this in the map method (open the file and read the first
> line)? Could you explain a little more how you do it with configure()?
> Thank you.
>
> --- On Fri, 1/8/10, Gang Luo <[EMAIL PROTECTED]> wrote:
>
> > From: Gang Luo <[EMAIL PROTECTED]>
> > Subject: Re: Is it possible to share a key across maps?
> > To: [EMAIL PROTECTED]
> > Date: Friday, January 8, 2010, 4:46 PM
> >
> > I would do it like this: in each map task, I get the input file for
> > this mapper in configure(), and manually read the first line of that
> > file to get the user ID. Then the map function starts running.
> >
> > -Gang
> >
> > ----- Original Message ----
> > From: Raymond Jennings III <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: 2010/1/8 (Fri) 4:23:15 PM
> > Subject: Is it possible to share a key across maps?
> >
> > I have large files where the userid is the first line of each file.
> > I want to use that value as the output key of the map phase for each
> > subsequent line of the file. If each map task gets a chunk of this
> > file, only one map task will read the key value from the first line.
> > Is there any way I can force the other map tasks to wait until this
> > key is read and then somehow pass this value to the other map tasks?
> > Or is my reasoning incorrect? Thanks.

--
Best Regards

Jeff Zhang
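
To make the lifecycle above concrete, here is a minimal sketch in plain Java. It does not use the Hadoop classes: SketchMapper, UserIdMapper, TemplateSketch, runTask and demo are hypothetical names invented for illustration, and the configuration is a plain Map rather than a JobConf. Only the "map.input.file" key and the configure/map/close call order are taken from the thread. The map key is modeled on the byte offset that TextInputFormat passes, so offset 0 marks the header line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mapper; NOT the Hadoop Mapper interface.
abstract class SketchMapper {
    void configure(Map<String, String> conf) {}   // hook: once, before any map() call
    abstract void map(long offset, String line, List<String> out); // once per record
    void close() {}                               // hook: once, after the last record
}

// Reads the user ID from the first line of the input file in configure(),
// then uses it as the output key for every other line.
class UserIdMapper extends SketchMapper {
    private String userId;

    @Override
    void configure(Map<String, String> conf) {
        Path input = Paths.get(conf.get("map.input.file"));
        try (BufferedReader reader = Files.newBufferedReader(input)) {
            userId = reader.readLine();           // first line holds the user ID
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    void map(long offset, String line, List<String> out) {
        if (offset == 0) {
            return;                               // skip the header line itself
        }
        out.add(userId + "\t" + line);
    }
}

class TemplateSketch {
    // The fixed skeleton (the "template"): subclasses only fill in the hooks.
    static List<String> runTask(SketchMapper mapper, Map<String, String> conf,
                                List<String> split, long firstOffset) {
        List<String> out = new ArrayList<>();
        mapper.configure(conf);
        long offset = firstOffset;
        for (String line : split) {
            mapper.map(offset, line, out);
            offset += line.length() + 1;          // assumes '\n' line endings
        }
        mapper.close();
        return out;
    }

    // Simulates a task whose split does not contain the header line.
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("userid-sketch", ".txt");
            Files.write(file, Arrays.asList("user42", "click A", "click B", "click C"));
            Map<String, String> conf = new HashMap<>();
            conf.put("map.input.file", file.toString());
            // "click B" starts at byte 15 when lines end with '\n'.
            List<String> secondSplit = Arrays.asList("click B", "click C");
            List<String> out = runTask(new UserIdMapper(), conf, secondSplit, 15L);
            Files.delete(file);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The point of the sketch is that the task handling the second chunk never sees the header line in its own split, yet it still gets the user ID, because configure() opens the whole file by path before the first map() call. No waiting or communication between map tasks is needed. In a real job the file would be opened through the Hadoop FileSystem API rather than java.nio.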