Re: Is it possible to share a key across maps?
Actually, you can think of the mapper task as an instance of the template
method design pattern. Here is the pseudocode:

Mapper.configure(JobConf)
for each record in InputSplit:
      do Mapper.map(key, value, output, reporter)
Mapper.close()

Any subclass of Mapper can override these three methods, configure(),
map(), and close(), to customize the task.
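
To make the lifecycle concrete, here is a minimal sketch in plain Java. It does
not use the Hadoop classes: SketchMapper, UserIdMapper, and runTask are
hypothetical stand-ins, and a Map<String, String> plays the role of JobConf.
Only the "map.input.file" key and the configure/map/close order come from this
thread. The file is read from the local filesystem here; in a real job the
path would most likely point into HDFS and would have to be opened through
Hadoop's FileSystem API instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapperLifecycleSketch {

    // Hypothetical stand-in for a mapper exposing the three hooks from the thread.
    static abstract class SketchMapper {
        void configure(Map<String, String> conf) throws IOException { }
        abstract void map(long lineNo, String line, List<String> output);
        void close() { }
    }

    // Reads the user ID from the first line of the input file in configure(),
    // so every task has it before its first map() call.
    static class UserIdMapper extends SketchMapper {
        private String userId;

        @Override
        void configure(Map<String, String> conf) throws IOException {
            Path input = Paths.get(conf.get("map.input.file"));
            try (BufferedReader reader =
                     Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
                userId = reader.readLine();
            }
        }

        @Override
        void map(long lineNo, String line, List<String> output) {
            if (lineNo == 0) {
                return; // the header line holds the user ID itself, not a record
            }
            output.add(userId + "\t" + line);
        }
    }

    // The "template": a fixed configure -> map per record -> close sequence.
    // The half-open line range [firstLine, endLine) simulates one input split.
    static List<String> runTask(SketchMapper mapper, Map<String, String> conf,
                                int firstLine, int endLine) throws IOException {
        List<String> lines = Files.readAllLines(
                Paths.get(conf.get("map.input.file")), StandardCharsets.UTF_8);
        List<String> output = new ArrayList<>();
        mapper.configure(conf);
        for (int i = firstLine; i < endLine && i < lines.size(); i++) {
            mapper.map(i, lines.get(i), output);
        }
        mapper.close();
        return output;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("userfile", ".txt");
        Files.write(file, Arrays.asList("user42", "click A", "click B", "click C"),
                StandardCharsets.UTF_8);
        Map<String, String> conf = new HashMap<>();
        conf.put("map.input.file", file.toString());
        // Simulate a task whose split does not contain the header line.
        System.out.println(runTask(new UserIdMapper(), conf, 2, 4));
        Files.delete(file);
    }
}
```

Because each task opens the file itself in configure(), no task has to wait
for another one to read the key. The cost is one extra small read per map task.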

2010/1/8 Gang Luo <[EMAIL PROTECTED]>

> I don't do that in the map method, but in the configure(JobConf) method, which
> runs before any map call in that map task.
> JobConf.get("map.input.file") tells you which file this map task is
> processing. Use this path to read the first line of the corresponding file.
> All of this is done in the configure method, that is, before any map method
> is called.
>
>
> -Gang
>
>
>
> ----- Original Message ----
> From: Raymond Jennings III <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: 2010/1/8 (Fri) 7:54:30 PM
> Subject: Re: Is it possible to share a key across maps?
>
> Hi, do you do this in the map method (open the file and read the first line)?
> Could you explain a little more about how you do it with configure()? Thank you.
>
> --- On Fri, 1/8/10, Gang Luo <[EMAIL PROTECTED]> wrote:
>
> > From: Gang Luo <[EMAIL PROTECTED]>
> > Subject: Re: Is it possible to share a key across maps?
> > To: [EMAIL PROTECTED]
> > Date: Friday, January 8, 2010, 4:46 PM
> > I would do it like this: in each map task, I get the input file for
> > this mapper in configure() and manually read the first line of
> > that file to get the user ID. Then the map function starts running.
> >
> >
> > -Gang
> >
> >
> > ----- Original Message ----
> > From: Raymond Jennings III <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: 2010/1/8 (Fri) 4:23:15 PM
> > Subject: Is it possible to share a key across maps?
> >
> > I have large files where the user ID is the first line of
> > each file.  I want to use that value as the output of
> > the map phase for each subsequent line of the file.  If
> > each map task gets only a chunk of this file, only one map task
> > will read the key value from the first line.  Is there
> > any way I can force the other map tasks to wait until this
> > key is read and then somehow pass this value to the other map
> > tasks?  Or is my reasoning incorrect?  Thanks.
> >
> >
> >
>
>
>

--
Best Regards

Jeff Zhang