Pig, mail # user - Is there a way to limit the number of maps produced by HBaseStorage ?


Re: Is there a way to limit the number of maps produced by HBaseStorage ?
inelu nagamallikarjuna 2013-01-21, 14:16
Hi Vincent,

You can restrict the number of concurrently running map tasks per node by
setting the parameter *mapred.tasktracker.map.tasks.maximum* to 1 or 2.
Note that this is a per-TaskTracker setting, so it applies to all jobs on
the cluster, not just this one.
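
A minimal sketch of where that property lives, assuming a Hadoop 1.x
cluster (the property names differ under YARN/Hadoop 2): it goes in
mapred-site.xml on each TaskTracker node and takes effect after the
TaskTracker is restarted.

```xml
<!-- mapred-site.xml on each TaskTracker node (Hadoop 1.x; requires a
     TaskTracker restart). Caps concurrent map slots for ALL jobs on
     that node, not only the Pig/HBase job. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
```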

*Thanks
Nagamallikarjuna*

On Mon, Jan 21, 2013 at 7:13 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Hello Vincent,
>
>          The number of map tasks for a job is primarily governed by the
> InputSplits and the InputFormat you are using. So setting it through a
> config parameter doesn't guarantee that your job would have the specified
> number of map tasks. However, you can give it a try by using "set
> mapred.map.tasks n;" in your Pig Latin script.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Mon, Jan 21, 2013 at 6:57 PM, Vincent Barat <[EMAIL PROTECTED]
> >wrote:
>
> > Hi,
> >
> > We are using HBaseStorage intensively to load data from tables that
> > have more than 100 regions.
> >
> > HBaseStorage generates one map per region, and since our cluster has 50
> > map slots, our Pig scripts end up starting 50 maps that read data from
> > HBase concurrently.
> >
> > The problem is that our HBase cluster has only 10 nodes, so the maps
> > overload it (5 intensive readers per node is too much to bear).
> >
> > So the question is: is there a way to tell Pig to limit the number of
> > maps to a given maximum (e.g. 10)?
> > If not, how can I patch the code to do this?
> >
> > Thanks a lot for your help
> >
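
The per-job approach suggested in the thread can be sketched as a Pig
Latin fragment. This is a hypothetical example (the table and column
names are invented), and note the caveat above: with HBaseStorage the
split count is driven by the region count, so mapred.map.tasks acts only
as a hint.

```pig
-- Hypothetical sketch (Hadoop 1.x property names, invented table/column names).
-- mapred.map.tasks is only a hint: HBaseStorage still creates one split per
-- region, so this may not reduce the actual number of map tasks.
SET mapred.map.tasks 10;

events = LOAD 'hbase://event_table'
         USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:value')
         AS (value:chararray);
```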

--
Thanks and Regards
Nagamallikarjuna