Pig >> mail # user >> Is there a way to limit the number of maps produced by HBaseStorage ?


Re: Is there a way to limit the number of maps produced by HBaseStorage ?
Hello Vincent,

         The number of map tasks for a job is governed primarily by the
InputSplits generated by the InputFormat you are using, so setting it
through a configuration parameter does not guarantee that your job will
get the specified number of map tasks. However, you can give it a try by
adding "set mapred.map.tasks n;" to your Pig Latin script.
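As a minimal sketch, the hint would go at the top of the script, before the LOAD statement (the table name 'mytable' and the column family/qualifiers here are placeholders, not taken from your setup):

```pig
-- Hint at the desired number of map tasks. This is only a hint: with
-- HBaseStorage, the underlying TableInputFormat still creates one split
-- per HBase region, so the hint may not be honored.
set mapred.map.tasks 10;

-- Hypothetical load from an HBase table (adjust table and columns).
raw = LOAD 'hbase://mytable'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:col1 cf:col2')
      AS (col1:chararray, col2:chararray);
```

Note that in Pig Latin the set statement uses a space rather than "=" between the property name and its value.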

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Mon, Jan 21, 2013 at 6:57 PM, Vincent Barat <[EMAIL PROTECTED]> wrote:

> Hi,
>
> We are using HBaseStorage intensively to load data from tables having more
> than 100 regions.
>
> HBaseStorage generates one map task per region, and since our cluster has
> 50 map slots, our Pig scripts end up starting 50 maps that read data from
> HBase concurrently.
>
> The problem is that our HBase cluster has only 10 nodes, so the maps
> overload it (5 intensive readers per node is too much to bear).
>
> So the question is: is there a way to tell Pig to limit the number of maps
> to a given maximum (e.g. 10)?
> If not, how can I patch the code to do this?
>
> Thanks a lot for your help
>