Re: Is there a way to limit the number of maps produced by HBaseStorage?
Mohammad Tariq, 2013-01-21, 13:43
Hello Vincent,
The number of map tasks for a job is governed primarily by the InputFormat you are using and the InputSplits it produces. Setting the number through a config parameter therefore does not guarantee that your job will run with the specified number of map tasks. However, you can give it a try by using "set mapred.map.tasks=n" in your PigLatin job.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Mon, Jan 21, 2013 at 6:57 PM, Vincent Barat <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We are using HBaseStorage intensively to load data from tables having more
> than 100 regions.
>
> HBaseStorage generates 1 map per region, and since our cluster has 50 map
> slots, our PIG scripts start 50 maps reading data from HBase
> concurrently.
>
> The problem is that our HBase cluster has only 10 nodes, and thus the maps
> overload it (5 intensive readers per node is too much to bear).
>
> So, question: is there a way to tell PIG to limit the number of maps to a
> given maximum (e.g. 10)?
> If not, how can I patch the code to do this?
>
> Thanks a lot for your help
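
For illustration, the "set mapred.map.tasks" suggestion in the reply above might look like this in a Pig script. This is only a sketch: the table name 'my_table' and the column 'cf:col' are placeholders that do not come from the thread, and as far as I know Pig's set statement takes the key and the value separated by a space rather than an equals sign. As the reply points out, the value is only a hint. Since HBaseStorage produces one map per region, the job may well still start one map per region regardless of this setting.

```pig
-- Hint for the number of map tasks. This is a suggestion to the InputFormat,
-- not a hard limit, so it may be ignored.
SET mapred.map.tasks 10;

-- 'my_table' and 'cf:col' are hypothetical names used only for this sketch.
rows = LOAD 'hbase://my_table'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:col', '-loadKey true')
       AS (rowkey:chararray, col:chararray);

DUMP rows;
```

Checking the number of map tasks actually launched for the job will show whether the hint had any effect on your cluster.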