Frank Luo 2013-04-24, 14:02
Sanjay Subramanian 2013-04-24, 14:51
Re: how to limit mappers for a hive job
Edward Capriolo 2013-04-24, 17:25
Also make sure Hive is using CombineHiveInputFormat (not just
HiveInputFormat). CombineHiveInputFormat is the default in newer versions.
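
A minimal sketch of how the two suggestions in this thread might be combined in one session. The 256 MB value is only an illustrative assumption, not a recommendation, and which split-size property name is honored depends on the Hadoop version in use. Note that these settings reduce the total number of map tasks by making splits larger; they do not cap how many tasks of a job run concurrently.

```sql
-- Combine small splits/blocks into larger ones instead of one split per block.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Upper bound on split size in bytes; larger splits mean fewer mappers.
-- 268435456 = 256 * 1024 * 1024 (illustrative value, an assumption).
SET mapreduce.input.fileinputformat.split.maxsize=268435456;

-- Older (deprecated) name for the same property, for MRv1-era clusters.
SET mapred.max.split.size=268435456;
```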
On Wed, Apr 24, 2013 at 10:51 AM, Sanjay Subramanian <
[EMAIL PROTECTED]> wrote:
> I use the following
> To specify the Mapper Input Split Size (134217728 is in bytes)
> =============================================================
> SET mapreduce.input.fileinputformat.split.maxsize=134217728;
> From: Frank Luo <[EMAIL PROTECTED]>
> Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Date: Wednesday, April 24, 2013 7:02 AM
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: how to limit mappers for a hive job
> I am trying to query a huge file with 370 blocks, but it errors out
> with message of “number of mappers exceeds limit” and my cluster has a “mapred.tasktracker.map.tasks.maximum”
> set to 50.
> I have tried to set parameters such as hive.exec.mappers.max/mapred.tasktracker.tasks/mapred.tasktracker.map.tasks.maximum
> through beeswax and seems none of them is effective.
> I can change “mapred.tasktracker.map.tasks.maximum” and the query can go
> through, but I really want to limit concurrent number of tasks per job.
> So any suggestions please? I am running cloudera 4.5.