-
Proper utilization of Map and reduce
hadoop hive 2012-01-12, 06:43
Hi all,
I have a cluster of 13 nodes, on which I configured map=16 and reduce=10, along with mapred.reduce.tasks=120 and a replication factor of 2.
However, I am still not able to fully utilize the map and reduce capacity, as shown in the picture.
Thanks in advance. Please tell me how I can make jobs use the full map and reduce capacity.
Regards, Vikas Srivastava
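For reference, the capacity implied by these settings can be worked out with a small sketch. It assumes (the post does not say so explicitly) that "map=16" and "reduce=10" are per-node slot limits, i.e. `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`, and that all 13 nodes run a TaskTracker. The helper names are illustrative only.

```python
# Assumption: 16 map slots and 10 reduce slots per node, 13 worker nodes.

def cluster_slots(nodes: int, slots_per_node: int) -> int:
    """Total task slots available across the cluster."""
    return nodes * slots_per_node


def utilization(running_tasks: int, total_slots: int) -> float:
    """Fraction of slots occupied by running tasks, capped at 1.0."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    return min(running_tasks, total_slots) / total_slots


map_slots = cluster_slots(13, 16)
reduce_slots = cluster_slots(13, 10)
print(map_slots, reduce_slots)  # 208 130
```

Under that assumption, a job needs at least 208 concurrently runnable map tasks to fill every map slot, and mapred.reduce.tasks=120 would leave 10 of the 130 reduce slots idle even when all reducers run at once.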
-
Re: Proper utilization of Map and reduce
Bejoy Ks 2012-01-12, 07:59
Hi Vikas
From the JobTracker web UI, it looks like your job is map-only, i.e. there are no reduce tasks for your job. If it is a Hive job and your query requires no reduce tasks, Hive sets the number of reduce tasks to zero at the code level. Parameters set with -D at run time on the CLI can be overridden at the code level, and I believe that is what is happening here. For the map tasks, constraints such as data locality (rack locality, etc.) come into play. Maybe your data is not uniformly distributed across the cluster. This needs detailed investigation; I can't say more at a glance.
Regards Bejoy.K.S
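The reducer behaviour described above can be sketched as a simplified model. This is not Hive's actual code: the function name is hypothetical, and the parameters `bytes_per_reducer` and `max_reducers` stand in for `hive.exec.reducers.bytes.per.reducer` and `hive.exec.reducers.max`, with example values chosen only for illustration.

```python
import math


def planned_reducers(needs_reduce: bool, mapred_reduce_tasks: int,
                     input_bytes: int, bytes_per_reducer: int,
                     max_reducers: int) -> int:
    """Simplified model of how many reducers a Hive job ends up with."""
    if not needs_reduce:
        # Map-only query: the reducer count is forced to zero,
        # regardless of what mapred.reduce.tasks was set to.
        return 0
    if mapred_reduce_tasks > 0:
        # An explicit positive setting is used as-is.
        return mapred_reduce_tasks
    # Otherwise estimate from input size, bounded by the configured maximum.
    estimate = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, estimate))


# A map-only query ignores mapred.reduce.tasks=120 in this model.
print(planned_reducers(False, 120, 10**12, 10**9, 999))  # 0
# A query that needs a reduce phase uses the explicit setting.
print(planned_reducers(True, 120, 10**12, 10**9, 999))   # 120
```

In this model, setting mapred.reduce.tasks only matters for queries that actually have a reduce phase, which matches the map-only job seen in the web UI.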
-
Re: Proper utilization of Map and reduce
hadoop hive 2012-01-12, 08:24
Thanks for your reply, Bejoy.
Actually, there is no rack awareness configured (all nodes are on the default rack). Sometimes there are reducers as well, when the query requires them. I am running the jobs through the Hive CLI.
Could you suggest some things to investigate so that I can check what is actually going on?
Regards, Vikas Srivastava
-
Re: Proper utilization of Map and reduce
Aniket Mokashi 2012-01-12, 16:06
Hi,
Could this be because of the scheduler you are using? I see there are 3 queues. Check the scheduler configuration; that might give you some hints.
Thanks, Aniket
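To illustrate why queues could matter, here is a rough sketch assuming a capacity-style scheduler in which each queue is limited to a fixed percentage of the cluster's slots. The thread does not say which scheduler is in use or how the 3 queues are configured, so the queue names and percentages below are entirely hypothetical, as is the total of 208 map slots (13 nodes with 16 map slots each).

```python
def queue_slot_limit(total_slots: int, capacity_percent: int) -> int:
    """Slots a queue may use if capped at capacity_percent of the cluster."""
    if not 0 <= capacity_percent <= 100:
        raise ValueError("capacity_percent must be between 0 and 100")
    return total_slots * capacity_percent // 100


# Hypothetical configuration: three queues sharing 208 map slots.
TOTAL_MAP_SLOTS = 208
queues = {"default": 50, "etl": 30, "adhoc": 20}

for name, percent in queues.items():
    print(name, queue_slot_limit(TOTAL_MAP_SLOTS, percent))
```

If jobs are submitted to only one such queue and the scheduler enforces the cap, the remaining slots would stay idle no matter how many tasks the job has, which would look like under-utilization in the web UI.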