Raihan Jamal
2012-10-03, 18:05
Raihan Jamal
2012-10-03, 21:19
Raihan Jamal
2012-10-03, 21:50
Chalcy Raja
2012-10-03, 21:59
Raihan Jamal
2012-10-03, 22:01
Raihan Jamal
2012-10-03, 23:46
Raihan Jamal
2012-10-03, 23:54
Bejoy KS
2012-10-04, 03:23
-
org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Raihan Jamal 2012-10-03, 18:05
I am running a Hive query and I am getting the exception below:

    Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException: The number of tasks for this job 2072020 exceeds the configured limit 200000

I am not sure what this error means. Can anyone help me out here?

Raihan Jamal
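As a rough illustration of how to read this message, the sketch below pulls the two numbers out of the error text: the task count the job asked for and the configured ceiling. The helper name `parse_task_limit_error` and the regular expression are hypothetical and inferred from this one message only; they are not part of Hadoop or Hive.

```python
import re

# Error text copied from the post above.
MESSAGE = (
    "java.io.IOException: The number of tasks for this job 2072020 "
    "exceeds the configured limit 200000"
)

# Hypothetical pattern, inferred from the single message quoted in this thread.
PATTERN = re.compile(
    r"number of tasks for this job (\d+) exceeds the configured limit (\d+)"
)


def parse_task_limit_error(message):
    """Return (requested_tasks, configured_limit), or None if there is no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


requested, limit = parse_task_limit_error(MESSAGE)
print(f"requested={requested} limit={limit} ratio={requested / limit:.1f}x")
```

Read this way, the job would need roughly ten times as many tasks as the cluster allows for a single job.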
-
Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Raihan Jamal 2012-10-03, 21:19
Can anyone help me out here? What does the error below mean? This is the query I am using:

    SELECT cguid,
           event_item,
           event_timestamp,
           event_site_id
    FROM (
        SELECT event.app_payload ['n'] AS cguid,
               event.app_payload ['itm'] AS event_item,
               max(event.event_timestamp) AS event_timestamp,
               event.site_id AS event_site_id
        FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
        WHERE a.dt = '20120917'
          AND event.app_payload ['n'] IS NOT NULL
          AND instr(event.app_payload ['itm'], '%') = 0
          AND event.app_payload ['itm'] IS NOT NULL
          AND (
              event.page_type_id = '4340'
              OR event.page_type_id = '2047675'
          )
        GROUP BY event.app_payload ['n'],
                 event.site_id,
                 event.app_payload ['itm']
        ORDER BY cguid,
                 event_timestamp DESC
    ) m
    LEFT OUTER JOIN (
        SELECT event.app_payload ['n'] AS changed_cguid
        FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
        WHERE a.dt = '20120918'
          AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') >= unix_timestamp('2012/09/18 00:00:00', 'yyyy/MM/dd HH:mm:ss')
          AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') <= unix_timestamp('2012/09/18 02:00:00', 'yyyy/MM/dd HH:mm:ss')
    ) n ON m.cguid = n.changed_cguid
    WHERE n.changed_cguid IS NULL

Raihan Jamal
-
Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Raihan Jamal 2012-10-03, 21:50
OK, I think I found the issue.

This is the setting we have in mapred-site.xml, the site-specific configuration in Hadoop, and it is the reason the exception is thrown:

    <property>
      <!-- 10,000 is 100 tasks per node on a 100-node cluster -->
      <name>mapred.jobtracker.maxtasks.per.job</name>
      <value>200000</value>
      <final>true</final>
    </property>

How can I override this setting manually from the Hive prompt? Any suggestions?

Raihan Jamal
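For anyone who wants to check a setting like this without reading the file by eye, here is a minimal sketch that parses the property fragment quoted above using only the Python standard library. The helper name `read_property` is hypothetical, and the sketch handles a single `<property>` element rather than a whole mapred-site.xml.

```python
import xml.etree.ElementTree as ET

# The property fragment quoted in the post above.
PROPERTY_XML = """
<property>
  <!-- 10,000 is 100 tasks per node on a 100-node cluster -->
  <name>mapred.jobtracker.maxtasks.per.job</name>
  <value>200000</value>
  <final>true</final>
</property>
"""


def read_property(xml_text):
    """Return (name, value, is_final) for a single <property> element."""
    prop = ET.fromstring(xml_text.strip())
    name = prop.findtext("name")
    value = prop.findtext("value")
    # A missing <final> element is treated as "not final" in this sketch.
    is_final = (prop.findtext("final") or "false").strip().lower() == "true"
    return name, value, is_final


name, value, is_final = read_property(PROPERTY_XML)
print(name, value, is_final)
```

The third field is the one that matters for the override question: the property is marked final.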
-
RE: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Chalcy Raja 2012-10-03, 21:59
Hi Raihan,

You can set it at the Hive prompt like this:

    set mapred.jobtracker.maxtasks.per.job=7777777;

To check that it is set, type the following at the Hive prompt and you'll see this parameter in the output:

    set;

Hope this helps,
Chalcy
-
Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Raihan Jamal 2012-10-03, 22:01
What if I do it like below? Will this work?

    set mapred.jobtracker.maxtasks.per.job=-1

Raihan Jamal
-
Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Raihan Jamal 2012-10-03, 23:46
This is still not working. In the XML file the final property is set to true, which means I cannot override it.

The simple query below also throws the same exception:

    SELECT event.app_payload ['n'] AS changed_cguid
    FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
    WHERE a.dt = '20120918'
      AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') >= unix_timestamp('2012/09/18 00:00:00', 'yyyy/MM/dd HH:mm:ss')
      AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') <= unix_timestamp('2012/09/18 02:00:00', 'yyyy/MM/dd HH:mm:ss')

Exception I am getting:

    Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException: The number of tasks for this job 2070929 exceeds the configured limit 200000

Any other suggestions for what I should do to overcome this problem? Could a change to the query help?

Raihan Jamal
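To make the point about final concrete, here is a simplified model of what a final flag means when several configuration sources are merged in order. It is a sketch under stated assumptions, not Hadoop's actual code: the function `merge_resources` and the dictionary layout are hypothetical, and the sketch does not settle whether this mechanism is what blocks a `set` issued from the Hive prompt.

```python
def merge_resources(resources):
    """Merge configuration resources in load order.

    Each resource maps a property name to a (value, is_final) pair. Once a
    property has been loaded with is_final=True, later resources cannot
    change it. This is an assumed, simplified model for illustration.
    """
    values = {}
    final_names = set()
    for resource in resources:
        for name, (value, is_final) in resource.items():
            if name in final_names:
                # An earlier resource locked this property; skip the override.
                continue
            values[name] = value
            if is_final:
                final_names.add(name)
    return values


# Site configuration as quoted in the thread, followed by an attempted override.
site = {"mapred.jobtracker.maxtasks.per.job": ("200000", True)}
attempted_override = {"mapred.jobtracker.maxtasks.per.job": ("7777777", False)}

merged = merge_resources([site, attempted_override])
print(merged["mapred.jobtracker.maxtasks.per.job"])  # prints 200000
```

In this model the later value of 7777777 is silently ignored, which matches the behaviour described in the post.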
-
Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Raihan Jamal 2012-10-03, 23:54
Just to add here: SojTimestampToDate returns data in this format only:

    2012/02/29 17:01:43

Raihan Jamal
-
Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Bejoy KS 2012-10-04, 03:23
Hi Raihan,

The property 'mapred.jobtracker.maxtasks.per.job' is a JobTracker-level property, not a task-level one, so you cannot override it at the task level. You need to change it in mapred-site.xml, and you may also need to restart the JobTracker for the new value to take effect.

Regards,
Bejoy KS
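The idea that the limit is enforced on the JobTracker side can be sketched as follows. This is a toy model, not JobTracker source code: the function `check_task_limit` is hypothetical, and treating a non-positive value such as -1 as "no limit" is an assumption made for this sketch. The key point it illustrates is that the check reads only the JobTracker's own configuration, so a value set in a client session never reaches it.

```python
LIMIT_KEY = "mapred.jobtracker.maxtasks.per.job"


def check_task_limit(num_tasks, jobtracker_conf):
    """Raise IOError if num_tasks exceeds the limit in the JobTracker's config.

    Only jobtracker_conf is consulted; client-side settings are not an input.
    A non-positive limit is treated as "no limit" (assumption for this sketch).
    """
    limit = int(jobtracker_conf.get(LIMIT_KEY, "-1"))
    if limit > 0 and num_tasks > limit:
        raise IOError(
            f"The number of tasks for this job {num_tasks} "
            f"exceeds the configured limit {limit}"
        )


# Configuration the JobTracker was started with, as quoted in the thread.
jobtracker_conf = {LIMIT_KEY: "200000"}

try:
    check_task_limit(2070929, jobtracker_conf)
except IOError as err:
    print(err)
```

Under this model, raising the limit means changing the JobTracker's own configuration and restarting it, as the reply above says.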