Hive >> mail # user >> org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException


RE: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException
Hi Raihan,

You can set it at the Hive prompt like this:
set mapred.jobtracker.maxtasks.per.job=7777777;

To check that it is set, type set; at the Hive prompt and you'll see this parameter in the output.

Hope this helps,
Chalcy
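
The check described above can be narrowed to a single property, since the Hive CLI prints the current value when set is given a property name without a value. A minimal sketch of that session follows. Whether the override actually takes effect is an assumption: the property is marked final in the mapred-site.xml quoted below, and a final value is meant to resist later overrides, so the limit enforced on the cluster side may stay at 200000 regardless of what the session shows.

```sql
-- Session-level override, using the value suggested in the reply above.
set mapred.jobtracker.maxtasks.per.job=7777777;

-- Print only this property instead of scanning the full output of "set;".
-- Note: this shows the client-side session value, which may differ from
-- the value the cluster enforces if the property is declared final there.
set mapred.jobtracker.maxtasks.per.job;
```

If the job still fails with the same limit in the error message after this, that suggests the cluster-side setting is the one being applied.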

From: Raihan Jamal [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 03, 2012 5:51 PM
To: [EMAIL PROTECTED]
Subject: Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

OK, I think I found the issue.

These are the settings we have in mapred-site.xml for the site-specific Hadoop configuration, and that is why the exception is being thrown.

<property>
  <!-- 10,000 is 100 tasks per node on a 100-node cluster -->
  <name>mapred.jobtracker.maxtasks.per.job</name>
  <value>200000</value>
  <final>true</final>
</property>

How can I override this setting manually from the Hive prompt? Any suggestions?

Raihan Jamal
On Wed, Oct 3, 2012 at 2:19 PM, Raihan Jamal <[EMAIL PROTECTED]> wrote:
Can anyone help me out here? What does the error below mean? This is the query I am using:

SELECT cguid,
       event_item,
       event_timestamp,
       event_site_id
FROM (
    SELECT event.app_payload['n'] AS cguid,
           event.app_payload['itm'] AS event_item,
           max(event.event_timestamp) AS event_timestamp,
           event.site_id AS event_site_id
    FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
    WHERE a.dt = '20120917'
      AND event.app_payload['n'] IS NOT NULL
      AND instr(event.app_payload['itm'], '%') = 0
      AND event.app_payload['itm'] IS NOT NULL
      AND (
          event.page_type_id = '4340'
          OR event.page_type_id = '2047675'
      )
    GROUP BY event.app_payload['n'],
             event.site_id,
             event.app_payload['itm']
    ORDER BY cguid,
             event_timestamp DESC
) m
LEFT OUTER JOIN (
    SELECT event.app_payload['n'] AS changed_cguid
    FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
    WHERE a.dt = '20120918'
      AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') >= unix_timestamp('2012/09/18 00:00:00', 'yyyy/MM/dd HH:mm:ss')
      AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') <= unix_timestamp('2012/09/18 02:00:00', 'yyyy/MM/dd HH:mm:ss')
) n ON m.cguid = n.changed_cguid
WHERE n.changed_cguid IS NULL

Raihan Jamal
On Wed, Oct 3, 2012 at 11:05 AM, Raihan Jamal <[EMAIL PROTECTED]> wrote:
I am running a Hive query and getting the exception below:

Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException: The number of tasks for this job 2072020 exceeds the configured limit 200000

I am not sure what this error means. Can anyone help me out here?

Raihan Jamal
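
As the exception text states, the job would need 2072020 tasks while the cluster allows at most 200000 per job. Besides raising the limit, a commonly used way to lower the task count is to have Hive pack the input into fewer, larger splits. The following is a minimal sketch of that idea, not something proposed in this thread: it assumes the older mapred.* property names apply to this cluster, and the split sizes are illustrative values only.

```sql
-- Assumption: let Hive combine small files and blocks into larger input splits.
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Illustrative sizes in bytes (1 GB upper bound, 512 MB lower bounds);
-- larger splits mean fewer map tasks for the same input.
set mapred.max.split.size=1073741824;
set mapred.min.split.size.per.node=536870912;
set mapred.min.split.size.per.rack=536870912;
```

Whether this brings the job under 200000 tasks depends on the size and layout of the partitions being scanned, so the values would need tuning.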
