Pig, mail # user - Support for Hadoop 2.2
Re: Support for Hadoop 2.2
Claudio Romo Otto 2013-11-26, 20:32
Hi Juan,

     In a nutshell, you must pay attention to the memory settings in
mapred-site.xml, yarn-site.xml, hadoop-env.sh and yarn-env.sh, and
design a memory distribution strategy according to your performance
requirements. That way you will have, among other things, enough
memory for the Scheduler.

Remember to reserve at least 600-800 MB for the operating system to
avoid OOM errors.
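As a rough illustration of the kind of settings meant here (not the actual values from this thread), the main knobs live in yarn-site.xml and mapred-site.xml. The figures below assume a hypothetical worker node with 8 GB of RAM, leaving roughly 1 GB for the operating system and daemons; all property names are standard Hadoop 2.2 settings, but the values are only an example:

```xml
<!-- yarn-site.xml: assume an 8 GB node, reserving ~1 GB for the OS -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>7168</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>7168</value>
</property>

<!-- mapred-site.xml: per-task container sizes and JVM heaps -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx819m</value>
</property>
```

Note that the `-Xmx` heap must stay below the container size (a common rule of thumb is ~80% of it), otherwise the NodeManager kills the container for exceeding its memory limit.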

Best regards
On 26/11/13 16:07, Juan Martin Pampliega wrote:
> Hi Claudio,
>
> It would be nice to know which were the settings that you had to tune to
> get this. I am having a similar issue with some jobs that I am running.
> Thanks,
> Juan.
>
>
> On Wed, Oct 30, 2013 at 7:40 PM, Claudio Romo Otto <
> [EMAIL PROTECTED]> wrote:
>
>> Jarcec, I finally solved this problem by learning more about Hadoop 2
>> (a lot of reading) and then tuning some settings so the job could move
>> past the SCHEDULED state. That said, the last problem concerned only
>> Hadoop.
>>
>> Thanks for your support!
>>
>> On 30/10/13 18:03, Jarek Jarcec Cecho wrote:
>>
>>> Hi Claudio,
>>>
>>> it's hard to guess from the limited information. I would suggest
>>> taking a look at the logs to see what is happening.
>>>
>>> One guess though: you've mentioned that the task was "running" for 30
>>> minutes, but it still seems to be in the SCHEDULED state. Are your
>>> node managers running correctly?
>>>
>>> Jarcec
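A quick way to act on this suggestion (assuming the standard Hadoop 2 `yarn` CLI is on the PATH) is to list the nodes the ResourceManager knows about; tasks sit in SCHEDULED forever when no healthy NodeManager has registered:

```shell
# List NodeManagers registered with the ResourceManager. An empty list,
# or nodes missing from it, explains tasks stuck in the SCHEDULED state.
yarn node -list
```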
>>>
>>> On Fri, Oct 25, 2013 at 04:10:12PM -0300, Claudio Romo Otto wrote:
>>>
>>>> You got it!
>>>>
>>>> The solution was to compile with the -Dhadoopversion=23 option. After
>>>> your message I tried another test removing Cassandra from the chain,
>>>> and Pig successfully sent the job to Hadoop.
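For context, Pig's ant build takes the target Hadoop API line as a property; a typical invocation (assuming a Pig 0.12 source checkout, with ant and a JDK installed) would look like:

```shell
# Build Pig against the Hadoop 2.x (formerly 0.23) API line,
# run from the root of the Pig source tree.
ant clean jar -Dhadoopversion=23
```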
>>>>
>>>> BUT! the problem changed: now the Map task remains stuck forever on
>>>> Hadoop (30 minutes of waiting, no other jobs running):
>>>>
>>>> Task: task_1382631533263_0012_m_000000
>>>> <http://topgps-test-3.dnsalias.com:8088/proxy/application_1382631533263_0012/mapreduce/task/task_1382631533263_0012_m_000000>
>>>> State: SCHEDULED    Start Time: Fri, 25 Oct 2013 18:18:32 GMT
>>>> Finish Time: N/A    Elapsed Time: 0sec
>>>>
>>>> Attempt: attempt_1382631533263_0012_m_000000_0
>>>> Progress: 0,00    State: STARTING    Node: N/A    Logs: N/A
>>>> Started: N/A    Finished: N/A    Elapsed: 0sec    Note: N/A
>>>>
>>>>
>>>> I don't know if this is a Hadoop problem or a Pig one. What do you think?
>>>>
>>>>
>>>> On 25/10/13 13:11, Jarek Jarcec Cecho wrote:
>>>>
>>>>> It seems that Pig was correctly compiled against Hadoop 23, but the
>>>>> Cassandra piece was not; check where the exception is coming from:
>>>>>
>>>>>   Caused by: java.lang.IncompatibleClassChangeError: Found interface
>>>>>> org.apache.hadoop.mapreduce.JobContext, but class was expected
>>>>>>       at org.apache.cassandra.hadoop.AbstractColumnFamilyInputForma
>>>>>> t.getSplits(AbstractColumnFamilyInputFormat.java:113)
>>>>>>
>>>>> So, I would say that you also need a Hadoop 2-compatible Cassandra
>>>>> connector first.
>>>>>
>>>>> Jarcec
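The IncompatibleClassChangeError quoted above is a binary-compatibility failure: org.apache.hadoop.mapreduce.JobContext was a concrete class in Hadoop 1 but became an interface in Hadoop 2, so bytecode compiled against the old class no longer links. A small sketch of the distinction, using stock JDK types as stand-ins for the Hadoop ones (InterfaceCheck is a hypothetical helper name, not anything from Pig or Hadoop):

```java
// Sketch: detect whether a named type is an interface or a class on the
// current classpath. java.util.List / java.util.ArrayList stand in for
// org.apache.hadoop.mapreduce.JobContext, which is a class in Hadoop 1
// but an interface in Hadoop 2.
public class InterfaceCheck {
    public static boolean isInterface(String className)
            throws ClassNotFoundException {
        return Class.forName(className).isInterface();
    }

    public static void main(String[] args) throws Exception {
        // Bytecode compiled expecting a class fails at link time with
        // IncompatibleClassChangeError if the type turns out to be an
        // interface at runtime (and vice versa).
        System.out.println(isInterface("java.util.List"));      // true
        System.out.println(isInterface("java.util.ArrayList")); // false
    }
}
```

Recompiling the caller (here, the Cassandra connector) against the Hadoop 2 jars is the only fix; no classpath shuffling can reconcile the two shapes.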
>>>>>
>>>>> On Thu, Oct 24, 2013 at 10:34:49PM -0300, Claudio Romo Otto wrote:
>>>>>
>>>>>> After changing from hadoop20 to hadoop23 the warning disappeared,
>>>>>> but I got the same exception (Found interface
>>>>>> org.apache.hadoop.mapreduce.JobContext, but class was expected).
>>>>>>
>>>>>> I have tried on a fresh install: hadoop 2.2.0 and pig 0.12.1
>>>>>> compiled by me, no other products or configuration, just two
>>>>>> servers: one master with the ResourceManager and NameNode, one
>>>>>> slave with a DataNode and NodeManager.
>>>>>>
>>>>>> I can't understand why Pig 0.12 fails on this fresh cluster. Here
>>>>>> is the new trace:
>>>>>>
>>>>>> 2013-10-24 16:10:52,351 [JobControl] ERROR
>>>>>> org.apache.pig.backend.hadoop23.PigJobControl - Error while trying
>>