Pig >> mail # user >> Support for Hadoop 2.2


Re: Support for Hadoop 2.2
Hi Juan,

     In a nutshell, you must pay attention to the memory settings in
mapred-site.xml, yarn-site.xml, hadoop-env.sh and yarn-env.sh, and
design a memory distribution strategy that matches your performance
requirements. That way you will have, among other things, enough memory
for the Scheduler.

Remember to reserve at least 600-800 MB for the operating system to
avoid OOM errors.
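
As a rough illustration of such a memory distribution strategy, here is a minimal Python sketch. It is not taken from the cluster discussed in this thread. The function name plan_node_memory, the 1024 MB container size, and the 0.8 heap-to-container ratio are assumptions chosen for the example. Only the property names are standard Hadoop 2 settings. The sketch reserves memory for the operating system, splits the rest into equally sized containers, and refuses plans with fewer than two containers, since the MapReduce ApplicationMaster occupies one container and a task needs another.

```python
def plan_node_memory(total_mb, os_reserved_mb=800, container_mb=1024,
                     heap_fraction=0.8):
    """Sketch of a per-node memory plan for YARN and MapReduce.

    All defaults are illustrative assumptions, not recommended values.
    """
    if container_mb <= 0 or not 0 < heap_fraction <= 1:
        raise ValueError("container_mb must be > 0 and heap_fraction in (0, 1]")

    # Memory left for YARN once the operating system reserve is set aside.
    usable_mb = total_mb - os_reserved_mb
    containers = usable_mb // container_mb

    # One container goes to the ApplicationMaster, so a job needs at least two.
    if containers < 2:
        raise ValueError("not enough memory for an ApplicationMaster plus one task")

    # The JVM heap must be smaller than the container that hosts it.
    heap_mb = int(container_mb * heap_fraction)
    node_mb = containers * container_mb

    return {
        "yarn-site.xml": {
            "yarn.nodemanager.resource.memory-mb": node_mb,
            "yarn.scheduler.minimum-allocation-mb": container_mb,
            "yarn.scheduler.maximum-allocation-mb": node_mb,
        },
        "mapred-site.xml": {
            "yarn.app.mapreduce.am.resource.mb": container_mb,
            "mapreduce.map.memory.mb": container_mb,
            "mapreduce.reduce.memory.mb": container_mb,
            "mapreduce.map.java.opts": f"-Xmx{heap_mb}m",
            "mapreduce.reduce.java.opts": f"-Xmx{heap_mb}m",
        },
    }


if __name__ == "__main__":
    # Example: a node with 4 GB of RAM (hypothetical size).
    for filename, props in plan_node_memory(4096).items():
        print(filename)
        for name, value in props.items():
            print(f"  {name} = {value}")
```

The resulting values would still have to be copied into the corresponding XML files by hand and adjusted to the actual workload; for example, reducers are often given more memory than mappers.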

Best regards
On 26/11/13 16:07, Juan Martin Pampliega wrote:
> Hi Claudio,
>
> It would be nice to know which settings you had to tune to get this
> working. I am having a similar issue with some jobs that I am running.
> Thanks,
> Juan.
>
>
> On Wed, Oct 30, 2013 at 7:40 PM, Claudio Romo Otto <
> [EMAIL PROTECTED]> wrote:
>
>> Jarcec, I finally solved this problem by learning more about Hadoop 2
>> (a lot of reading) and then tuning some settings to let the task move out
>> of the SCHEDULED state. That said, this last problem concerned only
>> Hadoop.
>>
>> Thanks for your support!
>>
>> On 30/10/13 18:03, Jarek Jarcec Cecho wrote:
>>
>>> Hi Claudio,
>>> it's hard to guess from the limited information. I would suggest taking
>>> a look at the logs to see what is happening.
>>>
>>> One guess though: you've mentioned that the task was "running" for 30
>>> minutes, but it still seems to be in the SCHEDULED state. Are your
>>> NodeManagers running correctly?
>>>
>>> Jarcec
>>>
>>> On Fri, Oct 25, 2013 at 04:10:12PM -0300, Claudio Romo Otto wrote:
>>>
>>>> You got it!
>>>>
>>>> The solution was to compile with the -Dhadoopversion=23 option. After
>>>> your message I ran another test with Cassandra removed from the chain,
>>>> and Pig successfully submitted the job to Hadoop.
>>>>
>>>> BUT the problem changed: now the Map task remains stuck forever on
>>>> Hadoop (30 minutes waiting, no other jobs running):
>>>>
>>>> Task:         task_1382631533263_0012_m_000000
>>>>               <http://topgps-test-3.dnsalias.com:8088/proxy/application_1382631533263_0012/mapreduce/task/task_1382631533263_0012_m_000000>
>>>> Progress:
>>>> State:        SCHEDULED
>>>> Start Time:   Fri, 25 Oct 2013 18:18:32 GMT
>>>> Finish Time:  N/A
>>>> Elapsed Time: 0sec
>>>>
>>>> Attempt:      attempt_1382631533263_0012_m_000000_0
>>>> Progress:     0,00
>>>> State:        STARTING
>>>> Node:         N/A
>>>> Logs:         N/A
>>>> Started:      N/A
>>>> Finished:     N/A
>>>> Elapsed:      0sec
>>>> Note:
>>>>
>>>>
>>>> I don't know whether this is a Hadoop problem or a Pig problem. What do
>>>> you think?
>>>>
>>>>
>>>> On 25/10/13 13:11, Jarek Jarcec Cecho wrote:
>>>>
>>>>> It seems that Pig was correctly compiled against Hadoop 23, but the
>>>>> Cassandra piece was not. Check where the exception is coming from:
>>>>>
>>>>>> Caused by: java.lang.IncompatibleClassChangeError: Found interface
>>>>>> org.apache.hadoop.mapreduce.JobContext, but class was expected
>>>>>>       at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:113)
>>>>>>
>>>>> So I would say that you first need to get a Hadoop 2 compatible
>>>>> Cassandra connector.
>>>>>
>>>>> Jarcec
>>>>>
>>>>> On Thu, Oct 24, 2013 at 10:34:49PM -0300, Claudio Romo Otto wrote:
>>>>>
>>>>>> After changing from hadoop20 to hadoop23 the warning disappeared, but I
>>>>>> got the same exception (Found interface
>>>>>> org.apache.hadoop.mapreduce.JobContext, but class was expected)
>>>>>>
>>>>>> I have tried on a fresh install: Hadoop 2.2.0 and Pig 0.12.1
>>>>>> compiled by me, with no other products or configuration, just two
>>>>>> servers: one master with ResourceManager and NameNode, and one slave
>>>>>> with DataNode and NodeManager.
>>>>>>
>>>>>> I can't understand why Pig 0.12 fails on this fresh cluster. Here
>>>>>> is the new trace:
>>>>>>
>>>>>> 2013-10-24 16:10:52,351 [JobControl] ERROR
>>>>>> org.apache.pig.backend.hadoop23.PigJobControl - Error while trying
>>