Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.
Also, do you see any exceptions in the RM / NM logs?

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>
On Mon, Jul 1, 2013 at 11:19 AM, Omkar Joshi <[EMAIL PROTECTED]> wrote:

> Hi,
>
> As I don't know your complete AM code and how your containers are
> communicating with each other, here are a few things that might help you
> debug. Where are you starting your RM? (Is it really running on port
> 8030? Are you sure there is no previously started RM still running
> there?) Also, in yarn-site.xml, can you try changing the RM address to
> something like "localhost:<free-port-but-not-default>" and configuring a
> larger client thread pool for handling AM requests? Only your AM is
> expected to communicate with the RM over the AM-RM protocol. By any
> chance, are containers in your code communicating directly with the RM
> over the AM-RM protocol?
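One quick way to answer the "is a previously started RM still running there?" question is to test whether the scheduler port can still be bound. This is a minimal, self-contained sketch (the class name is mine, and 8030 is just the default scheduler port from yarn-site.xml; adjust for your setup):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Checks whether the RM scheduler port is already taken on this host.
// A failed bind suggests a previously started ResourceManager (or some
// other process) is still listening there.
public class SchedulerPortCheck {

    static boolean isFree(String host, int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(host, port));
            return true;          // bind succeeded: nothing is listening
        } catch (IOException e) {
            return false;         // bind failed: port is already in use
        }
    }

    public static void main(String[] args) {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8030;
        System.out.println("port " + port + ": "
                + (isFree("0.0.0.0", port) ? "free" : "in use"));
    }
}
```

If the port is reported as in use while you believe no RM is running, kill the stale process (or pick a different port in yarn-site.xml) before resubmitting.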
>
>   <property>
>
>     <description>The address of the scheduler interface.</description>
>
>     <name>yarn.resourcemanager.scheduler.address</name>
>
>     <value>${yarn.resourcemanager.hostname}:8030</value>
>
>   </property>
>
>
>   <property>
>
>     <description>Number of threads to handle scheduler interface.</description>
>
>     <name>yarn.resourcemanager.scheduler.client.thread-count</name>
>
>     <value>50</value>
>
>   </property>
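Conceptually, the separation of responsibilities described above looks like this (pseudocode only; the concrete client class names vary between YARN snapshots, so check the API of the revision you are pinned to):

```
// Application Master: the ONLY process that speaks the AM-RM protocol
amRmClient = connect(yarn.resourcemanager.scheduler.address)  // e.g. localhost:8030
amRmClient.registerApplicationMaster(...)
loop:
    response = amRmClient.allocate(progress)   // heartbeat + container requests
    launch / monitor containers from response
amRmClient.unregisterApplicationMaster(...)

// Containers: must NOT open a connection to the RM scheduler port.
// They talk to the AM (or to each other) over an application-level
// channel that the application itself defines.
```

If containers also talk to the scheduler interface, they compete for the handler threads sized by yarn.resourcemanager.scheduler.client.thread-count, which is one reason the advice above suggests checking for that pattern and raising the thread count.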
>
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* <http://www.hortonworks.com>
>
>
> On Fri, Jun 28, 2013 at 5:35 AM, blah blah <[EMAIL PROTECTED]> wrote:
>
>> Hi
>>
>> Sorry to reply so late. I don't have the data you requested (sorry, I
>> have no time; my deadline is within 3 days). However, I have observed
>> that this issue occurs not only for the "larger" datasets (6.8MB) but
>> for all datasets and all jobs in general. For smaller datasets (1MB),
>> the AM does not throw the exception; only the containers throw
>> exceptions (the same as in the previous e-mail). When these exceptions
>> are thrown, my code (AM and containers) does not perform any operations
>> on HDFS; it only performs in-memory computation and communication. I
>> have also observed that these exceptions occur at "random"; I couldn't
>> find any pattern. I can execute a job successfully, then resubmit the
>> job repeating the experiment, and these exceptions occur (no change was
>> made to the src code, input dataset, or execution/input parameters).
>>
>> As for the high network usage: as I said, I don't have the data. But
>> YARN is running on nodes that are exclusive to my experiments; no other
>> software runs on these nodes (only the OS and YARN). Besides, I don't
>> think that 20 containers working on a 1MB dataset (total) can be called
>> high network usage.
>>
>> regards
>> tmp
>>
>>
>>
>> 2013/6/26 Devaraj k <[EMAIL PROTECTED]>
>>
>>> Hi,
>>>
>>> Could you check the network usage in the cluster when this problem
>>> occurs? It is probably caused by high network usage.
>>>
>>> Thanks
>>> Devaraj k
>>>
>>> *From:* blah blah [mailto:[EMAIL PROTECTED]]
>>> *Sent:* 26 June 2013 05:39
>>> *To:* [EMAIL PROTECTED]
>>> *Subject:* Yarn HDFS and Yarn Exceptions when processing "larger"
>>> datasets.
>>>
>>> Hi All
>>>
>>> First let me apologize for the poor thread title, but I have no idea
>>> how to express the problem in one sentence.
>>>
>>> I have implemented a new Application Master on top of Yarn. I am using
>>> an old Yarn development version: revision 1437315, from 2013-01-23
>>> (SNAPSHOT 3.0.0). I cannot update to the current trunk version, as the
>>> prototype deadline is soon and I don't have time to incorporate the
>>> Yarn API changes.
>>>
>>> Currently I execute experiments in pseudo-distributed mode, using guava
>>> version 14.0-rc1. I have a problem with Yarn and HDFS exceptions on
>>> "larger" datasets. My AM works fine and I can execute it without a
>>> problem on a debug dataset (1MB). But when I increase the input size to
>>> 6.8 MB, I get the following exceptions:
>>>