MapReduce, mail # user - Re: Job end notification does not always work (Hadoop 2.x)


Re: Job end notification does not always work (Hadoop 2.x)
Alejandro Abdelnur 2013-06-25, 13:21
Devaraj,

If a job can finish but you cannot determine its status after it has ended,
then the system is not usable. Thus, the HS is a required component.

thx
On Tue, Jun 25, 2013 at 6:11 AM, Devaraj k <[EMAIL PROTECTED]> wrote:

> I agree, for getting status/counters we need the HS. I mean a job can finish
> without the HS also.
>
> Thanks
> Devaraj k
>
> *From:* Alejandro Abdelnur [mailto:[EMAIL PROTECTED]]
> *Sent:* 25 June 2013 18:05
> *To:* [EMAIL PROTECTED]
>
> *Subject:* Re: Job end notification does not always work (Hadoop 2.x)
>
> Devaraj,
>
> If you don't run the HS, once your jobs have finished you cannot retrieve
> status/counters from them, from the Java API or the Web UI. So I'd say that,
> for any practical usage, you need it.
>
> thx
>
> On Mon, Jun 24, 2013 at 8:42 PM, Devaraj k <[EMAIL PROTECTED]> wrote:
>
> It is not mandatory to have a running HS in the cluster. The user can still
> submit a job without an HS in the cluster, and may expect the Job/App
> End Notification.
>
> Thanks
> Devaraj k
>
> *From:* Alejandro Abdelnur [mailto:[EMAIL PROTECTED]]
> *Sent:* 24 June 2013 21:42
> *To:* [EMAIL PROTECTED]
> *Cc:* [EMAIL PROTECTED]
> *Subject:* Re: Job end notification does not always work (Hadoop 2.x)
>
> If we ought to do this in a YARN service, it should be the RM or the HS. The
> RM is, IMO, the natural fit. The HS would be a good choice if we are
> concerned about the extra work this would cause in the RM. The problem with
> the current HS is that it is MR-specific; we should generalize it for
> different AM types.
>
> thx
>
> Alejandro
>
> (phone typing)
>
>
> On Jun 23, 2013, at 23:28, Devaraj k <[EMAIL PROTECTED]> wrote:
>
> Even if we handle all the failure cases in the AM for Job End Notification,
> we may miss cases like an abrupt kill of the AM when it is on its last retry.
> If we choose the NM to give the notification, the RM again needs to identify
> which NM should give the end notification, as we don't have any direct
> protocol between the AM and the NM.
>
> I feel it would be better to move the End-Notification responsibility to the
> RM as a YARN service, because it ensures 100% notification and is also useful
> for other types of applications.
>
> Thanks
> Devaraj K
>
> *From:* Ravi Prakash [mailto:[EMAIL PROTECTED] <[EMAIL PROTECTED]>]
> *Sent:* 23 June 2013 19:01
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: Job end notification does not always work (Hadoop 2.x)
>
> Hi Alejandro,
>
> Thanks for your reply! I was thinking more along the lines Prashant
> suggested, i.e. a failure during init() should still trigger an attempt to
> notify (by the AM). But now that you mention it, maybe we would be better
> off including this as a YARN feature after all (especially with all the new
> AMs being written). We could let the NM of the AM handle the notification
> burden, so that the RM doesn't get unduly taxed. Thoughts?
>
> Thanks
> Ravi
>
> ------------------------------
>
> *From:* Alejandro Abdelnur <[EMAIL PROTECTED]>
> *To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> *Sent:* Saturday, June 22, 2013 7:37 PM
> *Subject:* Re: Job end notification does not always work (Hadoop 2.x)
>
> If the AM fails before doing the job end notification, at any stage of the
> execution and for whatever reason, the job end notification will never be
> delivered. There is no way to fix this unless the notification is done by a
> YARN service. The two candidate services for doing this would be the RM and
> the HS. The job notification URL is in the job conf. The RM never sees the
> job conf, which rules the RM out unless we add, at AM registration time,
> the possibility to specify a callback URL. The HS has access to the job
> conf, but the HS is currently a 'passive' service.

Alejandro
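
For context, here is a minimal sketch of the AM-side notification discussed in this thread. In Hadoop 2.x the URL comes from the job conf property mapreduce.job.end-notification.url, in which the $jobId and $jobStatus placeholders are replaced with the actual values. The class and method names below are hypothetical, and this is not the actual MapReduce implementation. It only illustrates why a notification sent from inside the AM process is lost if that process dies before reaching this code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch; names are illustrative, not Hadoop classes.
class JobEndNotifierSketch {

    // Substitute the $jobId and $jobStatus placeholders in the configured URL.
    static String expand(String template, String jobId, String jobStatus) {
        return template.replace("$jobId", jobId).replace("$jobStatus", jobStatus);
    }

    // One HTTP GET against the notification URL; true only on HTTP 200.
    static boolean notifyOnce(String url, int timeoutMs) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (IOException e) {
            return false;
        }
    }

    // Retry a bounded number of times. All of this runs inside the AM, so if
    // the AM is killed before or during this loop, nothing else will notify.
    static boolean notifyWithRetries(String url, int attempts, long intervalMs) {
        for (int i = 0; i < attempts; i++) {
            if (notifyOnce(url, 5000)) {
                return true;
            }
            if (i + 1 < attempts) {
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

Moving the same logic into the RM or the HS, as proposed above, would require that service to learn the URL, for example via a callback URL supplied at AM registration time.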