On Mon, Aug 27, 2012 at 6:43 PM, Juan P. <[EMAIL PROTECTED]> wrote:
> Hi guys!
> I need some clarification on the expected behavior of a Hadoop MapReduce job.
> Say I was to create a Mapper task which never ends. It reads the first line
> of input and then reads data from an external service eternally. If the
> service is empty, it will block until data is available.
This is possible to do. However, I urge you to instead drive your job
from its input, and launch jobs in response to an input-available
event. Apache Oozie (incubating) is one example of a framework that
helps you do this.
> Will the jobtracker continue to receive the Heartbeat?
If you manually (and periodically) send a status or a progress update
from the task (via the Reporter/Context APIs), it will be received by
the JobTracker as a healthy sign (i.e. a heartbeat from a task).
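The usual pattern is to call progress from the long-running loop itself, or from a small background thread if the loop can block indefinitely. Here is a minimal, self-contained sketch of the background-thread variant; the `progress()` method below is a stand-in for Hadoop's `Context.progress()` (or `Reporter.progress()` in the old API), which you would call instead inside a real mapper:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {
    // Counts how many times progress was reported (for illustration only).
    static final AtomicInteger pings = new AtomicInteger();

    // Stand-in for context.progress() / reporter.progress() in a real task.
    static void progress() {
        pings.incrementAndGet();
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService heartbeat =
                Executors.newSingleThreadScheduledExecutor();
        // Report progress periodically so the framework keeps seeing a
        // live task even while the main loop is blocked on the service.
        heartbeat.scheduleAtFixedRate(
                HeartbeatSketch::progress, 0, 50, TimeUnit.MILLISECONDS);

        // Simulate a blocking read from the external service.
        Thread.sleep(300);

        heartbeat.shutdownNow();
        System.out.println("progress calls: " + pings.get());
    }
}
```

In a real mapper you would start the scheduler in `setup()`, have it call `context.progress()` at an interval comfortably below the task timeout, and shut it down in `cleanup()`.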
> Will the jobtracker kill the task at some point?
If you do not do the above, then the task is killed after 10
minutes of inactivity, controlled via the config param
mapred.task.timeout (value in milliseconds; default 600000).
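If you do want to allow longer silent stretches, the timeout can be raised per job, e.g. (value shown here is just an example, in milliseconds):

```xml
<property>
  <name>mapred.task.timeout</name>
  <value>1200000</value> <!-- 20 minutes; default is 600000 (10 min) -->
</property>
```

Setting it to 0 disables the timeout entirely, which is risky because genuinely hung tasks will then never be killed.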
> I know that it's not the way Hadoop was intended to be used, I just need to
> clarify this specific scenario.
> Thank you!
+1 to what Michael's said.