Pig, mail # user - for UDF, figure out whether it's on a task tracker?


Re: for UDF, figure out whether it's on a task tracker?
Jonathan Coveney 2012-07-03, 17:01
UDFs are instantiated a couple of times at job construction time in order
to inspect various properties of them. This is suboptimal, but alas. I
generally initialize lazily in exec, as that is only called on the
mapper/reducer. The lifecycle of UDFs can be a bit confusing in this way.
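
A minimal sketch of that lazy-initialization pattern follows. It is plain Java
with no Pig dependency, so the class name LazyInitUdf, the String-based exec
signature, the initCount counter, and the file path are all assumptions made
for illustration. A real UDF would extend org.apache.pig.EvalFunc and override
exec(Tuple) instead.

```java
import java.io.File;

// Stand-in for a Pig UDF. A real one would extend org.apache.pig.EvalFunc<Boolean>
// and override exec(Tuple); plain Java is used here to keep the sketch self-contained.
class LazyInitUdf {
    // Hypothetical path to a file expected to exist only on the task nodes.
    private final String requiredPath;

    private boolean initialized = false;
    private boolean fileExists = false;

    // Counts how often init() ran; present only to illustrate the behavior.
    int initCount = 0;

    LazyInitUdf(String requiredPath) {
        // Keep the constructor cheap: it also runs during job construction,
        // where task-only resources may be absent.
        this.requiredPath = requiredPath;
    }

    // Task-side setup, deferred until the first call to exec.
    private void init() {
        fileExists = new File(requiredPath).exists();
        initCount++;
        initialized = true;
    }

    // Called once per input record on the mapper/reducer.
    Boolean exec(String input) {
        if (!initialized) {
            init();
        }
        return fileExists && input != null;
    }
}
```

Constructing the object does no file checks at all; the check runs on the
first exec call and is skipped on every call after that.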

2012/7/3 Yang <[EMAIL PROTECTED]>

> Normally the job tracker and the task trackers are on different nodes.
>
> When I submit a Pig script that uses a UDF, I think the UDF constructor is
> first run (several times, I don't know why)
> on the job tracker, and then it's run on each of the task trackers.
>
> Now I want to do some custom work inside the constructor, such as checking
> for the existence of certain files
> that are specific to task trackers. Such work only needs to be done
> on task trackers.
> So, is there a way to figure out whether the UDF is being run on a task
> tracker or on the job tracker?
>
> Thanks!
> yang
>