Re: for UDF, figure out whether it's on a task tracker?
UDFs are instantiated a couple of times at job construction time in order
to inspect various properties about them. This is suboptimal, but alas. I
generally initialize lazily in exec, as that is only called on the
mapper/reducer. The lifecycle of UDFs can be a bit confusing in this way.
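
To make the lazy-initialization idea concrete, here is a minimal sketch of the
pattern. It is a stand-in class rather than a real UDF, so that it compiles
without Pig on the classpath; I am assuming a real version would extend
org.apache.pig.EvalFunc and take a Tuple in exec. The names LazyInitUdf,
requiredPath, initOnce, and getInitCount are made up for illustration.

```java
import java.io.File;
import java.io.IOException;

// Stand-in for a Pig UDF, kept free of Pig dependencies so it compiles alone.
// Assumption: a real UDF would extend org.apache.pig.EvalFunc<String> and
// exec would receive an org.apache.pig.data.Tuple instead of a String.
class LazyInitUdf {
    private final String requiredPath; // hypothetical task-local file
    private boolean initialized = false;
    private int initCount = 0;

    // The constructor may run several times while the job is being built,
    // so it only stores configuration and touches nothing on disk.
    LazyInitUdf(String requiredPath) {
        this.requiredPath = requiredPath;
    }

    // Runs the file check at most once per instance, on the first exec call.
    private void initOnce() throws IOException {
        if (initialized) {
            return;
        }
        if (!new File(requiredPath).exists()) {
            throw new IOException("Required file not found: " + requiredPath);
        }
        initCount++;
        initialized = true;
    }

    // exec is only invoked inside map/reduce tasks, so the check is deferred to here.
    String exec(String input) throws IOException {
        initOnce();
        return input == null ? null : input.toUpperCase();
    }

    // Exposed only so the sketch can show that initialization ran once.
    int getInitCount() {
        return initCount;
    }
}
```

With this shape, constructing the object any number of times at job
construction is harmless, and the file check only happens on the first record
a task actually processes.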

2012/7/3 Yang <[EMAIL PROTECTED]>

> Normally the job tracker and the task trackers are on different nodes.
> When I submit a Pig script that uses a UDF, I think the UDF constructor is first
> run (several times, I don't know why)
> on the job tracker, and then it is run on each of the task trackers.
> Now I want to do some custom work inside the constructor, such as checking
> for the existence of certain files
> that exist only on the task trackers. Such work only needs to be done
> on the task trackers.
> So, is there a way to figure out whether the UDF is being run on a task
> tracker or on the job tracker?
> Thanks!
> yang