UDFs are instantiated a couple of times at job construction time so that
various properties about them can be inspected. This is suboptimal, but alas. I
generally initialize lazily in exec, since that is only called on the
mapper/reducer. The lifecycle of UDFs can be a bit confusing in this way.
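
Here is a rough sketch of the lazy-init pattern I mean. To keep it
self-contained it does not depend on Pig: LazyInitUdf, requiredPath and
isInitialized are made-up names, and exec takes a String here. In a real UDF
the class would extend org.apache.pig.EvalFunc, and exec would take a Tuple
and declare "throws IOException".

```java
import java.io.File;

// Stand-in for a Pig UDF (hypothetical class, not part of Pig).
// A real UDF would extend org.apache.pig.EvalFunc<String>.
class LazyInitUdf {
    // Hypothetical path expected to exist only on the task nodes.
    private final String requiredPath;
    private boolean initialized = false;

    LazyInitUdf(String requiredPath) {
        // Keep the constructor cheap and free of node-specific checks:
        // it also runs several times while the job is being constructed.
        this.requiredPath = requiredPath;
    }

    // Node-specific setup, run at most once per instance.
    private void init() {
        if (!new File(requiredPath).exists()) {
            throw new IllegalStateException("Required file not found: " + requiredPath);
        }
        initialized = true;
    }

    // Only called on the mapper/reducer, so the check is deferred to here.
    public String exec(String input) {
        if (!initialized) {
            init();
        }
        return input == null ? null : input.toUpperCase();
    }

    boolean isInitialized() {
        return initialized;
    }
}
```

With this shape, constructing the UDF never touches the file system, so the
extra instantiations at job construction time are harmless. The file check
only runs on the first call to exec.
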
2012/7/3 Yang <[EMAIL PROTECTED]>
> Normally the job tracker and task trackers are on different nodes.
> When I submit a Pig script that uses a UDF, I think the UDF constructor is
> first run (several times, I don't know why)
> on the job tracker, and then it's run on each of the task trackers.
> Now I want to do some custom work inside the constructor, such as checking
> the existence of certain files
> which are specific to the task trackers. Such work only needs to be done
> on the task trackers.
> So, is there a way to figure out whether the UDF is being run on a task
> tracker or the job tracker?