As I understand it, each mapper or reducer task runs in its own JVM. So
if your class is required by the mapper or reducer, one copy of it will
be loaded for every mapper or reducer task. It also means that only one
such copy exists per task, because you have made those functions static.
I believe that unless you have threading logic in your mapper or reducer,
you do not need to make the methods synchronized either.
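
To make the threading point concrete, here is a minimal sketch in plain Java (no Hadoop classes). The names Parser.parseLine and ParserDemo, and the tab-separated input, are made up for illustration and are not your actual code. The assumption is that your static functions keep no shared mutable state: they only read their arguments and use local variables. In that case they stay safe without synchronized, even if several threads inside one task call them at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the Parser class from the question.
final class Parser {
    private Parser() {}  // static-only utility class: no instances are created

    // Stateless: uses only its argument and local variables, so concurrent
    // callers cannot interfere with each other and no lock is needed.
    static String[] parseLine(String line) {
        return line.split("\t");
    }
}

class ParserDemo {
    // Calls Parser.parseLine from several threads at once (roughly what a
    // mapper with its own threading logic would do) and sums the field counts.
    static int countFieldsConcurrently(List<String> lines, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (String line : lines) {
                Callable<Integer> task = () -> Parser.parseLine(line).length;
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();  // waits for the task and rethrows its failure, if any
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = Arrays.asList("a\tb\tc", "d\te", "f");
        System.out.println(countFieldsConcurrently(lines, 4));  // prints 6
    }
}
```

If a static function did write to a static field (a shared cache or counter, say), that field would need synchronization or a thread-safe type, but only when more than one thread in the same JVM calls it.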
On Thu, Jul 25, 2013 at 11:46 AM, Huy Pham <[EMAIL PROTECTED]> wrote:
> Hi All,
> I am writing a class (called Parser) with a couple of static functions
> because I don't want millions of instances of this class to be created
> during the run.
> However, I realized that Hadoop will eventually produce parallel jobs,
> and if all jobs will call static functions of this Parser class, would that
> be safe?
> In other words, will all Hadoop jobs share the same class Parser or
> will each of them have their own Parser? In the former case, if all jobs
> share the same class, then if I make the methods synchronized, then the
> jobs would need to wait until the locks to the functions are released, thus
> that would affect the performance. However, in the latter case, that would not
> cause any problem.
> Can someone provide some insights?