Jay Vyas 2012-11-12, 23:58
Zizon Qiu 2012-11-13, 01:54
Bertrand Dechoux 2012-11-13, 08:55
Wow, that's an awesome trick! Okay, thanks.
On Nov 13, 2012, at 3:56 AM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:
> You should look at the job conf file.
> You will see that the mapper and reducer classes are indeed named there explicitly.
> So if you generate the class only on the client, the other machines won't be able to load it.
> You should also look at Cascading, which does a bit of what you are trying to do.
> Their trick is that the mapper and reducer are only thin wrapper classes that deserialize the real logic.
> They read the serialized logic (which could be any graph of serialized objects) from the job conf file.
> On Tue, Nov 13, 2012 at 2:54 AM, Zizon Qiu <[EMAIL PROTECTED]> wrote:
>> When submitting a job, ToolRunner or JobClient just distributes your jars to HDFS,
>> so that the TaskTrackers can launch (or "re-run") them.
>> In your case, you should regenerate your dynamic class in the mapper's/reducer's setup method,
>> or the runtime classloader will not find it.
>> On Tue, Nov 13, 2012 at 7:58 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:
>>> Hi guys:
>>> I'm trying to dynamically create a Java class at runtime and submit it as a Hadoop job.
>>> How does the Mapper (or, for that matter, the Reducer) use the data in the Job object? That is, how does it load a class? Is the Job object serialized, along with all the info necessary to load a class?
>>> The reason I'm wondering is that the class I'm creating will not be on the classpath of the JVMs in a distributed environment, although it will exist when the Job is created. So I'm wondering whether a dynamic class created inside the job executor will be serialized and sent over the wire in such a way that it can be instantiated in a different JVM.
>>> Jay Vyas
> Bertrand Dechoux
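
For anyone finding this thread later: the wrapper idea Bertrand describes can be sketched roughly as below. This is only an illustration in plain Java, not Cascading's actual code and not the Hadoop API. A HashMap stands in for the job conf, and the names RecordLogic, PrefixLogic, and the key "sketch.serialized.logic" are all made up for the example. The client serializes an object graph into a string property; the task side, which would be a fixed wrapper mapper in a real job, reads the string back and rebuilds the object. Note that Java serialization ships object state only, not bytecode, so the classes of the serialized objects still have to be in the job jar on every node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

final class SerializedLogicSketch {

    // Hypothetical property key; a real job would pick its own name.
    static final String LOGIC_KEY = "sketch.serialized.logic";

    // Hypothetical interface for the per-record logic the wrapper delegates to.
    interface RecordLogic extends Serializable {
        String apply(String input);
    }

    // Example logic. Only its state (the prefix) travels in the conf;
    // the class itself must already be loadable on the task side.
    static final class PrefixLogic implements RecordLogic {
        private static final long serialVersionUID = 1L;
        private final String prefix;

        PrefixLogic(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public String apply(String input) {
            return prefix + input;
        }
    }

    // Client side: serialize the object graph and Base64-encode it so it
    // can be stored as an ordinary string property.
    static String encode(Serializable logic) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(logic);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Task side: what a wrapper's setup step would do with the property value.
    static RecordLogic decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (RecordLogic) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            // This is the failure you get when the class is missing on the task side.
            throw new IllegalStateException("logic class not on the classpath", e);
        }
    }

    public static void main(String[] args) {
        // A plain map standing in for the job conf (assumption for this sketch).
        Map<String, String> conf = new HashMap<>();
        conf.put(LOGIC_KEY, encode(new PrefixLogic("seen:")));

        // Later, in another JVM, the wrapper would read the same key.
        RecordLogic logic = decode(conf.get(LOGIC_KEY));
        System.out.println(logic.apply("record-1")); // prints "seen:record-1"
    }
}
```

In a real job the encode step would run on the client before submission and the decode step in the wrapper mapper's or reducer's setup method. A class that is truly generated at runtime would still need its bytecode shipped or regenerated on the task side, as Zizon points out.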