No setting for output key/value class in MapOnly Job. Could anyone explain the reason/intention for this, if any?
I am fairly new to Pig and am currently skimming through the source
code, because in my project I run Pig on top of another execution engine
(somewhat similar to Hadoop).
While looking at the JobControlCompiler class, I found that for a
MapOnly job the properties for the output key/value classes are not set.
The classes are set only for jobs that have both a Mapper and a Reducer.
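To make the observation concrete, here is a minimal sketch of what a
property lookup looks like when the output classes were never set. It is
a plain-Java model built on java.util.Properties, not Hadoop's JobConf and
not Pig's actual code. The property names are the classic Hadoop ones as
far as I know, and the fallback class names mirror what I understand
getOutputKeyClass/getOutputValueClass to return when nothing is set; both
are assumptions to verify against your Hadoop version. The class
OutputClassLookup, its methods, and the example.MyKey/example.MyValue
names are hypothetical.

```java
import java.util.Properties;

// Plain-Java model of a job configuration; this is NOT Hadoop's JobConf.
final class OutputClassLookup {
    // Classic Hadoop property names (assumed; newer versions use other names).
    static final String OUTPUT_KEY = "mapred.output.key.class";
    static final String OUTPUT_VALUE = "mapred.output.value.class";

    // Assumed fallbacks used when a property was never set.
    static final String DEFAULT_KEY = "org.apache.hadoop.io.LongWritable";
    static final String DEFAULT_VALUE = "org.apache.hadoop.io.Text";

    // Models a Map&Reduce job: both output classes are set explicitly.
    static Properties mapReduceJob() {
        Properties conf = new Properties();
        conf.setProperty(OUTPUT_KEY, "example.MyKey");     // hypothetical class name
        conf.setProperty(OUTPUT_VALUE, "example.MyValue"); // hypothetical class name
        return conf;
    }

    // Models a MapOnly job as described above: nothing is set.
    static Properties mapOnlyJob() {
        return new Properties();
    }

    // Returns the configured key class name, or the fallback if unset.
    static String outputKeyClass(Properties conf) {
        return conf.getProperty(OUTPUT_KEY, DEFAULT_KEY);
    }

    // Returns the configured value class name, or the fallback if unset.
    static String outputValueClass(Properties conf) {
        return conf.getProperty(OUTPUT_VALUE, DEFAULT_VALUE);
    }

    // True only if the property was set explicitly, so a caller can tell
    // a real setting apart from a silent fallback.
    static boolean isExplicitlySet(Properties conf, String name) {
        return conf.getProperty(name) != null;
    }

    public static void main(String[] args) {
        Properties mapOnly = mapOnlyJob();
        System.out.println("key class:   " + outputKeyClass(mapOnly));
        System.out.println("value class: " + outputValueClass(mapOnly));
        System.out.println("key set explicitly? " + isExplicitlySet(mapOnly, OUTPUT_KEY));
    }
}
```

In this model the MapOnly lookup does not fail; it simply reports the
fallback, which is why an engine that relies on the reported classes
would want a check like isExplicitlySet before trusting the result.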
I know that this produces no errors, so it is probably not a bug,
but I am curious *whether there is any reason or intention for this*,
because I have seen several programs that set those classes even though
the job has only a Mapper, that is, even though the number of reduce tasks is set to
Another reason I am curious about this is that the engine I am
currently using needs that information (the output key/value classes) for
both MapOnly jobs and normal Map&Reduce jobs.
If possible, could anyone tell me how I can get that information
without those properties being set, that is, without using the
getOutputKeyClass/getOutputValueClass functions? (I need the information
after a job is defined and before the tasks for the job are launched, i.e.
with no information like a task id.)
Thank you in advance!