Re: Reading files in outputSchema
Alan Gates 2012-04-16, 17:49
outputSchema is called on the machine where you start your Pig job (referred to as the front end). Where you store the schema file is independent of this, however. You can still store it in HDFS and read it from there on your client. By definition, your client must be able to read and write HDFS files to use Pig anyway. It is generally better to store it in HDFS, so that you don't have to keep copies on multiple clients and risk having some of them get out of date.
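As a rough illustration of the parsing half of this, here is a minimal sketch that turns an XML schema description into an ordered list of field names and types. The XML layout (`<schema>` containing `<field name="..." type="..."/>` elements), the class name `SchemaXmlReader`, and the method `parseFields` are all made up for the example, since the thread does not show the real file. It uses only the JDK so it runs without Hadoop or Pig on the classpath; in an actual UDF the `InputStream` would presumably come from Hadoop's `FileSystem.open(Path)`, and the resulting map would be converted into a Pig `Schema` inside `outputSchema`.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of Pig or Hadoop.
class SchemaXmlReader {

    // Parses an assumed schema layout of the form
    //   <schema><field name="id" type="long"/> ... </schema>
    // into a map of field name -> type name, preserving file order.
    static Map<String, String> parseFields(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
            NodeList nodes = doc.getElementsByTagName("field");
            Map<String, String> fields = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element field = (Element) nodes.item(i);
                fields.put(field.getAttribute("name"), field.getAttribute("type"));
            }
            return fields;
        } catch (Exception e) {
            // Wrap checked parser/IO exceptions so callers get one failure type.
            throw new IllegalStateException("Could not read schema XML", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a stream opened from HDFS on the front end.
        String xml = "<schema>"
                + "<field name=\"id\" type=\"long\"/>"
                + "<field name=\"name\" type=\"chararray\"/>"
                + "</schema>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(parseFields(in));
    }
}
```

Keeping the parser dependent only on an `InputStream` means the same code works whether the file ends up on the client's local disk or in HDFS; only the line that opens the stream changes.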
On Apr 13, 2012, at 12:43 AM, Rajgopal Vaithiyanathan wrote:
> Where will the outputSchema be executed? in the client or as a mapreduce ?
> Because, the output schema of my EvalFunc is stored in an XML file. I need
> to read this file and generate the output schema.
> Where should I place this XML file ? Client or HDFS ?