Re: reading input parameters in a pig script
Jonathan Coveney 2013-02-20, 17:17
What is going to be generating the "pig -param param1=..." and so on?
Couldn't these be made into arguments? i.e.

REGISTER /opt/apache_pig/pig-0.10.1/contrib/piggybank/java/piggybank.jar;
REGISTER /tmp/custudf.jar;

DEFINE XMLProcessor org.sdc.map.processor.XMLProcessor('$fields');
PRODUCTS = load 'product.xml' using org.apache.pig.piggybank.storage.XMLLoader('product') as (line:chararray);
PRODUCT = FOREACH PRODUCTS GENERATE FLATTEN(XMLProcessor(line)) as (id:chararray, name:chararray, description:chararray);

and you call it with

pig -param fields=name,description

and there has to be an output format, so in that case a %default would work?

2013/2/20 Siddhi Borkar <[EMAIL PROTECTED]>

> I will not be able to use %default statement in my pig script, as the
> parameters being passed to my pig script are not fixed. I would need a
> conditional check to be done in my pig script to check for each and every
> input parameter if it is passed or not.
> Also, there are no conditional operators (if/else) available in pig.
>
> Following is the pseudocode of the functionality I want to achieve
>
> Consider pig files:
> 1) xmlparser.pig
> 2) excelexporter.pig
> 3) htmlexporter.pig
>
> 1) xmlparser.pig
> REGISTER /opt/apache_pig/pig-0.10.1/contrib/piggybank/java/piggybank.jar;
> REGISTER /tmp/custudf.jar;
>
> DEFINE XMLProcessor org.sdc.map.processor.XMLProcessor();
> PRODUCTS = load 'product.xml' using
> org.apache.pig.piggybank.storage.XMLLoader('product') as (line:chararray);
> PRODUCT = FOREACH PRODUCTS GENERATE FLATTEN(XMLProcessor(line)) as
> (id:chararray, name:chararray, description:chararray);
>
> Please note, XMLProcessor is a custom java based udf which parses the xml.
>
> 2) excelexporter.pig
> STORE PRODUCT INTO '/tmp/prod.csv' USING
> CSVExcelStorage(',','NO_MULTILINE','UNIX');
>
> 3) htmlexporter.pig
> //logic for this is not yet implemented
>
> Now the requirement is that I need to write a wrapper pig script which
> invokes the following script and generates an output. The parameters that
> will be passed are the input params and the out file format
>
> For ex pig -param param1=name param2=description outfileformat=csv
> wrapper.pig
>
> Now what I need to do is based on the params passed to the wrapper pig
> script, I need to send inputs to the xml parser and parse the input params.
> In the above case since name and description are passed as params the xml
> should be parsed only for these 2 fields.
> Any idea how this can be achieved in a pig script?
>
> Also depending on the output file format, I need to invoke the
> corresponding exporter script (html or csv) from my wrapper script. I don’t
> see any conditional operators available (if/else) in pig. Any idea how this
> can be achieved?
>
> -----Original Message-----
> From: Jonathan Coveney [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, February 20, 2013 2:38 PM
> To: [EMAIL PROTECTED]
> Subject: Re: reading input parameters in a pig script
>
> Reiterating Prashant's comments.
>
> In the script though you can have a %default statement which will define
> the default value for a parameter, which can also be overridden. My guess is
> this might let you do what you want?
>
>
> 2013/2/20 Prashant Kommireddi <[EMAIL PROTECTED]>
>
> > Hi Siddhi,
> >
> > "Is there any way to access these params in the script without
> > referring to the param name?" -- how would you associate a param value
> > to pig statement?
> >
> > I am guessing in this case your pig script is also dynamically generated?
> > You could use PigServer API
> > http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/PigServer.html
> > to generate params in a Java program and embed them into a script.
> >
> > -Prashant
> >
> >
> > On Tue, Feb 19, 2013 at 3:44 PM, Siddhi Borkar <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > Consider the following command
> > > pig -param param1=test param2=test1 param3=test2 myscript.pig
> > >
> > > In my case the parameters are dynamic, as in I could either pass