ToddG
2010-07-20, 21:26
Jeff Zhang
2010-07-21, 01:41
Corbin Hoenes
2010-07-21, 05:56
Corbin Hoenes
2010-07-21, 06:02
Corbin Hoenes
2010-07-21, 06:07
Dmitriy Ryaboy
2010-07-21, 08:11
Corbin Hoenes
2010-07-21, 13:06
Dave Viner
2010-07-21, 16:22
Dmitriy Ryaboy
2010-07-21, 17:01
-
PIG + Junit
ToddG 2010-07-20, 21:26
I'd like to include running various PIG scripts in my continuous build system. Of course, I'll only use small datasets for this, and in the beginning I'll only target a local machine instance. However, this brings up several questions:

Q: What's the best way to run PIG from Java? Here's what I'm doing, following a pattern I found in some of the Pig tests:

1. Create Pig resources in a base class (shamelessly copied from PigExecTestCase):

    protected MiniCluster cluster;
    protected PigServer pigServer;

    @Before
    public void setUp() throws Exception {
        // The exec type can be overridden with -Dtest.exectype=...
        String execTypeString = System.getProperty("test.exectype");
        if (execTypeString != null && execTypeString.length() > 0) {
            execType = PigServer.parseExecType(execTypeString);
        }
        // Start a MiniCluster only for MAPREDUCE; otherwise run locally.
        if (execType == MAPREDUCE) {
            cluster = MiniCluster.buildCluster();
            pigServer = new PigServer(MAPREDUCE, cluster.getProperties());
        } else {
            pigServer = new PigServer(LOCAL);
        }
    }

2. Test classes subclass this to get access to the MiniCluster and PigServer (copied from TestPigSplit):

    @Test
    public void notestLongEvalSpec() throws Exception {
        inputFileName = "notestLongEvalSpec-input.txt";
        createInput(new String[] {"0\ta"});

        pigServer.registerQuery("a = load '" + inputFileName + "';");
        for (int i = 0; i < 500; i++) {
            pigServer.registerQuery("a = filter a by $0 == '1';");
        }
        // The filter should remove every row, so any tuple is a failure.
        Iterator<Tuple> iter = pigServer.openIterator("a");
        while (iter.hasNext()) {
            throw new Exception();
        }
    }

3. ERROR

This pattern works for simple PIG directives, but I want to load entire Pig scripts that contain REGISTER and DEFINE directives. When I do, pigServer.registerQuery() fails with:

    org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unrecognized alias REGISTER
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1170)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:441)
        at com.audiencescience.apollo.reporting.NetworkRevenueReportTest.shouldParseNetworkRevenueReportScript(NetworkRevenueReportTest.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

Any suggestions?

-Todd
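The createInput helper called in step 2 is not shown in the post. As a rough idea of what such a helper might do, here is a minimal sketch that uses only the JDK. The class name TestInput, the extra prefix argument, and the choice to write a temp file are all assumptions for illustration; this is not the code from the Pig test suite.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Pig.
final class TestInput {
    private TestInput() {}

    /**
     * Writes each row as one line of a new temp file and returns its path.
     * I/O failures are rethrown unchecked so the call site stays short.
     */
    static Path createInput(String prefix, String[] rows) {
        try {
            Path file = Files.createTempFile(prefix, ".txt");
            file.toFile().deleteOnExit(); // clean up when the test JVM exits
            Files.write(file, Arrays.asList(rows), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test could then build its load statement from the returned path, for example "a = load '" + path + "';", presumably with the PigServer in local mode so the file is read from the local file system.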
-
Re: PIG + Junit
Jeff Zhang 2010-07-21, 01:41
Hi Todd,
The registerQuery method cannot handle REGISTER and DEFINE statements. Use the registerJar and registerFunction methods instead. Another way is to put your script in a file and then use registerScript to execute it.

--
Best Regards
Jeff Zhang
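To illustrate Jeff's first suggestion, the sketch below sorts the statements of a script by the PigServer method that would have to receive them. It is plain JDK code and does not call Pig. The class name StatementRouter is hypothetical, and the naive split on ';' assumes that no semicolon appears inside a quoted string or a comment, which real scripts may violate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Pig: groups script statements by keyword.
final class StatementRouter {
    final List<String> registers = new ArrayList<>(); // arguments of REGISTER
    final List<String> defines = new ArrayList<>();   // arguments of DEFINE
    final List<String> queries = new ArrayList<>();   // everything else

    /** Assumes ';' never occurs inside a quoted string or a comment. */
    static StatementRouter route(String script) {
        StatementRouter r = new StatementRouter();
        for (String raw : script.split(";")) {
            String stmt = raw.trim();
            if (stmt.isEmpty()) {
                continue;
            }
            // Split off the first word so the match is case-insensitive
            // and tolerant of tabs or newlines after the keyword.
            String[] parts = stmt.split("\\s+", 2);
            String keyword = parts[0].toUpperCase(Locale.ROOT);
            String rest = parts.length > 1 ? parts[1].trim() : "";
            if (keyword.equals("REGISTER")) {
                r.registers.add(rest);
            } else if (keyword.equals("DEFINE")) {
                r.defines.add(rest);
            } else {
                r.queries.add(stmt + ";");
            }
        }
        return r;
    }
}
```

With the statements grouped this way, the REGISTER arguments could be passed to registerJar, the DEFINE arguments would still need to be turned into whatever registerFunction expects, and only the remaining statements would go to registerQuery. Jeff's second suggestion, handing the whole file to registerScript, avoids this routing entirely.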
-
Re: PIG + Junit
Corbin Hoenes 2010-07-21, 05:56
Hey Todd, we run against entire Pig scripts with some helper classes we built. Basically, they preprocess the variables and then call register script. The test looks like this:

    @Before
    public void setUp() throws Exception {
        Helper.delete(OUT_FILE);
        runner = new PigRunner();
    }

    @Test
    public void testRecordCount() throws Exception {
        runner.execute("myscript.pig", "param1=foo", "param2=bar");

        Iterator<Tuple> tuples = runner.getPigServer().openIterator("foo");
        assertEquals(41L, Helper.countTuples(tuples));
    }

It's been very useful for us to test this way. Would love to see more chatter about other techniques.
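The PigRunner and Helper classes used in Corbin's test are not shown in this message. The sketch below only illustrates the two ideas he describes: substituting "name=value" parameters into the script text before it is registered, and counting the tuples an iterator returns. The class name ScriptParams and all three methods are hypothetical, and the substitution rule ($name with a letter or underscore first, so positional references such as $0 are left alone) is an assumption rather than Pig's own parameter-substitution logic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers; not the PigRunner or Helper classes from the thread.
final class ScriptParams {
    // $name, where name starts with a letter or underscore (so $0 is skipped).
    private static final Pattern PARAM =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    private ScriptParams() {}

    /** Turns "name=value" arguments into a map. */
    static Map<String, String> parse(String... args) {
        Map<String, String> params = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected name=value, got: " + arg);
            }
            params.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return params;
    }

    /** Replaces each $name that has a value; unknown names are left untouched. */
    static String substitute(String script, Map<String, String> params) {
        Matcher m = PARAM.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Counts the remaining elements of an iterator, consuming it. */
    static long count(Iterator<?> it) {
        long n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }
}
```

A runner along these lines would read the script file, call substitute with the parsed parameters, and register the result. The count method plays the role that Helper.countTuples plays in the assertEquals call above.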
-
Re: PIG + Junit
Corbin Hoenes 2010-07-21, 06:02
Trying to attach the PigRunner class in case that helps give you a start using register script.
-
Re: PIG + Junit
Corbin Hoenes 2010-07-21, 06:07
okay no attachments...try this gist:
http://gist.github.com/484135
-
Re: PIG + Junit
Dmitriy Ryaboy 2010-07-21, 08:11
Corbin,
Have you looked at PigUnit? https://issues.apache.org/jira/browse/PIG-1404
-
Re: PIG + Junit
Corbin Hoenes 2010-07-21, 13:06
Dmitriy,
Nope, that is new to me. Thanks for pointing it out. I've been using this homegrown class since Pig 0.5, and I really like the idea of unit testing moving into Pig as a first-class citizen.
-
Re: PIG + Junit
Dave Viner 2010-07-21, 16:22
PigUnit looks awesome. Can this make it into either the latest piggybank release or the next core release?
-
Re: PIG + Junit
Dmitriy Ryaboy 2010-07-21, 17:01
Everyone likes it, no one has time to work on it. You guys can feel free to jump in and make it a real thing :) There's definitely still time for this to make it into Pig 0.8.

-D