Re: PIG + Junit
Dmitriy,

Nope, that is new to me. Thanks for pointing it out. I've been using this home-grown class since Pig 0.5, and I really like the idea of unit testing moving into Pig as a first-class citizen.
On Jul 21, 2010, at 2:11 AM, Dmitriy Ryaboy wrote:

> Corbin,
> Have you looked at PigUnit? https://issues.apache.org/jira/browse/PIG-1404
>
>
> On Tue, Jul 20, 2010 at 11:07 PM, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
>
>> Okay, no attachments. Try this gist:
>>
>> http://gist.github.com/484135
>>
>> On Jul 21, 2010, at 12:02 AM, Corbin Hoenes wrote:
>>
>>> Trying to attach the PigRunner class in case that helps give you a start using register script.
>>>
>>>
>>>
>>> On Jul 20, 2010, at 11:56 PM, Corbin Hoenes wrote:
>>>
>>>> Hey Todd, we run entire Pig scripts through some helper classes we built. Basically, they preprocess the variables and then call register script. A test looks like this:
>>>>
>>>>  @Before
>>>>  public void setUp() throws Exception {
>>>>      Helper.delete(OUT_FILE);
>>>>      runner = new PigRunner();
>>>>  }
>>>>
>>>>
>>>>  @Test
>>>>  public void testRecordCount() throws Exception {
>>>>      runner.execute("myscript.pig", "param1=foo", "param2=bar");
>>>>
>>>>      Iterator<Tuple> tuples = runner.getPigServer().openIterator("foo");
>>>>      assertEquals(41L, Helper.countTuples(tuples));
>>>>  }
>>>>
>>>> It's been very useful for us to test this way. Would love to see more chatter about other techniques.
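The helper classes mentioned in that message are only linked, not shown in the thread. The following is a minimal sketch of what the parameter preprocessing and tuple counting might look like, assuming `name=value` arguments and `$name` placeholders in the script text. The class `ScriptHelper` and all of its methods are hypothetical names introduced for illustration; they are not the PigRunner or Helper classes from the gist.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the code from the gist.
class ScriptHelper {

    // Matches $name where name starts with a letter or underscore, so
    // positional references such as $0 are left untouched.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Parses "name=value" arguments into an ordered map.
    static Map<String, String> parseParams(String... params) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String p : params) {
            int eq = p.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected name=value, got: " + p);
            }
            result.put(p.substring(0, eq), p.substring(eq + 1));
        }
        return result;
    }

    // Replaces each known $name placeholder with its value.
    // Placeholders without a matching parameter are kept as-is.
    static String substitute(String script, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Counts the remaining elements of any iterator, for example the
    // tuples returned for an alias.
    static long count(Iterator<?> it) {
        long n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, String> params = parseParams("param1=foo", "param2=bar");
        System.out.println(substitute("a = load '$param1';", params));
    }
}
```

The substituted text would then be handed to whatever mechanism actually runs the script. Doing the substitution as a separate, pure step keeps it testable without starting Pig at all.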
>>>>
>>>> On Jul 20, 2010, at 3:26 PM, ToddG wrote:
>>>>
>>>>
>>>>> I'd like to include running various PIG scripts in my continuous build system. Of course, I'll only use small datasets for this, and in the beginning, I'll only target a local machine instance. However, this brings up several questions:
>>>>>
>>>>>
>>>>> Q: What's the best way to run PIG from Java? Here's what I'm doing, following a pattern I found in some of the Pig tests:
>>>>>
>>>>> 1. Create Pig resources in a base class (shamelessly copied from PigExecTestCase):
>>>>>
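>>>>> // Assumes: import static org.apache.pig.ExecType.LOCAL;
>>>>> //          import static org.apache.pig.ExecType.MAPREDUCE;
>>>>> // The execType field was not declared in the original post; the LOCAL default is assumed.
>>>>> protected ExecType execType = LOCAL;
>>>>> protected MiniCluster cluster;
>>>>> protected PigServer pigServer;
>>>>>
>>>>> @Before
>>>>> public void setUp() throws Exception {
>>>>>     // Let the build override the execution mode via the test.exectype system property
>>>>>     String execTypeString = System.getProperty("test.exectype");
>>>>>     if (execTypeString != null && execTypeString.length() > 0) {
>>>>>         execType = PigServer.parseExecType(execTypeString);
>>>>>     }
>>>>>     if (execType == MAPREDUCE) {
>>>>>         // MapReduce mode runs against a mini cluster built for the test
>>>>>         cluster = MiniCluster.buildCluster();
>>>>>         pigServer = new PigServer(MAPREDUCE, cluster.getProperties());
>>>>>     } else {
>>>>>         pigServer = new PigServer(LOCAL);
>>>>>     }
>>>>> }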
>>>>>
>>>>> 2. Test classes subclass this to get access to the MiniCluster and PigServer (copied from TestPigSplit):
>>>>>
>>>>> @Test
>>>>> public void notestLongEvalSpec() throws Exception{
>>>>>     inputFileName = "notestLongEvalSpec-input.txt";
>>>>>     createInput(new String[] {"0\ta"});
>>>>>
>>>>>     pigServer.registerQuery("a = load '" + inputFileName + "';");
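>>>>>     // Stack 500 filters on the same alias to build a long eval spec
>>>>>     for (int i = 0; i < 500; i++) {
>>>>>         pigServer.registerQuery("a = filter a by $0 == '1';");
>>>>>     }
>>>>>     Iterator<Tuple> iter = pigServer.openIterator("a");
>>>>>     // The only input row has $0 == '0', so any returned tuple is a failure
>>>>>     while (iter.hasNext()) {
>>>>>         throw new Exception();
>>>>>     }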
>>>>> }
>>>>>
>>>>> 3. ERROR
>>>>>
>>>>> This pattern works for simple PIG directives, but I want to load entire Pig scripts that contain REGISTER and DEFINE directives, and then pigServer.registerQuery() fails with:
>>>>>
>>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unrecognized alias REGISTER
>>>>> at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1170)
>>>>> at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
>>>>> at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
>>>>> at org.apache.pig.PigServer.registerQuery(PigServer.java:441)
>>>>> at com.audiencescience.apollo.reporting.NetworkRevenueReportTest.shouldParseNetworkRevenueReportScript(NetworkRevenueReportTest.java:74)
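The trace above shows registerQuery() rejecting a REGISTER statement. One possible workaround, sketched here under stated assumptions, is to separate REGISTER statements from the rest of the script before anything reaches registerQuery(). The jar paths could then go to `PigServer.registerJar(String)` and the remaining statements to `registerQuery()`. The class `ScriptSplitter` is a hypothetical name, the split on `;` is naive (it assumes no semicolons inside string literals or comments), and whether registerQuery() accepts DEFINE is not established in this thread, so DEFINE is simply passed through with the other statements.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Pig.
class ScriptSplitter {
    // Jar paths taken from REGISTER statements, in script order.
    final List<String> jars = new ArrayList<>();
    // All other statements, each terminated with ';', in script order.
    final List<String> queries = new ArrayList<>();

    static ScriptSplitter split(String script) {
        ScriptSplitter result = new ScriptSplitter();
        // Naive split: assumes ';' never appears inside a literal or comment.
        for (String raw : script.split(";")) {
            String stmt = raw.trim();
            if (stmt.isEmpty()) {
                continue;
            }
            if (isRegister(stmt)) {
                result.jars.add(stripQuotes(stmt.substring(8).trim()));
            } else {
                result.queries.add(stmt + ";");
            }
        }
        return result;
    }

    // True when the statement starts with the keyword REGISTER (any case)
    // followed by whitespace.
    private static boolean isRegister(String stmt) {
        return stmt.length() > 8
                && stmt.regionMatches(true, 0, "register", 0, 8)
                && Character.isWhitespace(stmt.charAt(8));
    }

    // Removes one pair of surrounding single quotes, if present.
    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        ScriptSplitter s = split("REGISTER my.jar;\na = load 'in';");
        System.out.println("jars=" + s.jars + " queries=" + s.queries);
    }
}
```

A test base class could loop over `jars` and `queries` and forward each entry to the PigServer instance created in setUp(). For anything beyond simple scripts, a real script runner such as the register-script approach mentioned earlier in the thread is likely the more robust choice.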