Pig >> mail # user >> PIG + Junit


Re: PIG + Junit
Dmitriy,

Nope, that is new to me. Thanks for pointing it out. I've been using this home-grown class since Pig 0.5, and I really like the idea of unit testing moving into Pig as a first-class citizen.
On Jul 21, 2010, at 2:11 AM, Dmitriy Ryaboy wrote:

> Corbin,
> Have you looked at PigUnit? https://issues.apache.org/jira/browse/PIG-1404
>
>
> On Tue, Jul 20, 2010 at 11:07 PM, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
>
>> okay no attachments...try this gist:
>>
>> http://gist.github.com/484135
>>
>> On Jul 21, 2010, at 12:02 AM, Corbin Hoenes wrote:
>>
>>> Trying to attach the PigRunner class in case that helps give you a start using register script.
>>>
>>>
>>>
>>> On Jul 20, 2010, at 11:56 PM, Corbin Hoenes wrote:
>>>
>>>> Hey Todd, we run tests against entire Pig scripts with some helper classes we built. Basically, they preprocess the variables and then call register script. The test looks like this:
>>>>
>>>>  @Before
>>>>  public void setUp() throws Exception {
>>>>      Helper.delete(OUT_FILE);
>>>>      runner = new PigRunner();
>>>>  }
>>>>
>>>>
>>>>  @Test
>>>>  public void testRecordCount() throws Exception {
>>>>     runner.execute("myscript.pig", "param1=foo","param2=bar");
>>>>
>>>>     Iterator<Tuple> tuples = runner.getPigServer().openIterator("foo");
>>>>     assertEquals(41L, Helper.countTuples(tuples));
>>>>  }
>>>>
>>>> It's been very useful for us to test this way. Would love to see more chatter about other techniques.
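
The PigRunner helper itself is not shown in this thread, so the following is only a minimal sketch of what the "preprocess the variables" step might look like. It assumes the script uses `$name` placeholders and that parameters arrive as `key=value` strings, as in the `runner.execute(...)` call above. The class and method names are hypothetical, and Pig's own parameter substitution supports more than this (defaults, for example).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: replaces $name placeholders in Pig script text. */
final class PigParamSubstitution {

    // Matches $name where the name starts with a letter or underscore,
    // so positional references such as $0 are left untouched.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    /** Parses "key=value" strings into an ordered map. */
    static Map<String, String> parseParams(String... params) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String p : params) {
            int eq = p.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + p);
            }
            result.put(p.substring(0, eq), p.substring(eq + 1));
        }
        return result;
    }

    /** Returns the script with every $name replaced; fails on unknown names. */
    static String substitute(String script, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for parameter: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Failing fast on an unknown placeholder is a design choice for tests: a typo in a parameter name then surfaces as a test error rather than as a confusing parse failure later.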
>>>>
>>>> On Jul 20, 2010, at 3:26 PM, ToddG wrote:
>>>>
>>>>
>>>>> I'd like to include running various PIG scripts in my continuous build system. Of course, I'll only use small datasets for this, and in the beginning, I'll only target a local machine instance. However, this brings up several questions:
>>>>>
>>>>>
>>>>> Q: What's the best way to run PIG from Java? Here's what I'm doing, following a pattern I found in some of the Pig tests:
>>>>>
>>>>> 1. Create Pig resources in a base class (shamelessly copied from PigExecTestCase):
>>>>>
>>>>> protected MiniCluster cluster;
>>>>> protected PigServer pigServer;
>>>>>
>>>>> @Before
>>>>> public void setUp() throws Exception {
>>>>>
>>>>>     String execTypeString = System.getProperty("test.exectype");
>>>>>     if(execTypeString!=null && execTypeString.length()>0){
>>>>>         execType = PigServer.parseExecType(execTypeString);
>>>>>     }
>>>>>     if(execType == MAPREDUCE) {
>>>>>         cluster = MiniCluster.buildCluster();
>>>>>         pigServer = new PigServer(MAPREDUCE, cluster.getProperties());
>>>>>     } else {
>>>>>         pigServer = new PigServer(LOCAL);
>>>>>     }
>>>>> }
>>>>>
>>>>> 2. Test classes subclass this to get access to the MiniCluster and PigServer (copied from TestPigSplit):
>>>>>
>>>>> @Test
>>>>> public void notestLongEvalSpec() throws Exception{
>>>>>     inputFileName = "notestLongEvalSpec-input.txt";
>>>>>     createInput(new String[] {"0\ta"});
>>>>>
>>>>>     pigServer.registerQuery("a = load '" + inputFileName + "';");
>>>>>     for (int i=0; i< 500; i++){
>>>>>         pigServer.registerQuery("a = filter a by $0 == '1';");
>>>>>     }
>>>>>     Iterator<Tuple> iter = pigServer.openIterator("a");
>>>>>     while (iter.hasNext()){
>>>>>         throw new Exception();
>>>>>     }
>>>>> }
>>>>>
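
The `createInput` helper called in the test above is not shown in the thread. A minimal sketch of what such a helper might do in local mode follows, assuming it only needs to write the given lines to a file the script can load. The class name, the prefix argument, and the use of a temp file are assumptions; with a MiniCluster the file would presumably have to be written to the cluster's file system instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical local-mode stand-in for the createInput helper used above. */
final class TestInputFiles {

    /** Writes one line per array element to a new temp file and returns its path. */
    static Path createInput(String prefix, String[] lines) throws IOException {
        Path file = Files.createTempFile(prefix, ".txt");
        // Ask the JVM to remove the file on exit so test runs do not accumulate data.
        file.toFile().deleteOnExit();
        Files.write(file, Arrays.asList(lines), StandardCharsets.UTF_8);
        return file;
    }
}
```

The returned path can then be spliced into the `load` statement in place of `inputFileName`.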
>>>>> 3. ERROR
>>>>>
>>>>> This pattern works for simple PIG directives, but I want to load up entire Pig scripts, which have REGISTER and DEFINE directives. When I do, pigServer.registerQuery() fails with:
>>>>>
>>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unrecognized alias REGISTER
>>>>> at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1170)
>>>>> at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
>>>>> at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
>>>>> at org.apache.pig.PigServer.registerQuery(PigServer.java:441)
>>>>> at com.audiencescience.apollo.reporting.NetworkRevenueReportTest.shouldParseNetworkRevenueReportScript(NetworkRevenueReportTest.java:74)
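
The error above suggests that `registerQuery()` does not treat REGISTER as a Pig Latin statement. One possible workaround, sketched here under assumptions, is to split the script and send REGISTER lines to `PigServer.registerJar(String)` while forwarding everything else to `registerQuery()`. To keep the sketch runnable without Pig on the classpath, it talks to a hypothetical `Sink` interface rather than to PigServer directly. The statement splitting is deliberately naive, and whether DEFINE is accepted by `registerQuery()` in a given Pig version is also an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical router: separates REGISTER statements from other script text. */
final class PigScriptRouter {

    /** Hypothetical sink; a real one would delegate to a PigServer instance. */
    interface Sink {
        void registerJar(String path);
        void registerQuery(String statement);
    }

    // REGISTER followed by a single path token, case-insensitive.
    private static final Pattern REGISTER =
            Pattern.compile("(?i)register\\s+(\\S+)");

    static void route(String script, Sink sink) {
        // Naive split: breaks on semicolons inside quoted strings or comments.
        for (String raw : script.split(";")) {
            String stmt = raw.trim();
            if (stmt.isEmpty()) {
                continue;
            }
            Matcher m = REGISTER.matcher(stmt);
            if (m.matches()) {
                sink.registerJar(stripQuotes(m.group(1)));
            } else {
                // Restore the terminator that split() removed.
                sink.registerQuery(stmt + ";");
            }
        }
    }

    /** Removes one pair of surrounding single quotes, if present. */
    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }
}
```

In a test, a `Sink` implementation would wrap the `pigServer` field from the base class and pass each call through, so the rest of the pattern in steps 1 and 2 stays unchanged.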