Pig, mail # user - Re: Is pig maddening to work with because it's so slow? - 2014-05-20, 19:40
Seconded for PigUnit.

As for a faster debugging procedure, I've gone modular. First, I JUnit-test
individual UDFs against their functional requirements and use cases up
front. Then I mock up my whiteboard workflow as multiple logical blocks
(one Pig script file per block), start pig -x local, and run each aliased
line one by one within each block, with a DESCRIBE after each. This
confirms that the syntax, schemas, re-aliasing, etc. are correct, and you
can merge the blocks back together for optimization once they are
completed.

Once a block is completed, you can also run ILLUSTRATE on it to spot-check
results. Be forewarned, though: I've had ILLUSTRATE fail prematurely on
larger scripts because of their complexity.
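
To make that concrete, here is a minimal sketch of one logical block as I
might step through it in the grunt shell after starting pig -x local. The
input path, field names, and the 1000 ms threshold are made up for
illustration; swap in your own data and schema.

```pig
-- Hypothetical block: count slow requests per user.
-- Enter one statement at a time and DESCRIBE each new alias.

-- Path and schema below are assumptions, not from a real job.
raw = LOAD 'input/sample.tsv' USING PigStorage('\t')
      AS (user:chararray, url:chararray, ms:long);
DESCRIBE raw;       -- confirm the declared schema was picked up

slow = FILTER raw BY ms > 1000L;
DESCRIBE slow;      -- FILTER should leave the schema unchanged

by_user = GROUP slow BY user;
DESCRIBE by_user;   -- expect a 'group' field plus a bag named 'slow'

counts = FOREACH by_user GENERATE group AS user, COUNT(slow) AS n_slow;
DESCRIBE counts;    -- check the re-aliased field names and types

-- Once the block is complete, spot-check it on sampled rows.
ILLUSTRATE counts;
```

Nothing runs as a job until the ILLUSTRATE at the end, so the DESCRIBE
calls come back almost immediately and catch schema or aliasing mistakes
before you pay for an actual run.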

Hope this helps,

-Dan
On Tue, May 20, 2014 at 3:26 PM, Suraj Nayak <[EMAIL PROTECTED]> wrote:
 