Pig >> mail # user >> Is pig maddening to work with because it's so slow?


Re: Is pig maddening to work with because it's so slow?
Seconded for PigUnit.

As for a faster debugging procedure, I've gone modular. First, I JUnit-test
individual UDFs against their functional requirements and use cases up
front. Then I mock up my whiteboard workflow as several logical blocks, one
Pig script file per block, start pig -x local, and enter each block's
aliased statements one at a time, with a DESCRIBE after each. This confirms
that the syntax, schemas, and any re-aliasing are what I intended. Once the
blocks are complete, you can merge them back together for optimization.

Once a block is completed, you can also run ILLUSTRATE on it to spot-check
results. Be forewarned, though: I've had ILLUSTRATE fail prematurely on
larger scripts due to their complexity.
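
As a rough sketch of that routine, a session started with pig -x local
might go something like the block below. The input path, aliases, and field
names are made up for illustration and are not from a real job; the point is
only the pattern of one statement, one DESCRIBE, and an ILLUSTRATE at the
end of the block.

```pig
-- Hypothetical block: count visits per user from a tab-separated file.
-- 'input/visits.tsv' and all field names are assumptions for this sketch.
raw = LOAD 'input/visits.tsv' USING PigStorage('\t')
      AS (user:chararray, url:chararray, ts:long);
DESCRIBE raw;        -- check that the declared schema is what was intended

by_user = GROUP raw BY user;
DESCRIBE by_user;    -- expect a 'group' field plus a bag named 'raw'

counts = FOREACH by_user GENERATE group AS user, COUNT(raw) AS visits;
DESCRIBE counts;     -- confirm the re-aliased output fields

-- Spot-check the finished block on sample data.
ILLUSTRATE counts;
```

If a DESCRIBE shows an unexpected schema, the problem is in the statement
just entered, which keeps each fix small before moving on to the next alias.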

Hope this helps,

-Dan
On Tue, May 20, 2014 at 3:26 PM, Suraj Nayak <[EMAIL PROTECTED]> wrote: