Re: [GSoC 2012] Self Introduction and interested projects
Shasha Liu 2012-04-06, 14:03
Based on the email discussions, I wrote my proposal for this Pig visualizer
project and submitted it to google-melange. Please take a look at it at
your convenience; further feedback and comments would be much appreciated.
Thank you very much.
On Sun, Mar 25, 2012 at 9:25 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> I suggest you create a simple, minimal web application that visualizes a
> pig script file each time a url with the script filename is loaded.
> For instance, the process to use the tool might go like this:
> 1) Run pigvisualizer.(pl/py/rb) locally, at the start of your pig work
> 2) Create a new pig script at /my/dir/filename.pig
> 3) Open http://localhost:4567/pigviz/my/dir/filename.pig in a web browser
> 5) Reload this web page each time you want to see a new visualization OR
> have the page try to reload the file periodically
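The workflow above could be sketched with nothing but Python's standard library. This is only a minimal illustration, not the proposed tool: the `/pigviz` prefix and port 4567 come from the example URL, while `script_path_from_url`, `PigVizHandler`, `serve`, and the plain-text response are hypothetical placeholders.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlparse

URL_PREFIX = "/pigviz"  # taken from the example URL in the steps above


def script_path_from_url(url_path: str) -> Optional[Path]:
    """Map /pigviz/my/dir/filename.pig to /my/dir/filename.pig.

    Returns None when the URL is outside the prefix or is not a .pig file.
    """
    path = unquote(urlparse(url_path).path)
    if not path.startswith(URL_PREFIX + "/") or not path.endswith(".pig"):
        return None
    return Path(path[len(URL_PREFIX):])


class PigVizHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: re-reads the script on every request,
    so reloading the page always reflects the latest version of the file."""

    def do_GET(self):
        script = script_path_from_url(self.path)
        if script is None or not script.is_file():
            self.send_error(404, "No such pig script")
            return
        body = script.read_text(encoding="utf-8").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        # Placeholder: a real tool would render the plan graph here
        # instead of echoing the script text.
        self.wfile.write(body)


def serve(port: int = 4567) -> None:
    """Serve on localhost only; call this to start the sketch."""
    HTTPServer(("localhost", port), PigVizHandler).serve_forever()
```

Calling `serve()` and opening http://localhost:4567/pigviz/my/dir/filename.pig would return the current contents of /my/dir/filename.pig. Periodic reloading would be left to the page itself.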
> There are several sources of data:
> 1) Start a pig session, via grunt, PigServer or HCatalog, and use
> ILLUSTRATE/EXPLAIN. An old example of doing this is available at
> 2) Use the explain or -dot commands from pig command line. In looking at
> the dot output, the graph is not as helpful as I had thought :(
> 3) Use the PigPen code to get ILLUSTRATE data for visualization
> The ideal situation is that you get the data plan via EXPLAIN, and sample
> data via ILLUSTRATE, and combine them to produce an even better version of
> figure 2 in the paper
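One way to picture the combination of plan and sample data is the sketch below. It assumes the plan is available as DOT text containing only simple `A -> B` edge statements, and that the sample rows have already been collected per alias. Real Pig dot output is likely richer (subgraphs, attributes, quoted labels with spaces) and would need a proper DOT parser. The names `parse_dot_edges` and `combine_plan_and_samples` and the output shape are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Matches simple DOT edge statements such as:  A -> B;  or  "A" -> "B";
EDGE_RE = re.compile(r'"?([\w.]+)"?\s*->\s*"?([\w.]+)"?')


def parse_dot_edges(dot_text: str) -> List[Tuple[str, str]]:
    """Return (source, target) pairs for every simple edge in the DOT text."""
    return EDGE_RE.findall(dot_text)


def combine_plan_and_samples(
    edges: List[Tuple[str, str]],
    samples: Dict[str, List[list]],
) -> dict:
    """Build a nodes/edges structure where each node carries its sample rows."""
    node_ids: Dict[str, None] = {}  # a dict keeps first-seen order
    for source, target in edges:
        node_ids.setdefault(source)
        node_ids.setdefault(target)
    return {
        "nodes": [{"id": n, "sample": samples.get(n, [])} for n in node_ids],
        "edges": [{"source": s, "target": t} for s, t in edges],
    }


# Illustrative input only; the aliases and rows are made up.
dot = "digraph plan {\n  A -> B;\n  B -> C;\n}"
graph = combine_plan_and_samples(parse_dot_edges(dot), {"A": [["bob", 20]]})
```

Aliases without sample rows get an empty `sample` list, so a front end can still draw the node.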
> [image: Inline image 1]
> As to the presentation of the data in an interface, I suggest you AVOID
> eclipse and the UI code to PigPen, as there is little utility in having
> this visualization there. Not all Pig users use Eclipse, and there is
> little utility in editing scripts in the diagrams. There is great utility
> in visualizing, understanding and debugging this way, but not so much in
> On the other hand, anyone can edit Pig in their favorite tool and view
> their pig graph in a simple web application on their localhost by directing
> a web browser at it. This is why a simple, small web application seems
> best. You can use ruby/sinatra or python/bottle/flask or perl/catalyst to
> make a simple web app. Check out sigma.js for graph visualization:
> http://sigmajs.org/examples.html or http://neyric.github.com/wireit/ for
> something more fully featured.
> Perhaps the best plan is to fix ILLUSTRATE (see
> http://wiki.apache.org/pig/ExampleGenerator and talk to the guys at
> mortardata.com who have a patch for this), and edit the PigPen code to
> remove the Eclipse dependencies and have it output simple JSON for a web
> application to consume. It could write to a file, or you could create a
> simple web service that publishes JSON for the current pig session.
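For the "write to a file" option, a minimal sketch might look like the following. It writes to a temporary file and then renames it, so a web page polling the file never reads a half-written document. The function name `write_graph_json` and the nodes/edges shape of the example graph are assumptions for illustration, not something taken from PigPen.

```python
import json
import os
import tempfile


def write_graph_json(graph: dict, out_path: str) -> None:
    """Write graph as JSON, swapping it into place in one step so that
    readers see either the old file or the new one, never a partial write."""
    directory = os.path.dirname(os.path.abspath(out_path))
    # Create the temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(graph, handle, indent=2)
        os.replace(tmp_path, out_path)
    except BaseException:
        os.unlink(tmp_path)  # clean up the temp file on any failure
        raise


# Illustrative graph only; the aliases are made up.
example_graph = {
    "nodes": [{"id": "A"}, {"id": "B"}],
    "edges": [{"source": "A", "target": "B"}],
}
```

A web application could then fetch the file at `out_path` on each reload. The web-service variant would return the same JSON from an HTTP handler instead.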
> Once we have JSON of ILLUSTRATE... getting a web visualization is easy. I
> can help, I've done it before in Cloud Stenography by parsing data in
> Grunt. Which you could do, btw. Old Perl code is available on github (see
> above link).
> Interested in thoughts of others.
> On Fri, Mar 23, 2012 at 11:21 PM, Shasha Liu <[EMAIL PROTECTED]> wrote:
>> Hi Daniel,
>> Thanks a lot for the reply.
>> I installed the latest Pig and read through the book of "programming in
>> I managed to use "-dot -out filename" to produce three graphs in dot file
>> Based on the existing dot file, my next question is what is the
>> requirement regarding a better visualizer?
>> Are we going to generate a picture (e.g., .png) for different plans
>> (logical plan, physical plan, map reduce plan), or provide a web interface
>> to visualize these graphs of plans?
>> Best regards,
>> Shasha(Amy) Liu
>> On Sun, Mar 18, 2012 at 3:30 AM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>>> See comments inline.