-
[gsoc2012] plan/data flow visualizer web interface - PIG-2586
manu [ranga] 2012-04-02, 22:12
I am interested in the $subject GSoC idea and am preparing a proposal.

A little background: I am an undergraduate from the University of Moratuwa, Sri Lanka. I did an internship last year at WSO2 [1], during which I did a data visualization project. It involved generating visualization gadgets for Hadoop job results, and it has been integrated into the WSO2 BAM 2 alpha [2] release so far. The technologies/frameworks involved were HTML5/CSS, JavaScript, Google Closure Tools/Library [3], WireIt [4], and jqPlot.

Let me describe the main use case as I understand it:
1. The user launches Pig from the command line, such as: pig -x local -webconsole 8085
2. The user opens http://localhost:8085 in a browser.
3. An emulated console is visible in the web page, through which one can execute the usual Grunt commands. Additional UI elements exist to upload a Pig Latin script, configuration, etc.
4. The output of each Grunt command is formatted using an HTML/JavaScript template, e.g. ILLUSTRATE will output a neat-looking HTML table.
5. Special emphasis is given in this project to formatting the EXPLAIN command. It will generate interactive HTML elements with additional metadata, e.g. clicking a FOREACH element will expand it with more visualization.

It would be very helpful if you could clarify some of the questions below:
1. Did I capture the main requirement of this project in the above scenario? Is there any other required function I missed?
2. Is allowing a web interface for Grunt too much (out of scope) for this project? Even though the project is about plan visualization, I thought this was the most elegant way to approach the problem.
3. Are the graphics (e.g. the DAG graph) supposed to be generated on the browser side or the Java side? My personal preference is for browser-generated graphics, since it will be easier to make them interactive.

(Sorry for the resend, Russell.)
Any comments/feedback would be appreciated, thanks.

[1] http://wso2.com/
[2] http://dist.wso2.org/products/bam/2.0.0-alpha2/wso2bam-2.0.0-ALPHA2.zip
[3] https://developers.google.com/closure/
[4] http://neyric.github.com/wireit/
--
by,
Manu
(R Chathura Manuranga Perera)
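Step 4 of the use case above (formatting command output through an HTML template) could look roughly like the following. This is only a sketch in Python under stated assumptions: the function name rows_to_html_table and its inputs (an alias, column names, and sample rows) are hypothetical and are not part of Pig or of the proposal, and how the rows would actually be obtained from Grunt is left open.

```python
from html import escape


def rows_to_html_table(alias, columns, rows):
    """Render sample rows for one alias as an HTML table string.

    Hypothetical helper for illustration only; not a Pig API.
    All values are HTML-escaped so record contents cannot inject markup.
    """
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return (
        f"<table><caption>{escape(alias)}</caption>"
        f"<thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
    )


if __name__ == "__main__":
    # Made-up sample data standing in for ILLUSTRATE output.
    print(rows_to_html_table("a", ["name", "age"], [["John", 18], ["Mary", 19]]))
```

Generating the markup from structured rows, rather than from pre-formatted console text, would also leave room for the browser-side interactivity raised in question 3.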
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
manu [ranga] 2012-04-05, 22:56
Hi Russell,
I have submitted a proposal at Melange.

I am aware that the Pig team has been very busy recently. I hope you'll be able to provide some feedback soon. Thanks
--
by,
Manu
(R Chathura Manuranga Perera)
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Daniel Dai 2012-04-06, 00:01
Hi, Russell,
You need to:
1. Send a request to [EMAIL PROTECTED] and [EMAIL PROTECTED] to become a mentor.
2. Create an account at http://www.google-melange.com/gsoc/homepage/google/gsoc2012
3. Apply as a mentor for Apache in google-melange.
4. Let me know your google-melange id so I can update the Melange id for you.

Thanks,
Daniel
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Dmitriy Ryaboy 2012-04-06, 00:09
Do we have a committer who's signed up to mentor this issue?
Russ can't, according to Apache's guidelines for GSoC (though he can certainly help advise / review code / etc -- just not be an official Apache foundation rep to Google regarding progress of the project).

D
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Russell Jurney 2012-04-06, 00:13
Ah, ok. I've been noodling on the problem and don't know Pig internals well enough to figure out a clean way to get at the data that is needed for this task. Someone familiar with HCatalog and how y'all are handling accessing Pig ILLUSTRATE data there would be a more appropriate mentor.
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Daniel Dai 2012-04-06, 00:36
It seems they changed the rule this year. Last year non-committers could mentor. I can be the backup if no one else wants to mentor this.

Daniel
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
manu [ranga] 2012-04-06, 09:04
Hi,
Thanks for the replies and feedback.
I was not aware of the mentor situation. I hope that can be resolved.
I have made the proposal public, so you should be able to see it without registration [1].

@Daniel, in reply to the Melange comment:
The reasons why I am suggesting a web interface, instead of adding a command like "explain -script 111.pig -graphics":
1. If we make a new Java Swing window pop up and show the plan diagram, users who work via remote access (SSH) will have to go through extra trouble to use it.
2. It will be easier to integrate into the web site of a cloud service provider (e.g. it can be exposed via Amazon's web console).
3. Room for feature expansion. This web UI can be extended to show other important information. For example, one can add a sidebar showing the structure of the working Hadoop file system (much like NERDTree).

Should I modify the proposal by adding a section comparing the Java GUI approach and the web approach?
Thanks, and feedback is much appreciated.

[1] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/manuranga/5002
--
by,
Manu
(R Chathura Manuranga Perera)
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Russell Jurney 2012-04-06, 14:40
Sorry for the delay, I am going over this now. I hope someone will step up to mentor-mentor me, as I think this is important work.

Russell Jurney http://datasyndrome.com
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Russell Jurney 2012-04-06, 16:10
Request sent.
Account created: Russell Jurney (link_id: rjurney)
Applied to mentor Apache in google-melange, link_id: rjurney
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
Russell Jurney 2012-04-06, 17:15
As to the merits of a web application:
1. It is easier to serve JSON containing ILLUSTRATE data than it is to draw primitives in Swing. Pages can be served with a lightweight web server. This server could get access to data from a modified Pig/Grunt in JSON format, or simply parse Grunt's output using the 'tee' command. Once the JSON is out of Grunt, there are numerous diagramming libraries for web applications, such as WireIt <http://neyric.github.com/wireit/> (great example here for XProc <http://feedscape.appspot.com/>), or you can roll your own using d3.js or Raphael.js for drawing and something like jGraphViz <http://jgraphviz.sourceforge.net/> for layout. You seem to be familiar with them, and that is great!
2. I agree a web app has more potential. We want the ILLUSTRATE diagram feature picked up by Elastic MapReduce <http://aws.amazon.com/elasticmapreduce/> and Mortar Data <http://mortardata.com/#!/easy_hadoop>.
3. I agree that many more people will be thrilled to work on a web app than on a Swing or Eclipse application. See: PigPen.

As to the proposal, I have this feedback:

I would focus on the functionality demonstrated in Figure 1 of the PigPen SIGMOD paper <http://research.yahoo.com/files/paper_5.pdf>: the visualization of data flows. There are enough issues with ILLUSTRATE itself and with getting the presentation right that this is a lot of work. For instance, when records are long or complex... there is meaty work in getting the presentation/visualization done well. It will be fun!

If you do go for a full-blown IDE, I suggest you do it as a stretch goal. In doing so, we should be cognizant of the work Mortar Data is doing in this area. You might try coordinating with them if that is possible. Be aware that this is a large project. There are existing JavaScript IDEs such as Cloud 9 <http://c9.io/> (source <https://github.com/ajaxorg/cloud9>) that you can use as a starting point for an in-browser editor. You could plug your ILLUSTRATE functionality into something like this.

I would encourage you to re-scope your proposal to focus on developing an in-browser visualization of ILLUSTRATE, with sample records, and then, once it stands on its own, on integrating it with an existing editor like Cloud 9. I apologize for the delay in my response. I am excited about this project and look forward to helping!
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
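As a rough illustration of the "parse Grunt's output" option in point 1 above, the sketch below turns a pipe-delimited console table into JSON that a browser-side diagramming library could consume. It is written in Python for brevity. The SAMPLE text is made up and is only assumed to resemble ILLUSTRATE's console output, and parse_illustrate_table is a hypothetical helper, not a Pig API. A modified Pig/Grunt that emits JSON directly would avoid this kind of text scraping altogether.

```python
import json

# Made-up text, assumed to resemble one alias block of ILLUSTRATE output.
SAMPLE = """
------------------------------------
| a     | name:chararray | age:int |
------------------------------------
|       | John           | 18      |
|       | Mary           | 19      |
------------------------------------
"""


def parse_illustrate_table(text):
    """Parse one pipe-delimited table into a JSON-serializable dict.

    Assumes the first row holds the alias followed by name:type columns,
    and that field values never contain the '|' character.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    # Keep only pipe-delimited rows; dashed rules are just separators.
    rows = [ln for ln in lines if ln.startswith("|") and ln.endswith("|")]
    if not rows:
        raise ValueError("no table rows found")
    cells = [[c.strip() for c in ln[1:-1].split("|")] for ln in rows]
    header, data = cells[0], cells[1:]
    fields = []
    for col in header[1:]:
        # Split on the first ':' only, so nested types stay intact.
        name, _, typ = col.partition(":")
        fields.append({"name": name.strip(), "type": typ.strip()})
    names = [f["name"] for f in fields]
    # Values are kept as strings; type conversion is left to the consumer.
    records = [dict(zip(names, row[1:])) for row in data]
    return {"alias": header[0], "fields": fields, "records": records}


if __name__ == "__main__":
    print(json.dumps(parse_illustrate_table(SAMPLE), indent=2))
```

Long or complex records, mentioned above as a presentation problem, are exactly where this naive splitting would break down, which is one argument for getting structured data out of Pig itself.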
-
Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586
manu [ranga] 2012-04-06, 23:26
Thanks for the reply. Unfortunately I saw it only after the proposal edit deadline had passed.

> I would focus on the functionality demonstrated in Figure 1 of the PigPen SIGMOD paper
I'll start reading the paper.

> There are existing javascript IDEs such as Cloud 9 (source)
I am familiar with online IDEs. As I mentioned briefly in the proposal, I was involved in starting such a project, called 'gadget ide' [1], at WSO2. It is a graph-based IDE aimed at dashboard authoring via specifying data flows, and it is a fairly large project (8000+ JavaScript SLOC).

> If you do go for a full-blown IDE, I suggest you do it as a stretch goal
Of course, I am planning to focus more on the visualization of commands. I mentioned the IDE only as a possible future extension. I didn't include it in the project timeline, but if time allows I am happy to attempt it. But as you have mentioned, the visualization part itself is a considerably large project.

> I would encourage you to re-scope your proposal
I was expecting EXPLAIN to be the major problem. I wasn't anticipating ILLUSTRATE to be a hard command to visualize. If so, I'd like to re-scope the project as you have suggested. How about completing ILLUSTRATE before the mid-term evaluation, and pushing the plan visualization to the second phase? Will that allocation be enough? I may have to cut back a little on the interactive features of the graphs. Please provide your suggestions on this matter.

I am aware that the team is busy with 0.10, so thanks for taking the time to provide feedback.

[1] (The last two screenshots are the most relevant) http://mackiemathew.com/2012/01/21/a-revolution-with-business-activity-monitor-bam-2-0/
--
by,
Manu
(R Chathura Manuranga Perera)