
Re: Drill Incubator report: can someone review and add to the wiki?

> Ellen and I put something together.  I need someone who has edit
> privileges to post it.  It is attached below:

By and large looks great! Two additions:

* As you can see from http://drill-user.org/ there were a few more HUGs/BUGs where Drill was presented/discussed (in Europe) - the blog itself might also be considered a contribution (?)
* We have published an article on Drill in the Big Data journal http://www.liebertpub.com/big


Michael Hausenblas
Ireland, Europe

On 5 Jun 2013, at 16:42, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> Hey y'all...
> Ellen and I put something together.  I need someone who has edit
> privileges to post it.  It is attached below:
> ---------------
> Apache: Project Drill
> Description:
> Apache Drill is a distributed system for interactive analysis of
> large-scale datasets that is based on Google's Dremel. Its goal is to
> efficiently process nested data, scale to 10,000 servers or more and
> to be able to process petabytes of data and trillions of records in
> seconds.
> Drill has been incubating since 2012-08-11.
> Three Issues to Address in Move to Graduation:
> 1. Continue to attract new developers with a variety of skills and viewpoints
> 2. Develop community skills and knowledge by building some releases
> 3. Demonstrate community robustness by rotating project tasks among
> multiple project members
> Issues to Call to Attention of PMC or ASF Board:
> none
> How community has developed since last report:
> Mailing list discussions:
> There has been active participation in discussions on the developer
> mailing list, including new participants and developers. A few have
> participated in the users list; mainly activity takes place on
> developer mailing list.
> Activity summary:
> http://mail-archives.apache.org/mod_mbox/incubator-drill-dev/
> June 2013 (to date, 5 June), 29 (mainly jira; some discussion)
> May 2013, 135 (jira; focused discussions)
> April 2013, 188 (jira; focused discussions)
> March 2013, 260 (jira; focused discussions)
> Topics in discussion on the dev mailing list included but not limited to:
> • Evolution of logical plan syntax with addition of operators
> including the Value and Union Distinct operators
> • Advantages and disadvantages of Parquet versus ORC
> • ValueVector construct and requirements
> • The relative performance of Janino-based compilation versus
> javax.tools.JavaCompiler
> • Initial development of execution engine environment
> • Discussion of various types of large array and off heap data
> structure libraries
> • RPC protocol and framework
> Code
> For details of code commits, see http://bit.ly/14YPXN9 and http://bit.ly/19IyID1
> There has been great progress around both evolution of the reference
> interpreter and
> In the last three months, there have been many commits including:
> • Initial implementation of RPC framework
> • Base client and ZooKeeper-based client abstraction
> • SQL parser with JDBC driver
> • Distributed query scheduling framework
> • ValueVector implementations
> • Large number of reference interpreter tests and fixes
> Community Interactions
> There is now a weekly Drill hangout, held remotely through Google
> Hangouts on Tuesday mornings at 9am Pacific Time, to keep core
> developers in contact in real time despite geographical separation.
> Results from these discussions are shared with the discussion list
> through meeting minutes, and all are welcome to attend.  This has been
> helpful in speeding development; attendance averages 8-10 developers
> each week.
> Presentations
> There have been presentations from community members at conferences,
> meet-ups and through the weekly Google hangout.
> Sample presentations:
> • Introduction to Apache Drill, Bay Area Analytics Group 2 April 2013
> by Tomer Shiran
> • Interactive Ad hoc query at scale: talk at Hadoop User Group UK by
> @mhausenblas
> • Apache Drill Technical Overview: talk at Google Hangout, May 22 by