Drill >> mail # user >> Drill Masters Project
Re: Drill Masters Project
Thanks Jacques.  I'm very happy to get involved and share my experiences.

I'm looking for the best way to set up a cluster now.  In terms of
evaluating Drill's performance, do you think it's especially important to
have a system that would be close in performance to a production cluster,
or would it be worthwhile exploring it on a small scale?  The problem is that,
as a student, my budget is limited, so I'm exploring things like Raspberry Pi
clusters, which I don't think give linear performance improvements as you
scale out.  I'm also enquiring about EC2 or GCE student licensing.

On 29 August 2013 05:08, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> A Hadoop cluster would be a good start.  We're in the process right now of
> putting together distributable files which will help get you up to speed
> quickly.  Contribution isn't just code; there are many types, and I'm sure
> you can help in any number of ways.  Just documenting your early
> experiences and advice would be a great way to start helping out.
>
> Jacques
>
>
> On Sun, Aug 25, 2013 at 1:25 PM, Tom Seddon <[EMAIL PROTECTED]>
> wrote:
>
> > Hi,
> >
> > I'm looking to do a dissertation on Drill, as part of a master's degree in
> > Data Science.  I'm hoping to set up a cluster to run it and then analyse
> > its efficiency with different datasets, as well as make recommendations for
> > its usage. I know Drill is in a fairly early stage of development but I
> > have around 18 months until the project is due, so I'm hoping the timing
> > will work as Drill is developed further.
> >
> > I'd be grateful for any advice on how I could get started on this.  Would a
> > Hadoop cluster be a good back-end to base my project on or would something
> > more suited to nested data like MongoDB be more appropriate?  Also, I
> > haven't found much documentation on configuring Drill in a distributed
> > environment, so any help on this would be appreciated.
> >
> > I'd also be willing to contribute but I'm not sure if I have enough Java
> > experience.  My background is mainly in BI and database technologies.
> >
> > Thanks,
> >
> > Tom
> >
>