Re: Drill Masters Project
Thanks Jacques.  I'm very happy to get involved and share my experiences.

I'm looking for the best way to set up a cluster now.  In terms of
evaluating Drill's performance, do you think it's especially important to
have a system that would be close in performance to a production cluster,
or would it be worthwhile exploring it on a small scale?  The problem is
that, as a student, my budget is limited, so I'm exploring things like
Raspberry Pi clusters, which I think don't have linear performance
improvements as you scale out.  I'm also enquiring about EC2 or GCE student
licensing.

On 29 August 2013 05:08, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> A Hadoop cluster would be a good start.  We're in the process right now of
> putting together distributable files which will help get you up to speed
> quickly.  Contribution isn't just code, there are many types and I'm sure
> you can help in any number of ways.  Just documenting your early
> experiences and advice would be a great way to start helping out.
>
> Jacques
>
>
> On Sun, Aug 25, 2013 at 1:25 PM, Tom Seddon <[EMAIL PROTECTED]>
> wrote:
>
> > Hi,
> >
> > I'm looking to do a dissertation on Drill, as part of a master's degree in
> > Data Science.  I'm hoping to set up a cluster to run it and then analyse
> > its efficiency with different datasets, as well as make recommendations for
> > its usage. I know Drill is in a fairly early stage of development but I
> > have around 18 months until the project is due, so I'm hoping the timing
> > will work as Drill is developed further.
> >
> > I'd be grateful for any advice on how I could get started on this.  Would a
> > Hadoop cluster be a good back-end to base my project on or would something
> > more suited to nested data like MongoDB be more appropriate?  Also, I
> > haven't found much documentation on configuring Drill in a distributed
> > environment, so any help on this would be appreciated.
> >
> > I'd also be willing to contribute but I'm not sure if I have enough Java
> > experience.  My background is mainly in BI and database technologies.
> >
> > Thanks,
> >
> > Tom
> >
>