Drill, mail # user - meeting notes 10/22/13


Re: meeting notes 10/22/13
Michael Hausenblas 2013-10-22, 17:03

> Here are the notes from today's hangout. Michael, can you copy them into the Google Doc?
Thanks & done.

Cheers,
Michael

--
Michael Hausenblas
Ireland, Europe
http://mhausenblas.info/

On 22 Oct 2013, at 17:49, Jason Altekruse <[EMAIL PROTECTED]> wrote:

> Hello All,
>
> Here are the notes from today's hangout. Michael, can you copy them into the
> Google Doc?
>
> participants: Jacques, Michael Hausenblas, Lisen Mu, Yash Sharma, Jinfeng,
> Jason Altekruse, Harri, Steven Phillips, Timothy Chen, Julian Hyde
>
> New employee at MapR: Jinfeng
>    - couple more in the next month
>
> Jacques:
>    - merged limit
>    - clarify VVs (value vectors)
>        - never access internal state of VV when it is invalid
>    - release notes
>
> Steven:
>    - ordered partitioner
>        - abstract out distributed cache interface
>    - continue to work on spooling to disk
>
> Jason:
>    - semi-blocking
>        - look at sort and ordered hash partitioner
>
> Yash
>    - name of functions
>        - separate class for operators and functions for more clarity
>            - different operators have their own class files
>
> Lisen
>    - fork of Drill
>        - data pushed from leaves rather than pulled from root
>        - we have been thinking about this same problem
>            - don't want to wait for IO all the time
>            - pre-fetch rather than push
>            - in a join you might get pushed a huge amount of data when you aren't ready for it
>            - stream processing
>                - alternative concept around foreman
>                - not quite right for streams
>                - resource allocation
>                    - not as much for resource requirements
>        - HyperLogLog
>            - space saving
>            - acceptable - not precise
>        - data assembly - business logic
>            - approximations will be important to Drill
>            - no serious thinking about sampling
>            - certain types of scanners should support sampling
>                - hard for some scanners without reading all the data anyway
>                - HBase might be easier to do a scan
>            - doing it with their own business logic and statistics
>                - hard to generalize
>
> Hari
>    - not much for updates
>    - pick up with Amazon EC2 docs
>        - had problem where we need 8 gigs
>        - cannot get it running on free micro instance
>        - got it working removing the direct memory flag in POM
>        - Tim - out of memory exception right away
>            - was this with or without changing the option for direct memory?
>
> Tim
>    - wir patch in
>    - AMPLab Big Data Benchmark
>        - having numbers for performance evaluation
>        - set up on their repo for Drill datasets
>        - installing HDFS to all of the nodes
>        - doesn't look too complicated
>    - cannot submit SQL in distributed mode because of bad optimizer
>    - recent review board patches
>        - describe code more completely
>        - hard to review without docs
>        - Julian - single PowerPoint slide per operator
>        - Google Doc? like the logical plan doc
>
>
> Ben
>    - code gen portion of merging receiver
>    - no blockers
>        - getting to code review soon
>
> Julian
>    - joined Hortonworks
>    - working on Optiq
>    - helping Hive, but also working on Drill
>    - making Optiq everything it can be
>    - splitting JDBC into thin client
>        - thinking about it, no implementation yet
>        - right now pushing sorts down to Mongo
>    - Jacques - session next week on JDBC?
>    - roadmap on Optiq
>        - commit logs tell some of the story
>        - roadmap would be helpful
>        - will put out call for Optiq users like Drill
>        - put together feature list for next release(s)
>        - next 6 months: wants to stay agile, but also be more predictable
>        - Jinfeng will be working with optimizer and Optiq