Drill >> mail # dev >> git commit: First commit


tdunning@... 2012-09-03, 21:09
Michael Hausenblas 2012-09-03, 21:22
Jim Donofrio 2012-09-03, 21:49
Ted Dunning 2012-09-03, 22:09
Re: git commit: First commit
> use https://git-wip-us.apache.org/repos/asf/incubator-drill.git

Thanks, that worked.

Cheers,
  Michael

--
Michael Hausenblas
Ireland, Europe
http://mhausenblas.info/

On 3 Sep 2012, at 22:49, Jim Donofrio wrote:

> use https://git-wip-us.apache.org/repos/asf/incubator-drill.git
>
> On 09/03/2012 05:22 PM, Michael Hausenblas wrote:
>> Ted,
>>
>>> First commit
>> Cool ;)
>>
>> Tried to clone and got:
>>
>>  git clone git://git-wip-us.apache.org/repos/asf?p=incubator-drill.git repo
>>  Cloning into repo...
>>  git-wip-us.apache.org[0: 140.211.11.121]: errno=Operation timed out
>>  fatal: unable to connect a socket (Operation timed out)
>>
>> Also, it seems not to have been listed on http://git.apache.org/ yet - could that be the reason for me not being able to clone it?
>>
>> Cheers,
>>   Michael
>>
>> --
>> Michael Hausenblas
>> Ireland, Europe
>> http://mhausenblas.info/
>>
>> On 3 Sep 2012, at 22:09, [EMAIL PROTECTED] wrote:
>>
>>> Updated Branches:
>>>  refs/heads/master [created] 9229caa45
>>>
>>>
>>> First commit
>>>
>>> Project: http://git-wip-us.apache.org/repos/asf/incubator-drill/repo
>>> Commit: http://git-wip-us.apache.org/repos/asf/incubator-drill/commit/9229caa4
>>> Tree: http://git-wip-us.apache.org/repos/asf/incubator-drill/tree/9229caa4
>>> Diff: http://git-wip-us.apache.org/repos/asf/incubator-drill/diff/9229caa4
>>>
>>> Branch: refs/heads/master
>>> Commit: 9229caa45a32dc06625f2443b6a5d84ab0a4df10
>>> Parents:
>>> Author: Ted Dunning <[EMAIL PROTECTED]>
>>> Authored: Mon Sep 3 13:21:32 2012 -0700
>>> Committer: Ted Dunning <[EMAIL PROTECTED]>
>>> Committed: Mon Sep 3 13:21:32 2012 -0700
>>>
>>> ----------------------------------------------------------------------
>>> README.md |  127 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> 1 files changed, 127 insertions(+), 0 deletions(-)
>>> ----------------------------------------------------------------------
>>>
>>>
>>> http://git-wip-us.apache.org/repos/asf/incubator-drill/blob/9229caa4/README.md
>>> ----------------------------------------------------------------------
>>> diff --git a/README.md b/README.md
>>> new file mode 100644
>>> index 0000000..51772a9
>>> --- /dev/null
>>> +++ b/README.md
>>> @@ -0,0 +1,127 @@
>>> += Drill =
>>> +
>>> +This is a copy of the original proposal for Drill, for now.  Please edit and update as appropriate.
>>> +
>>> +== Abstract ==
>>> +Drill is a distributed system for interactive analysis of large-scale datasets, inspired by [[http://research.google.com/pubs/pub36632.html|Google's Dremel]].
>>> +
>>> +== Proposal ==
>>> +Drill is a distributed system for interactive analysis of large-scale datasets. Drill is similar to Google's Dremel, with the additional flexibility needed to support a broader range of query languages, data formats and data sources. It is designed to efficiently process nested data. It is a design goal to scale to 10,000 servers or more and to be able to process petabytes of data and trillions of records in seconds.
>>> +
>>> +== Background ==
>>> +Many organizations have the need to run data-intensive applications, including batch processing, stream processing and interactive analysis. In recent years open source systems have emerged to address the need for scalable batch processing (Apache Hadoop) and stream processing (Storm, Apache S4). In 2010 Google published a paper called "Dremel: Interactive Analysis of Web-Scale Datasets," describing a scalable system used internally for interactive analysis of nested data. No open source project has successfully replicated the capabilities of Dremel.
>>> +
>>> +== Rationale ==
>>> +There is a strong need in the market for low-latency interactive analysis of large-scale datasets, including nested data (eg, JSON, Avro, Protocol Buffers). This need was identified by Google and addressed internally with a system called Dremel.
>>> +
>>> +In recent years open source systems have emerged to address the need for scalable batch processing (Apache Hadoop) and stream processing (Storm, Apache S4). Apache Hadoop, originally inspired by Google's internal MapReduce system, is used by thousands of organizations processing large-scale datasets. Apache Hadoop is designed to achieve very high throughput, but is not designed to achieve the sub-second latency needed for interactive data analysis and exploration. Drill, inspired by Google's internal Dremel system, is intended to address this need.
周大 2012-09-04, 01:41
Ted Dunning 2012-09-04, 01:57