Drill >> mail # dev >> Introduction




> Michael Hausenblas is beginning to collect data sets and query examples for
> different plausible use cases ranging from small to large.  He should show
> up on the mailing list shortly and you could coordinate with him.
Welcome, Stefan - great to have you on board!

So the idea would be to compile a list of datasets along with typical (interesting) queries formulated in natural language. One thing we need to get this off the ground is the Wiki, but I gather Ted is on that.

Datasets that might be of interest include, but are not restricted to:

 * Wikipedia edit history from [1]
 * Census data (US, Eurostat, etc.)
 * AOL search logs
 * Enron emails [2]

Feel free to come up with additional ones as well.

I suppose we can continue the discussion (who looks into what) here on the list, and once the Wiki is available we can also coordinate there.

Cheers,
Michael

[1] http://en.wikipedia.org/wiki/Wikipedia:Database_download
[2] http://www.cs.cmu.edu/~enron/

--
Michael Hausenblas
Ireland, Europe
http://mhausenblas.info/

On 10 Jan 2013, at 10:19, Ted Dunning <[EMAIL PROTECTED]> wrote:

> Stefan,
>
> One of the key things to do right now is to work on use cases.
>
> Michael Hausenblas is beginning to collect data sets and query examples for
> different plausible use cases ranging from small to large.  He should show
> up on the mailing list shortly and you could coordinate with him.
>
> On Thu, Jan 10, 2013 at 5:45 AM, Siprell, Stefan
> <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>> I am working for an IT consulting agency in Germany. One of the goals of
>> our team for 2013 is active (as in giving) participation in the open source
>> community and offering our customers cutting-edge analytical tools for
>> large to huge databases. You guys hit the spot!
>>
>> I would like to start offering my personal help (volunteer work for now,
>> later I could pitch in a day or two per week perhaps) in any role which
>> would help. I am a somewhat strong enterprise Java developer, can deal
>> sufficiently well with HTML5 frontends, know most things about build
>> environments and testing and should be able to do some design or
>> documentation.
>>
>> Is there anything I can do?
>>
>> Stefan
>>