Accumulo, mail # dev - Cosmos - Accumulo-backed sorting, filtering and grouping of columnar data sets


Re: Cosmos - Accumulo-backed sorting, filtering and grouping of columnar data sets
Miguel Pereira 2013-09-05, 14:49
+1 for making things simple :D
On Mon, Sep 2, 2013 at 10:30 PM, Josh Elser <[EMAIL PROTECTED]> wrote:

> Since this is the community that's likely to be interested, I wanted to
> spread some word about a project I've been working on in my spare time:
> Cosmos.
>
> https://github.com/joshelser/cosmos
>
> The point of Cosmos is to provide an efficient, easy-to-use interface
> around Accumulo for the general purpose of counting and filtering of a data
> set. At a glance, it accepts Multimaps of data, and provides mechanisms to
> fetch records by column, fetch records by column with value filtering, and
> count unique values across records in a column (groupBy). It also contains
> a very simple internal timing/tracing API (much less granular than
> Accumulo's tracing library), and a (very) rough web interface for viewing
> said traces. Additionally, Cosmos contains a simple example of its API
> using a public dataset of ~350K records provided by the city of Chicago (
> https://data.cityofchicago.org/).
>
> Cosmos' design lends itself well to multiple users accessing the same
> Accumulo instance, deferring to Accumulo or ZooKeeper to do
> synchronization/persistence when necessary. It aims at abstracting some of
> the difficulty in using Accumulo away from the user to make the application
> developer's life a bit easier.
>
> And, as you'd expect, Apache licensed and compatible with Apache Accumulo
> 1.4.4 and 1.5.0.
>
> I'd love to hear what people think. Any feedback is welcome.
>
> - Josh
>
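
For readers who want a concrete picture of the three operations described above (fetch by column, fetch by column with value filtering, and groupBy counts), here is a rough in-memory sketch. It is not the Cosmos API and does not touch Accumulo: the function names, the dict-of-lists stand-in for a Multimap, and the sample values are all made up for illustration, and it assumes that groupBy means "count how many records carry each distinct value in a column".

```python
from collections import Counter, defaultdict

# Hypothetical stand-in, NOT the Cosmos API: everything lives in memory.
# A record is modeled as a multimap: column name -> list of values.


def make_record(pairs):
    """Build a multimap-style record from (column, value) pairs."""
    record = defaultdict(list)
    for column, value in pairs:
        record[column].append(value)
    return dict(record)


def fetch_by_column(records, column):
    """Return the records that have at least one value in `column`."""
    return [r for r in records if r.get(column)]


def fetch_by_column_value(records, column, value):
    """Return the records whose `column` contains `value`."""
    return [r for r in records if value in r.get(column, [])]


def group_by(records, column):
    """Count, for each distinct value in `column`, the records carrying it."""
    counts = Counter()
    for r in records:
        # set() so a record repeating a value is only counted once
        for v in set(r.get(column, [])):
            counts[v] += 1
    return dict(counts)


# Made-up sample data, only loosely shaped like a public city dataset.
records = [
    make_record([("type", "THEFT"), ("ward", "1")]),
    make_record([("type", "THEFT"), ("ward", "2")]),
    make_record([("type", "ASSAULT"), ("ward", "1"), ("ward", "3")]),
    make_record([("ward", "2")]),
]

print(fetch_by_column(records, "type"))
print(fetch_by_column_value(records, "ward", "1"))
print(group_by(records, "ward"))
```

A real implementation would push this work to the Accumulo tablet servers instead of scanning a Python list; the sketch only pins down what each operation returns.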