Hadoop, mail # user - Announcement of Project Panthera: Better Analytics with SQL, MapReduce and HBase


Re: Announcement of Project Panthera: Better Analytics with SQL, MapReduce and HBase
Stack 2012-09-17, 20:08
On Mon, Sep 17, 2012 at 6:55 AM, Dai, Jason <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'd like to announce Project Panthera, our open source efforts that showcase better data analytics capabilities on Hadoop/HBase (through both SW and HW improvements), available at https://github.com/intel-hadoop/project-panthera.
>

...

> 2)      A document store (built on top of HBase) for better query processing
>    Under Project Panthera, we will gradually make our implementation of the document store available as an extension to HBase (https://github.com/intel-hadoop/hbase-0.94-panthera). Specifically, today's release provides document store support in HBase by using coprocessors, which brings up to a 3x reduction in storage usage and up to a 1.8x speedup in query processing. Going forward, we will also use HBASE-6800<https://issues.apache.org/jira/browse/HBASE-6800> as the umbrella JIRA to track our efforts to get the document store idea reviewed and hopefully incorporated into Apache HBase.
>

Thanks for open sourcing this stuff, Jason.  It looks great.

I took a quick look.  Like Andy, I see that Panthera -- great name by
the way, J-D is playing Pantera (too!) loud here in our space since
this note showed up on the list -- includes a full HBase.  Do you have
to deliver Panthera that way?  Can we help make it so you do not need
to include HBase core?  Do you have a list of things we need to change
so you can go downstream of core?

Good on you, Jason,
St.Ack