Re: Announcement of Project Panthera: Better Analytics with SQL, MapReduce and HBase
Stack 2012-09-17, 20:08
On Mon, Sep 17, 2012 at 6:55 AM, Dai, Jason <[EMAIL PROTECTED]> wrote:
> I'd like to announce Project Panthera, our open source efforts that showcase better data analytics capabilities on Hadoop/HBase (through both SW and HW improvements), available at https://github.com/intel-hadoop/project-panthera.
> 2) A document store (built on top of HBase) for better query processing
> Under Project Panthera, we will gradually make our implementation of the document store available as an extension to HBase (https://github.com/intel-hadoop/hbase-0.94-panthera). Specifically, today's release provides document store support in HBase by utilizing co-processors, which brings up-to 3x reduction in storage usage and up-to 1.8x speedup in query processing. Going forward, we will also use HBase-6800<https://issues.apache.org/jira/browse/HBASE-6800> as the umbrella JIRA to track our efforts to get the document store idea reviewed and hopefully incorporated into Apache HBase.
Thanks for open sourcing this stuff Jason. It looks great.
I took a quick look. Like Andy, I see that Panthera -- great name by
the way, J-D is playing Pantera (too!) loud here in our space since
this note showed up on the list -- includes a full HBase. Do you have
to deliver Panthera that way? Can we help make it so you do not need
to include HBase core? Do you have a list of things we need to change
so you can go downstream of core?
Good on you Jason,