Kiran 2013-04-28, 03:12
Em 2013-04-28, 11:12
shashwat shriparv 2013-04-28, 17:00
Mohammad Tariq 2013-04-28, 17:27
Kiran 2013-04-29, 03:39
anil gupta 2013-04-29, 05:21
Kiran 2013-04-29, 05:40
anil gupta 2013-04-29, 17:00
Mohammad Tariq 2013-04-29, 17:35

Re: HBase and Datawarehouse
> HBase is not really intended for heavy data crunching

Yes it is. This is why we have first-class MapReduce integration and
optimized scanners.

Recent versions, like 0.94, also do pretty well with the 'O' part of OLAP.

Urban Airship's Datacube is an example of a successful OLAP project
implemented on HBase: http://github.com/urbanairship/datacube

"Urban Airship uses the datacube project to support its analytics stack for
mobile apps. We handle about ~10K events per second per node."

Also there is Adobe's SaasBase:
http://www.slideshare.net/clehene/hbase-and-hadoop-at-adobe

Etc.
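
To make the pre-aggregation idea behind this kind of OLAP-on-HBase project
concrete, here is a minimal sketch. It is not Datacube's actual code or API.
The in-memory Counter is an assumed stand-in for an HBase table updated with
counter increments, and the dimension names (app, country, hour) are made up
for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dimensions for a mobile analytics event.
DIMENSIONS = ("app", "country", "hour")


def rollup_keys(event):
    """Yield one counter key per subset of dimensions (one cube cell each)."""
    for r in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, r):
            yield tuple((d, event[d]) for d in dims)


def record(cube, event):
    """Bump every cell the event belongs to.

    Against a real store this would be one counter increment per cell.
    """
    for key in rollup_keys(event):
        cube[key] += 1


cube = Counter()  # stand-in for the backing table
events = [
    {"app": "a1", "country": "US", "hour": "2013-04-29T17"},
    {"app": "a1", "country": "DE", "hour": "2013-04-29T17"},
    {"app": "a2", "country": "US", "hour": "2013-04-29T18"},
]
for e in events:
    record(cube, e)

# Queries become point lookups on precomputed cells, not scans of raw events.
total = cube[()]
a1_total = cube[(("app", "a1"),)]
us_a1 = cube[(("app", "a1"), ("country", "US"))]
```

The tradeoff is write amplification: each event touches 2^n cells for n
dimensions, which is why a real system would pick which rollups to maintain
rather than all of them.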

Where an HBase OLAP application will differ tremendously from a traditional
data warehouse is of course in the interface to the datastore. You have to
design and speak in the language of the HBase API, though Phoenix (
https://github.com/forcedotcom/phoenix) is changing that.
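
As a rough illustration of what "designing in the language of the HBase API"
means, the sketch below models a table as rows sorted by row key and answers
a query with a key-range scan instead of a SQL predicate. SortedTable and
row_key are hypothetical helpers written for this example, not HBase classes.
The only HBase behavior assumed is that rows are ordered by key and that a
scan takes an inclusive start row and an exclusive stop row.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in sorted order
        self._rows = {}   # row key -> value

    def put(self, row_key, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = value

    def scan(self, start, stop):
        """Yield (key, value) for start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


def row_key(app, day):
    # Hypothetical layout: app id first, so one app's days sit next to each other.
    return f"{app}|{day}".encode()


table = SortedTable()
table.put(row_key("a1", "20130428"), 120)
table.put(row_key("a2", "20130428"), 75)
table.put(row_key("a1", "20130429"), 95)
table.put(row_key("a1", "20130430"), 130)

# "Events for app a1 on April 28 and 29" is a key range, not a WHERE clause.
rows = list(table.scan(row_key("a1", "20130428"), row_key("a1", "20130430")))
total = sum(value for _, value in rows)
```

The point is that the access pattern has to be decided when the row key is
designed. A query by day across all apps would need a different key layout or
a second table.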

On Sun, Apr 28, 2013 at 10:21 PM, anil gupta <[EMAIL PROTECTED]> wrote:

> Hi Kiran,
>
> In HBase the data is denormalized, but at its core HBase is a KeyValue-based
> database meant for lookups or queries that expect a response in milliseconds.
> OLAP, i.e. data warehousing, usually involves heavy data crunching. HBase is
> not really intended for heavy data crunching. If you just want to store
> denormalized data and do simple queries then HBase is good. For OLAP kind
> of stuff, you can make HBase work but IMO you will be better off using Hive
> for data warehousing.
>
> HTH,
> Anil Gupta
>
>
> On Sun, Apr 28, 2013 at 8:39 PM, Kiran <[EMAIL PROTECTED]> wrote:
>
> > But in HBase, data can be said to be in a denormalised state, as the
> > methodology used for storage is a (column family:column) based flexible
> > schema. Also, from Google's Bigtable paper it is evident that HBase is
> > capable of doing OLAP. So where does the difference lie?
> >
> >
> >
> > --
> > View this message in context:
> > http://apache-hbase.679495.n3.nabble.com/HBase-and-Datawarehouse-tp4043172p4043216.html
> > Sent from the HBase User mailing list archive at Nabble.com.
> >
>

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Asaf Mesika 2013-04-30, 05:54
Andrew Purtell 2013-04-30, 08:07
Kevin Odell 2013-04-30, 12:01
Andrew Purtell 2013-04-30, 17:38
Amandeep Khurana 2013-04-30, 18:19
Andrew Purtell 2013-04-30, 18:36
Michael Segel 2013-04-30, 18:14
Andrew Purtell 2013-04-30, 18:30
Michael Segel 2013-04-30, 18:42
Michael Segel 2013-04-30, 13:17
James Taylor 2013-04-30, 06:28
Viral Bajaria 2013-04-30, 06:02