HBase >> mail # user >> How to design a data warehouse in HBase?


Re: How to design a data warehouse in HBase?
Oh yes, Impala. Good point by Kevin.

Kevin: Would it be appropriate to say that I should go for Impala if my
data is not going to increase dramatically over time, or if I have to work
on only a subset of my big data? Since Impala uses MPP, it may
require specialized hardware tuned for CPU, storage and network performance
for better results, which could become a problem if I have to upgrade the
hardware frequently because of the growing data.

Regards,
    Mohammad Tariq

On Thu, Dec 13, 2012 at 8:17 PM, Kevin O'dell <[EMAIL PROTECTED]> wrote:

> To Mohammad's point: you can use HBase for quick scans of the data, Hive
> for your longer-running jobs, and Impala over the two for quick ad hoc
> searches.
>
> On Thu, Dec 13, 2012 at 9:44 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
> > I am not saying HBase is not good. My point was to consider Hive as well.
> > Think about the approach keeping both tools in mind, and then decide. I just
> > provided an option, keeping in mind the available built-in Hive features. I
> > would like to add one more point here: you can map your HBase tables to
> > Hive.
> >
> > Regards,
> >     Mohammad Tariq
> >
> >
> >
> > On Thu, Dec 13, 2012 at 7:58 PM, bigdata <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, Tariq
> > > Thanks for your feedback. Actually, we now have two ways to reach the
> > > target: Hive and HBase. Could you tell me why HBase is not good for
> > > my requirements? Or what's the problem with my solution?
> > > Thanks.
> > >
> > > > From: [EMAIL PROTECTED]
> > > > Date: Thu, 13 Dec 2012 15:43:25 +0530
> > > > Subject: Re: How to design a data warehouse in HBase?
> > > > To: [EMAIL PROTECTED]
> > > >
> > > > Both have got different purposes. Normally people say that Hive is slow;
> > > > that's just because it uses MapReduce under the hood. And I'm sure that if
> > > > the data stored in HBase is very large, nobody would write sequential
> > > > programs for Get or Scan. Instead they will write MR jobs or do something
> > > > similar.
> > > >
> > > > My point is that nothing can be 100% real time. Is that what you want? If
> > > > that is the case I would never suggest Hadoop in the first place, as it's a
> > > > batch processing system and cannot be used like an OLTP system unless you
> > > > have thought of some additional stuff. Since you are talking about a
> > > > warehouse, I am assuming you are going to store and process gigantic
> > > > amounts of data. That's the only reason I had suggested Hive.
> > > >
> > > > The whole point is that not everything is a solution for everything. One
> > > > size doesn't fit all. First, we need to analyze our particular use case.
> > > > The person who says Hive is slow might be correct, but only for his
> > > > scenario.
> > > >
> > > > HTH
> > > >
> > > > Regards,
> > > >     Mohammad Tariq
> > > >
> > > >
> > > >
> > > > On Thu, Dec 13, 2012 at 3:17 PM, bigdata <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi,
> > > > > I've been told that Hive's performance is too low. It accesses
> > > > > HDFS files and scans all the data to find one record. Is it true? And
> > > > > HBase is much faster than it.
> > > > >
> > > > >
> > > > > > From: [EMAIL PROTECTED]
> > > > > > Date: Thu, 13 Dec 2012 15:12:25 +0530
> > > > > > Subject: Re: How to design a data warehouse in HBase?
> > > > > > To: [EMAIL PROTECTED]
> > > > > >
> > > > > > Hi there,
> > > > > >
> > > > > >    If you are really planning a warehousing solution, then I would
> > > > > > suggest you have a look at Apache Hive. It provides you with warehousing
> > > > > > capabilities on top of a Hadoop cluster. Along with that, it also provides
> > > > > > an SQL-like interface to the data stored in your warehouse, which would be
> > > > > > very helpful to you in case you are coming from an SQL background.
> > > > > >
> > > > > > HTH
> > > > > >
> > > > > >
> > > > > >
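
The question quoted above (does finding one record mean scanning all the data, and is keyed access faster?) can be illustrated with a toy model. The sketch below is plain Python and uses neither the Hive nor the HBase API; the names `records`, `full_scan_lookup` and `keyed_lookup` are made up for illustration. It only contrasts reading every record with a binary search over keys kept in sorted order, which is the general idea behind a store that orders rows by key.

```python
import bisect

# Toy data: 100,000 (row_key, value) records, already sorted by key.
# Purely illustrative; this models neither the Hive nor the HBase API.
records = [(f"row{i:06d}", i * 2) for i in range(100_000)]


def full_scan_lookup(rows, key):
    """Read records one by one until the key is found.

    Returns (value, rows_read) so the cost of the scan is visible.
    """
    rows_read = 0
    for row_key, value in rows:
        rows_read += 1
        if row_key == key:
            return value, rows_read
    return None, rows_read


def keyed_lookup(sorted_keys, rows, key):
    """Binary-search a sorted key list and return the matching value.

    Needs about log2(n) comparisons instead of up to n reads.
    """
    idx = bisect.bisect_left(sorted_keys, key)
    if idx < len(sorted_keys) and sorted_keys[idx] == key:
        return rows[idx][1]
    return None


sorted_keys = [row_key for row_key, _ in records]
target = "row099999"  # worst case for the scan: the last record

scan_value, rows_read = full_scan_lookup(records, target)
keyed_value = keyed_lookup(sorted_keys, records, target)
print(scan_value, rows_read, keyed_value)
```

Both lookups return the same value; the difference is how many records have to be touched to get there. As noted in the replies above, this says nothing about which tool fits a given workload: a job that has to aggregate over most of the data reads everything either way.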