HBase >> mail # user >> How to design a data warehouse in HBase?


Re: How to design a data warehouse in HBase?
Oh yes, Impala. Good point by Kevin.

Kevin: Would it be appropriate to say that I should go for Impala if my
data is not going to increase dramatically over time, or if I have to work
on only a subset of my big data? Since Impala uses MPP, it may
require specialized hardware tuned for CPU, storage and network performance
for better results, which could become a problem if I have to upgrade the
hardware frequently because of the growing data.

Regards,
    Mohammad Tariq

On Thu, Dec 13, 2012 at 8:17 PM, Kevin O'dell <[EMAIL PROTECTED]> wrote:

> To Mohammad's point: you can use HBase for quick scans of the data, Hive
> for your longer-running jobs, and Impala over the two for quick ad hoc
> searches.
>
> On Thu, Dec 13, 2012 at 9:44 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
> > I am not saying HBase is not good. My point was to consider Hive as well.
> > Think about the approach with both tools in mind, and then decide. I just
> > provided an option based on the built-in features Hive offers. I
> > would like to add one more point here: you can map your HBase tables to
> > Hive.
> >
> > Regards,
> >     Mohammad Tariq
> >
> >
> >
> > On Thu, Dec 13, 2012 at 7:58 PM, bigdata <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, Tariq
> > > Thanks for your feedback. Actually, we now have two ways to reach the
> > > target: Hive and HBase. Could you tell me why HBase is not good for
> > > my requirements? Or what is the problem with my solution?
> > > Thanks.
> > >
> > > > From: [EMAIL PROTECTED]
> > > > Date: Thu, 13 Dec 2012 15:43:25 +0530
> > > > Subject: Re: How to design a data warehouse in HBase?
> > > > To: [EMAIL PROTECTED]
> > > >
> > > > Both serve different purposes. People normally say that Hive is slow,
> > > > but that is just because it uses MapReduce under the hood. And I'm sure that
> > > > if the data stored in HBase is very large, nobody would write sequential
> > > > programs for Get or Scan. Instead they would write MR jobs or do something
> > > > similar.
> > > >
> > > > My point is that nothing can be 100% real time. Is that what you want? If
> > > > that is the case, I would never suggest Hadoop in the first place, as it's a
> > > > batch processing system and cannot be used like an OLTP system unless you
> > > > have thought of some additional stuff. Since you are talking about a
> > > > warehouse, I am assuming you are going to store and process gigantic
> > > > amounts of data. That's the only reason I suggested Hive.
> > > >
> > > > The whole point is that not everything is a solution for everything. One
> > > > size doesn't fit all. First, we need to analyze our particular use case.
> > > > The person who says Hive is slow might be correct, but only for his
> > > > scenario.
> > > >
> > > > HTH
> > > >
> > > > Regards,
> > > >     Mohammad Tariq
> > > >
> > > >
> > > >
> > > > On Thu, Dec 13, 2012 at 3:17 PM, bigdata <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi,
> > > > > I've heard that Hive's performance is too low: it accesses
> > > > > HDFS files and scans all the data to find one record, and HBase is
> > > > > much faster than it. Is that true?
> > > > >
> > > > >
> > > > > > From: [EMAIL PROTECTED]
> > > > > > Date: Thu, 13 Dec 2012 15:12:25 +0530
> > > > > > Subject: Re: How to design a data warehouse in HBase?
> > > > > > To: [EMAIL PROTECTED]
> > > > > >
> > > > > > Hi there,
> > > > > >
> > > > > >    If you are really planning a warehousing solution, then I would
> > > > > > suggest you have a look at Apache Hive. It provides warehousing
> > > > > > capabilities on top of a Hadoop cluster. It also provides
> > > > > > a SQL-like interface to the data stored in your warehouse, which would be
> > > > > > very helpful to you if you are coming from a SQL background.
> > > > > >
> > > > > > HTH
> > > > > >
> > > > > >
> > > > > >