Re: Production usage stats/experience
So the operations for your URL service are resolving the key to the URL
and writing log-related information?

Do you think HBase could be used to build a web application like, say,
vBulletin?

On Sat, Jul 10, 2010 at 5:04 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> Funny you ask, because we are about to write a series of blog posts
> about our HBase usage, operational experiences, future plans, etc., for
> the new official HBase blog (hbaseblog.com).
>
> Still, I shared some details inline (but obviously this will be detailed
> later).
>
> J-D
>
> On Sat, Jul 10, 2010 at 11:04 AM,  <[EMAIL PROTECTED]> wrote:
> > Could you please share some #s? Like how many requests @ peak, data store size,
> > # of nodes in cluster, etc. (if you can reveal that, that is)?
>
> At peak, our production cluster answers around 16 to 22k requests per
> second (depending on the day). It's mostly atomic increments, which we
> use for real-time reporting. Our MapReduce cluster peaks at around
> 7-8M scanned rows per second during our nightly big table
> scan/recompute. We must have around 35B rows in all our tables
> together, but I need to run some counts to have the right number (our
> 2 biggest tables have just over 15B each).
>
> We have a total of 5 clusters, 20 machines each, configured with 2 i7s,
> 24GB of RAM and 4x1TB disks.
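
As a rough back-of-the-envelope reading of the hardware figures above (5 clusters of 20 machines, each with 24GB and 4x1TB), the sketch below totals the machines, memory and raw disk. The replication factor of 3 is an assumption (it is the HDFS default and is not stated in the thread), and the names `fleet_totals` and `ASSUMED_REPLICATION` are illustrative only.

```python
# Totals derived from the figures quoted in the thread.
CLUSTERS = 5
MACHINES_PER_CLUSTER = 20
RAM_GB_PER_MACHINE = 24
DISKS_PER_MACHINE = 4
TB_PER_DISK = 1

# ASSUMPTION: HDFS default replication factor; the thread does not state it.
ASSUMED_REPLICATION = 3


def fleet_totals(replication=ASSUMED_REPLICATION):
    """Return machine count, total RAM, raw disk and disk after replication."""
    machines = CLUSTERS * MACHINES_PER_CLUSTER
    ram_gb = machines * RAM_GB_PER_MACHINE
    raw_tb = machines * DISKS_PER_MACHINE * TB_PER_DISK
    # Upper bound only: ignores OS, logs and temporary MapReduce space.
    replicated_tb = raw_tb / replication
    return {
        "machines": machines,
        "ram_gb": ram_gb,
        "raw_tb": raw_tb,
        "replicated_tb": replicated_tb,
    }


if __name__ == "__main__":
    for name, value in fleet_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions that is 100 machines, 2400GB of RAM and 400TB of raw disk across all clusters; the post-replication figure is only an upper bound on storable data.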
>
> >
> > I'm also planning to use HBase for a realtime web app. I would like to get some input
> > on what to do if something goes wrong...
> > ..In development, if I see any issues, I do kill -9/stop all & rm -rf disk, due to time crunch
> > ..(bad idea)..Obviously I can't do that in production..
>
> Oh, we had worse issues than that. What about an unresponsive root disk
> that freezes your OS but not some processes?
>
> > -> Have you ever run into data corruption? ..where you could not recover any data?
>
> Nope, hurray for checksumming at the HDFS level.
>
> > -> If there is an outage & you have to restart servers, in what order do you restart them? (I presume
> > namenode/datanode, followed by HMaster, HRegionServer, followed by ZooKeeper, followed by the HBase client app)
>
> Hadoop, then ZooKeeper, then HBase, then the Thrift servers (our client is in PHP).
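
The start order given above (Hadoop, then ZooKeeper, then HBase, then the Thrift servers) can be sketched as a small helper that brings up one layer at a time and stops at the first one that fails to come up. This is only an illustration: `start` and `is_healthy` are hypothetical callbacks that you would supply for your own environment, not part of any Hadoop or HBase tooling.

```python
# Start order as described in the thread.
START_ORDER = ["Hadoop", "ZooKeeper", "HBase", "Thrift servers"]


def start_in_order(start, is_healthy, order=START_ORDER):
    """Start each service in sequence, refusing to continue past a failure.

    `start` and `is_healthy` are hypothetical, caller-supplied callbacks
    taking a service name; they stand in for real init scripts and checks.
    """
    started = []
    for service in order:
        start(service)
        if not is_healthy(service):
            raise RuntimeError(
                f"{service} did not come up; already started: {started}"
            )
        started.append(service)
    return started


if __name__ == "__main__":
    # Dry run: print instead of starting anything, and assume every check passes.
    start_in_order(lambda s: print(f"starting {s}"), lambda s: True)
```

Halting at the first unhealthy layer keeps a later service (for example HBase) from being started on top of a dependency that is not yet available.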
>
> > -> Is there anything that we must back up in the event of an outage? (or) let HDFS replication do its magic?
> > ..I'm ok with losing a few days of data ..but not all.
>
> We do incremental backups every hour to an NFS share and to another
> cluster in another datacenter.
>
> >
> > thanks in advance
> > venkatesh
> >
> > -----Original Message-----
> > From: Jean-Daniel Cryans <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: Sat, Jul 10, 2010 1:47 pm
> > Subject: Re: real world usage, any web applications built using hbase?
> >
> >
> > At StumbleUpon, we have su.pr (URL shortener / advertising platform)
> > that's totally based on HBase and has been in production for more than
> > a year. Many other parts of our main product also rely on HBase.
> >
> > J-D
> >
> > On Sat, Jul 10, 2010 at 10:43 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
> >
> >> It's my impression that most people are using NoSQL solutions for things like
> >> statistics logging etc.
> >>
> >> Has anyone built a web application purely in HBase? E.g. an application
> >> like Blogger or Gmail or vBulletin.
> >>
> >> Are these potential candidates for building on top of a NoSQL data store?
> >>