Otis Gospodnetic
2010-01-15, 20:54
Andrew Purtell
2010-01-15, 21:17
Seth Ladd
2010-01-15, 21:22
stack
2010-01-15, 21:49
Ryan Rawson
2010-01-15, 21:50
Jean-Daniel Cryans
2010-01-15, 21:52
Andrew Purtell
2010-01-15, 21:57
Ryan Rawson
2010-01-15, 22:00
Otis Gospodnetic
2010-01-15, 22:23
Ryan Rawson
2010-01-15, 22:27
Otis Gospodnetic
2010-01-15, 22:30
stack
2010-01-15, 22:42
stack
2010-01-15, 22:45
Chris Staszak
2010-01-15, 23:55
Andrew Purtell
2010-01-16, 00:45
Ross Rick
2010-02-04, 20:40
Hubert Chang
2010-02-06, 17:18
Stack
2010-02-06, 22:11
Jonathan Gray
2010-02-07, 01:14
-
HBase on 1 box? how big? Otis Gospodnetic 2010-01-15, 20:54
Hello,

I understand running HBase on a single box is kind of pointless (thanks Andrew Purtell for the reply about numbers of boxes)... but I was wondering what kind of box might one need to host/run the various HBase/Hadoop processes?

Imagine I just need to have "HBase in a box", so to speak. :)

I understand it depends on the volume of data, DB structure, request rates... I don't have those numbers, but say I want HBase to have 100M rows with data from Apache logs and want to run the common web analytics/stats reports on a nightly basis.

* Would an EC2 Large Instance suffice?
-- Large Instance: 7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform

* How about an EC2 Small Instance?
-- Small Instance (Default): 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of local instance storage, 32-bit platform

Thanks,
Otis

P.S. hw specs from http://aws.amazon.com/ec2/#instance
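One rough way to frame the sizing question is a back-of-the-envelope comparison of data volume against the local storage of the two instance types. In the sketch below, only the row count and the instance storage figures come from the message; the average bytes per row and the overhead factor are pure assumptions for illustration, not measurements.

```python
# Back-of-the-envelope sizing sketch.
# ASSUMPTIONS (not from the thread): the per-row sizes and the overhead
# factor are illustrative guesses only.

GB = 10**9

ROWS = 100_000_000                               # from the question: 100M rows
INSTANCE_DISK_GB = {"small": 160, "large": 850}  # from the quoted EC2 specs


def estimated_storage_gb(rows, avg_row_bytes, overhead_factor):
    """Raw data size times a fudge factor for storage overhead, in GB."""
    return rows * avg_row_bytes * overhead_factor / GB


def fits(rows, avg_row_bytes, overhead_factor, disk_gb):
    """True if the estimated footprint is no larger than the given disk."""
    return estimated_storage_gb(rows, avg_row_bytes, overhead_factor) <= disk_gb


if __name__ == "__main__":
    for avg_row_bytes in (200, 500, 1000):       # assumed log-record sizes
        need = estimated_storage_gb(ROWS, avg_row_bytes, overhead_factor=2.0)
        verdict = {name: need <= disk for name, disk in INSTANCE_DISK_GB.items()}
        print(f"{avg_row_bytes} B/row: ~{need:.0f} GB needed, fits: {verdict}")
```

Disk is only one dimension here; memory and CPU needs for the nightly reports are not covered by this estimate.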
-
Re: HBase on 1 box? how big? Andrew Purtell 2010-01-15, 21:17
On that scale, why not use MySQL or Postgres?
"HBase in a box" is like "dynamic equilibrium", or "virtual reality", or "jumbo shrimp"... :-)

- Andy
-
Re: HBase on 1 box? how big? Seth Ladd 2010-01-15, 21:22
I agree. HBase in a box is essentially MySQL. HBase is built for a cluster.
-
Re: HBase on 1 box? how big? stack 2010-01-15, 21:49
On Fri, Jan 15, 2010 at 1:17 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> "HBase in a box" is like "dynamic equilibrium", or "virtual reality", or "jumbo shrimp"... :-)

Andrew, that's funny.
St.Ack
-
Re: HBase on 1 box? how big? Ryan Rawson 2010-01-15, 21:50
You can run HBase on any size of machine, all on a single node. By default, when you start HBase it will store files in /tmp and everything runs in 1 JVM. How much data can you jam in there? I'm not totally sure, probably a lot more than you might think, but again it is limited by the disk. I run it on my Mac laptop, for example.

I have a patch that will allow a single JVM including ZooKeeper, but it is locked up in my private git for now. This would get rid of the need to ssh localhost just to start local HBase.

-ryan
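Because the default standalone setup keeps its files under /tmp, one common tweak is to point the root directory somewhere more durable. The sketch below only generates a minimal hbase-site.xml. The target path /data/hbase is a made-up example, and the property names hbase.rootdir and hbase.cluster.distributed should be checked against the documentation for the HBase version in use.

```python
import xml.etree.ElementTree as ET


def build_hbase_site(properties):
    """Return hbase-site.xml content (as a string) for name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical standalone settings: local filesystem storage outside /tmp.
standalone_props = {
    "hbase.rootdir": "file:///data/hbase",   # assumed example path
    "hbase.cluster.distributed": "false",
}

if __name__ == "__main__":
    print(build_hbase_site(standalone_props))
```

The generated text would then be saved as conf/hbase-site.xml before starting HBase.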
-
Re: HBase on 1 box? how big? Jean-Daniel Cryans 2010-01-15, 21:52
What a tease Ryan! ;)
J-D
-
Re: HBase on 1 box? how big? Andrew Purtell 2010-01-15, 21:57
That would be good for developing disconnected against the API. Any plan on releasing a patch, Ryan?

- Andy
-
Re: HBase on 1 box? how big? Ryan Rawson 2010-01-15, 22:00
Yes, I do plan on releasing a patch, but I need to rebase it to trunk. It moves a class from test -> java (i.e., the ZK-in-JVM startup class).

Maybe soon?
-ryan
-
Re: HBase on 1 box? how big? Otis Gospodnetic 2010-01-15, 22:23
Sounds like a yummy patch, Ryan, if you need another nudge. :)
Otis
-
Re: HBase on 1 box? how big? Ryan Rawson 2010-01-15, 22:27
I hadda resolve like 40 files of conflicts :-/

What I really need, though, is a tool so that start-hbase.sh won't do the 'normal' thing and will just do hbase-daemon.sh start master when running in standalone mode.

-ryan
-
Re: HBase on 1 box? how big? Otis Gospodnetic 2010-01-15, 22:30
Heh, I like the analogies! :)
Yes, it makes no sense to use HBase on 1 box for production data volumes, etc., but this might be handy for development. Or for a demo that needs to consist of the same pieces (daemons, configs, etc.) on 1 box, so that one can easily move it to a proper, big cluster without re-engineering or replacing any of the components.

For example, you may have an app that you want to demo to a customer, and you can't ask them for N boxes for the demo. But you can ask them for 1 box to install something on.

Or maybe you can run everything from a memory stick? ;) Hey, is there a technical reason why one couldn't have all jars, scripts, configs, etc. on a stick, and have the configs point to dirs on the stick for holding data? I'm not joking, really! :)

Thanks,
Otis
-
Re: HBase on 1 box? how big? stack 2010-01-15, 22:42
How about we add a 'standalone' argument to bin/hbase? It'd check hbase-site.xml to see that it has the right basic standalone config, and then it'd pass switches to start everything up in the one JVM?
St.Ack
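The config check proposed for a 'standalone' argument could look roughly like the sketch below. This is not an existing bin/hbase feature: the function names and the rule itself (not distributed, and a root directory that is either unset or on the local filesystem) are hypothetical, chosen only to illustrate the idea.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse hbase-site.xml text into a {name: value} dictionary."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}


def looks_standalone(props):
    """Hypothetical rule for a 'right basic standalone config'."""
    distributed = (props.get("hbase.cluster.distributed") or "false").strip().lower()
    rootdir = (props.get("hbase.rootdir") or "").strip()
    local_fs = rootdir == "" or rootdir.startswith("file:")
    return distributed != "true" and local_fs


SAMPLE_STANDALONE = """<configuration>
  <property><name>hbase.rootdir</name><value>file:///tmp/hbase</value></property>
</configuration>"""

SAMPLE_DISTRIBUTED = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://namenode:9000/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
</configuration>"""

if __name__ == "__main__":
    for label, text in (("standalone", SAMPLE_STANDALONE),
                        ("distributed", SAMPLE_DISTRIBUTED)):
        print(label, looks_standalone(read_site_properties(text)))
```

A launcher could refuse to start everything in one JVM when the check fails, rather than silently ignoring a distributed configuration.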
-
Re: HBase on 1 box? how big? stack 2010-01-15, 22:45
On Fri, Jan 15, 2010 at 2:30 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> For example, you may have an app that you want to demo to a customer, and you can't ask them for N boxes for the demo. But you can ask them for 1 box to install something on.

Can't you do this now? Just do ./bin/start-hbase.sh with the default config? It requires ssh'ing to localhost, but that ain't too hard to set up?

> Or maybe you can run everything from a memory stick? ;) Hey, is there a technical reason why one couldn't have all jars, scripts, configs, etc. on a stick, and have the configs point to dirs on the stick for holding data? I'm not joking, really! :)

You can point HBase to a non-standard location for configs -- see hbase-env.sh -- and the same for logging, so I don't see a reason why you couldn't do hbase-on-a-stick (could go nicely with a few of those jumbo shrimp).
St.Ack
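As a sketch of the hbase-on-a-stick idea, the environment for the launcher scripts could be assembled as below. It assumes the scripts honor the HBASE_CONF_DIR and HBASE_LOG_DIR environment variables (worth verifying in the hbase-env.sh shipped with the version in use). The mount point and directory layout are hypothetical.

```python
import os
import posixpath


def stick_environment(mount_point, base_env=None):
    """Return a copy of the environment with HBase dirs on the stick.

    The hbase/conf and hbase/logs layout is an assumed example.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HBASE_CONF_DIR"] = posixpath.join(mount_point, "hbase", "conf")
    env["HBASE_LOG_DIR"] = posixpath.join(mount_point, "hbase", "logs")
    return env


if __name__ == "__main__":
    env = stick_environment("/media/stick")   # assumed mount point
    print(env["HBASE_CONF_DIR"])
    print(env["HBASE_LOG_DIR"])
    # The resulting env could be passed to the start script, for example:
    # subprocess.run(["/media/stick/hbase/bin/start-hbase.sh"], env=env)
```

For the data itself to live on the stick as well, the root directory in hbase-site.xml would also need to point at a path under the same mount point.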
-
Re: HBase on 1 box? how big? Chris Staszak 2010-01-15, 23:55
+1 for this feature.
I understand some of the questioning along the lines of "why not use PostgreSQL/MySQL" for a data store that just runs on one host. However, the driver for me (and I suspect for a growing number of people) is to write one piece of code that runs at any scale. For some uses a single host/jvm makes perfect sense: development, demos or limited production data size and transaction volume. Furthermore, this could greatly simplify demos or small scale deployments on Windows (removing the ssh requirement).

On Fri, Jan 15, 2010 at 2:42 PM, stack <[EMAIL PROTECTED]> wrote:
> How about we add a 'standalone' argument to bin/hbase? It'd check the
> hbase-site.xml to see it has right standalone basic config. and then it'd
> pass switches to start all up in the one JVM?
> St.Ack
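
To make the proposed check concrete, here is a minimal sketch of what "check the hbase-site.xml to see it has right standalone basic config" might look like. It is an illustration only, not the patch discussed in this thread: the function names and the sample file are hypothetical, and the rule it applies (the `hbase.cluster.distributed` property is not `true`, and `hbase.rootdir` is unset or points at the local filesystem) is an assumption about what "standalone" should mean.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample config, in the usual Hadoop-style <property> layout.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-demo</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return {name: value} for each <property> element in the config text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def looks_standalone(props):
    """Assumed rule: not distributed, and root dir unset or on the local filesystem."""
    distributed = props.get("hbase.cluster.distributed", "false").lower() == "true"
    rootdir = props.get("hbase.rootdir", "")
    on_local_fs = rootdir == "" or rootdir.startswith("file:")
    return (not distributed) and on_local_fs
```

A start script could call `looks_standalone(read_properties(text))` and refuse to launch everything in one JVM when it returns `False`; how the real bin/hbase would wire this in is left open here.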
-
Re: HBase on 1 box? how big?Andrew Purtell 2010-01-16, 00:45
As long as we are all clear about the usefulness of a single host system.
For map-reduce over BigTable, nothing more than development, functional testing, and toy demo scenarios.

- Andy
-
Re: HBase on 1 box? how big?Ross Rick 2010-02-04, 20:40
Allow me to disagree and take a few arrows here. Not picking at what is currently being proposed, but rather with a perception of the future.
In my mind, applications are going to continue to blur the lines from phone to cloud. Users are not normally inclined to accept answers like, "Well this is a different platform, you have to do something completely different". They expect differences between platforms, but the successes of the future will likely smooth those differences rather than accentuate them.

Developers will surely want to use the same storage paradigm for all scales and let the different systems manage their scales. For example, my application uses HSQLDB for the desktop and Oracle/Whatever for Enterprise, but damn near the same model is used.

I would want to use a properly scaled HBase for the desktop, and then, as is appropriate for my app, push that data to a cluster. This process is nearly seamless when the underlying language is the same. Success in the future isn't likely just about how big you can get. Dare I say it, it's probably more like 'rightsizing' your data.

Rick
-
Re: HBase on 1 box? how big?Hubert Chang 2010-02-06, 17:18
Agree with you. One person can deploy the WordPress blog system as a personal site, and a big enterprise can deploy the same WordPress blog system as its enterprise blog platform. But with HBase, you could not develop a WordPress-like product, because it is suited to 5 or more nodes and not to 1 node.
-
Re: HBase on 1 box? how big?Stack 2010-02-06, 22:11
On Thu, Feb 4, 2010 at 12:40 PM, Ross Rick <[EMAIL PROTECTED]> wrote:
> I would want to use a properly scaled HBase for the desktop, and then, as
> is appropriate for my app, push that data to cluster. This process is
> nearly seamless when the underlying language is the same. Success in the
> future isn't likely just about how big you can get. Dare I say it, it's
> probably more like 'rightsizing' your data.

You are right. We should have a better one-box story than we do. Development is generally off at the other end of the scale, on making the multinode installs run smoothly, with a working one-box setup getting only as much attention as it takes to run unit tests and simple loadings. Our focus on anything other than a single box, or even two to three nodes, has probably hurt us over time, since that's what noobs start on. If they don't get a good feeling running on a small cluster, why would they expect it to be different when they move beyond that?

It's my sense that if someone was up for working on our one-box story, they'd only get encouragement.

Thanks for writing,
St.Ack
-
RE: HBase on 1 box? how big?Jonathan Gray 2010-02-07, 01:14
A bit late to the party but my two cents...
I am currently using a single node HBase instance in production (beta) for a client. The use case is simply to add random access capabilities atop some large HDFS files. It's static data (rebuilt every few weeks) and close to 1TB or so (with plans to be more than 10X that within months).

Attempts at loading it into simpler KV stores or MySQL proved to be very time consuming. Instead I simply converted from the existing MapFiles into HFiles using HFileOutputFormat, and am serving it using a single node instance of HBase.

There is no attempt at high availability, obviously. The lookups are fast enough (slowest is 10s of ms), there is no significant concurrency (10 req/sec at the high end) so this is not a concern right now, I can rebuild the entire DB in a few minutes, and hot data gets cached via the LRU block cache. It's also ready to scale out as necessary and gives us more capacity than we would ever need just by adding more nodes.

But be warned... Not only can this kind of setup not give you high availability, if you aren't careful, you'll get quite low availability. The kinds of shops that might run a single node of HBase might also be sharing that node with other processes/services. Be careful not to cause CPU/IO starvation as GC pauses, ZK timeouts, etc... can take down a single node of HBase rather easily.

I'm not sure single nodes of HBase make much sense for MapReduce/analytics workloads. The reason it works well in this situation is the concurrency is very low and they are fully random, single key reads. HBase is really just adding a small layer above HDFS, acting as an indexed HFile reader and block cacher. Streaming data to/from HBase will always be less efficient than to/from HDFS.

JG
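
For anyone unfamiliar with the "hot data gets cached via the LRU block cache" part, the idea can be sketched in a few lines. This is a toy illustration of least-recently-used eviction bounded by total bytes, not HBase's actual block cache: the class name `ToyBlockCache`, the string block ids, and the byte budget are all made up for the example.

```python
from collections import OrderedDict


class ToyBlockCache:
    """Toy LRU cache of byte blocks, bounded by total size (not HBase's class)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._blocks = OrderedDict()  # block_id -> bytes, least recently used first

    def get(self, block_id):
        """Return the cached block or None, marking a hit as most recently used."""
        block = self._blocks.get(block_id)
        if block is not None:
            self._blocks.move_to_end(block_id)
        return block

    def put(self, block_id, block):
        """Insert or replace a block, then evict cold blocks until within budget."""
        if block_id in self._blocks:
            self.used_bytes -= len(self._blocks.pop(block_id))
        self._blocks[block_id] = block
        self.used_bytes += len(block)
        while self.used_bytes > self.capacity_bytes and self._blocks:
            _, evicted = self._blocks.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = ToyBlockCache(capacity_bytes=8)
cache.put("a", b"1234")
cache.put("b", b"5678")
cache.get("a")           # touch "a" so that "b" becomes the coldest block
cache.put("c", b"90ab")  # over budget, so the coldest block is evicted
```

With a read pattern like the one described above (low concurrency, random single-key reads), blocks that are touched repeatedly stay resident while rarely read ones fall out, which is why the cache helps even on one node.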