HBase >> mail # user >> Acceptable CPU_WIO % ?


Jean-Marc Spaggiari 2013-02-08, 01:19
Kevin Odell 2013-02-08, 01:43
Jean-Marc Spaggiari 2013-02-08, 02:00
Kevin Odell 2013-02-08, 02:21
Jean-Marc Spaggiari 2013-02-08, 02:50
Kevin Odell 2013-02-08, 02:57
Jean-Marc Spaggiari 2013-02-08, 03:15
Re: Acceptable CPU_WIO % ?
JM,

I don't have the full context, but if you are using Hadoop/HBase, don't use
RAID on your disks.
On Fri, Feb 8, 2013 at 11:15 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Ok. I see. For my usecase I prefer to lose the data and have faster
> processing. So I will go for RAID0 and keep the replication factor at
> 3... If at some point I have 5 disks in the node, I will most probably
> try RAID5 and compare the performance against the other RAID/JBOD
> options.
>
> Is there a "rule", like 1 HD per core? Or can't we really simplify it
> that much?
>
> So far I have that in the sar output:
> 21:35:03          tps      rtps      wtps   bread/s   bwrtn/s
> 21:45:03       218,85    215,97      2,88  45441,95    308,04
> 21:55:02       209,73    206,67      3,06  43985,28    378,32
> 22:05:04       215,03    211,71      3,33  44831,00    312,95
> Average :      214,54    211,45      3,09  44753,09    333,07
>
> But I'm not sure what it means. I will wait until tomorrow to get more
> results, but my job will run overnight, so I'm not sure the average
> will be accurate...
>
> JM
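To interpret the Average row above: sar's bread/s column counts 512-byte blocks (sectors), so the figure converts to bytes with a one-liner. This is just arithmetic on the number already pasted, nothing node-specific:

```shell
# sar reports bread/s in 512-byte blocks (sectors).
# Convert the Average bread/s above (~44753 blocks/s) to MB/s:
echo "44753.09" | awk '{ printf "%.1f MB/s\n", $1 * 512 / 1024 / 1024 }'
# prints "21.9 MB/s"
```

That is almost all reads and essentially no writes, which lines up with the ~22.83M read_bytes_per_sec Ganglia figure mentioned in the thread.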
>
>
> 2013/2/7, Kevin O'dell <[EMAIL PROTECTED]>:
> > JM,
> >
> >   I think you misunderstood me.  I am not advocating any form of RAID for
> > Hadoop.  It is true that we already have redundancy built in with HDFS.
> > So unless you were going to do something silly like sacrifice speed to
> > run RAID1 or RAID5 and lower your replication to 2... just don't do it :)
> > Anyway, yes, you should probably have 3 - 4 drives per node, if not more.
> > At that point you will really see the benefit of JBOD over RAID0.
> >
> > Do you want to be able to lose a drive and keep the node up?  If yes,
> > then JBOD is for you.  Do you not care if you lose that node due to a
> > drive failure and just need speed?  Then RAID0 may be the correct
> > choice.  Sar will take some time to populate.  Give it about 24 hours
> > and you should be able to glean some interesting information.
> >
> > On Thu, Feb 7, 2013 at 9:50 PM, Jean-Marc Spaggiari
> > <[EMAIL PROTECTED]> wrote:
> >
> >> Ok, I see why RAID0 might be better for me compared to JBOD. Also, why
> >> would we want to use RAID1 or RAID5? We already have the redundancy
> >> handled by Hadoop; isn't that going to add another non-required level
> >> of redundancy?
> >>
> >> Should I already plan to have 3 or even 4 drives in each node?
> >>
> >> I tried sar -A and it's only giving me 2 lines.
> >> root@node7:/home/hbase# sar -A
> >> Linux 3.2.0-4-amd64 (node7)     2013-02-07      _x86_64_        (4 CPU)
> >>
> >> 21:29:54          LINUX RESTART
> >>
> >> It was not enabled, so I just enabled it and restarted sysstat, but it
> >> seems that it's still not populated.
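For the record, on Debian-family systems (which the `3.2.0-4-amd64` kernel string above suggests) the sadc collector is disabled by default in /etc/default/sysstat; enabling it and restarting the service is all that's needed. A sketch, shown here as a pipe on a copy of the line rather than an in-place edit, so it is side-effect free:

```shell
# Debian ships sysstat with data collection off: ENABLED="false"
# in /etc/default/sysstat. The real fix (requires root) is:
#   sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
#   service sysstat restart
# Demonstrated on a stand-in line so the sketch is safe to run anywhere:
echo 'ENABLED="false"' | sed 's/ENABLED="false"/ENABLED="true"/'
```

Note that data only appears after the next cron sample (every 10 minutes by default), which matches the "still not populated" observation here.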
> >>
> >> I have the diskstats plugin installed on Ganglia, so I have a LOT of
> >> disk information, but not this specific one.
> >>
> >> My write_bytes_per_sec is pretty low. The average is 232K for the last
> >> 2 hours. But my read_bytes_per_sec averages 22.83M for the same
> >> period. The graph looks like a comb.
> >>
> >> I just retried sar and some data is coming in. I will need to let it
> >> run for a few more minutes to get some more data...
> >>
> >> JM
> >>
> >>
> >> 2013/2/7, Kevin O'dell <[EMAIL PROTECTED]>:
> >> > JM,
> >> >
> >> >   Okay, I think I see what was happening.  You currently have only
> >> > one drive in the system, and it is showing high I/O wait, correct?
> >> > You are looking at bringing in a second drive to help distribute the
> >> > load?  In your testing with two drives you saw that RAID0 offered
> >> > superior performance vs JBOD.  Typically when we compare RAID vs
> >> > JBOD we are dealing with about 6 - 12 drives.  Here are some of the
> >> > pluses and minuses:
> >> >
> >> > RAID0 - faster performance since the data is striped, but you are
> >> > only as fast as your slowest drive, and with one drive failure you
> >> > lose the whole volume.
> >> >
> >> > JBOD - Better redundancy and faster than a RAID1, or a RAID5
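To make the JBOD option concrete: each physical disk gets its own mount point, and the DataNode is given the full comma-separated list, so losing one disk costs only that directory rather than a whole striped volume. A minimal sketch; the paths and device names are illustrative, and the property is `dfs.datanode.data.dir` in Hadoop 2.x (it was `dfs.data.dir` in the 1.x line current when this thread was written):

```shell
# Hypothetical JBOD layout: one mount point per disk, e.g.
#   mount /dev/sdb1 /data/1 ; mount /dev/sdc1 /data/2 ; mount /dev/sdd1 /data/3
# hdfs-site.xml then lists every directory; the DataNode round-robins
# block writes across them.
cat > hdfs-site-snippet.xml <<'EOF'
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
EOF
grep -o '/data/[0-9]' hdfs-site-snippet.xml | wc -l   # one entry per disk
```

With RAID0, by contrast, the same three disks would appear as a single directory, and any one disk failing takes the whole volume (and the node's data) with it.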
Kevin Odell 2013-02-08, 13:56
Jean-Marc Spaggiari 2013-02-08, 15:43
Kevin Odell 2013-02-08, 16:37
Jean-Marc Spaggiari 2013-02-09, 16:13