HBase >> mail # user >> recommended nodes

Re: recommended nodes
Finally, it took me a while to run those tests because they took far
longer than expected, but here are the results:


LVM is not really slower than JBOD and doesn't really use more CPU. So
I would say, if you have to choose between the two, take the one you
prefer. Personally, I prefer LVM because it's easy to configure.

The big winner here is RAID0. It's WAY faster than anything else. But
it's using twice the space... Your choice.

I did not get a chance to test with the Ubuntu tool because it doesn't
work with LVM drives.
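For reference, the kind of sequential-throughput comparison above can be sketched with dd. This is a minimal sketch, not the exact benchmark used here; the mount point is a placeholder you'd point at each LVM, JBOD, or RAID0 volume in turn:

```shell
# Sequential write/read throughput sketch for one mount point.
# TESTDIR is a hypothetical path; substitute the volume under test.
TESTDIR=${TESTDIR:-/mnt/data1}
FILE="$TESTDIR/bench.tmp"

# Sequential write: 1 GB of zeros, fsync'd so the number is honest.
dd if=/dev/zero of="$FILE" bs=1M count=1024 conv=fsync

# Drop the page cache (needs root) so the read actually hits the disk.
sync && echo 3 > /proc/sys/vm/drop_caches

# Sequential read.
dd if="$FILE" of=/dev/null bs=1M

rm -f "$FILE"
```

Running it a few times per volume and averaging smooths out variance between runs.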


2012/11/28, Michael Segel <[EMAIL PROTECTED]>:
> Ok, just a caveat.
> I am discussing MapR as part of a complete response. As Mohit posted MapR
> takes the raw device for their MapR File System.
> They do stripe on their own within what they call a volume.
> But going back to Apache...
> You can stripe drives; however, I wouldn't recommend it. I don't think the
> performance gains would really matter.
> You're going to end up getting blocked first by disk I/O, then your
> controller card, then your network... assuming 10GbE.
> With only 2 disks on an 8-core system, you will hit disk I/O first and then
> you'll watch your CPU I/O wait climb.
> -Mike
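The climbing CPU I/O wait Mike describes can be observed directly on a Linux node. A minimal sketch, reading the raw counter from /proc/stat (for the per-device view, `iostat -x 5` from the sysstat package is the usual tool):

```shell
# Field 6 of the "cpu" line in /proc/stat is cumulative iowait jiffies:
# time the CPUs sat idle while a disk request was outstanding. A delta
# that grows quickly between samples is the disk-bound pattern above.
read -r _label user nice system idle iowait _rest < /proc/stat
echo "iowait jiffies since boot: $iowait"
```

Sampling it twice a few seconds apart and diffing gives the iowait spent in that window.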
> On Nov 28, 2012, at 7:28 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> wrote:
>> Hi Mike,
>> Why not use LVM with MapR? Since LVM is reading from 2 drives almost
>> at the same time, it should be better than RAID0 or a single drive,
>> no?
>> 2012/11/28, Michael Segel <[EMAIL PROTECTED]>:
>>> Just a couple of things.
>>> I'm neutral on the use of LVMs. Some would point out that there's some
>>> overhead, but on the flip side, it can make managing the machines
>>> easier.
>>> If you're using MapR, you don't want to use LVMs but raw devices.
>>> In terms of GC, it's going to depend on the heap size and not the total
>>> memory. With respect to HBase... MSLAB is the way to go.
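The MSLAB feature mentioned above is controlled through hbase-site.xml. A minimal sketch of the relevant properties (the chunk size shown is the usual 2 MB default, included here only to make the knob visible):

```xml
<!-- hbase-site.xml: MemStore-Local Allocation Buffers reduce old-gen
     heap fragmentation and thus long GC pauses on region servers. -->
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.hregion.memstore.mslab.chunksize</name>
  <value>2097152</value>
</property>
```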
>>> On Nov 28, 2012, at 12:05 PM, Jean-Marc Spaggiari
>>> wrote:
>>>> Hi Gregory,
>>>> I found this about LVM:
>>>> -> http://blog.andrew.net.au/2006/08/09
>>>> ->
>>>> http://www.phoronix.com/scan.php?page=article&item=fedora_15_lvm&num=2
>>>> Seems that performance is still decent with it. I will most
>>>> probably give it a try and bench that too... I have one new hard drive
>>>> which should arrive tomorrow. Perfect timing ;)
>>>> JM
>>>> 2012/11/28, Mohit Anchlia <[EMAIL PROTECTED]>:
>>>>> On Nov 28, 2012, at 9:07 AM, Adrien Mogenet <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>>> Does HBase really benefit from 64 GB of RAM, since allocating too
>>>>>> large a heap might increase GC time?
>>>>> The benefit you get is from the OS cache
>>>>>> Another question: why not RAID 0, in order to aggregate disk
>>>>>> bandwidth? (and thus keep the 3x replication factor)
>>>>>> On Wed, Nov 28, 2012 at 5:58 PM, Michael Segel
>>>>>> <[EMAIL PROTECTED]>wrote:
>>>>>>> Sorry,
>>>>>>> I need to clarify.
>>>>>>> 4GB per physical core is a good starting point.
>>>>>>> So with 2 quad core chips, that is going to be 32GB.
>>>>>>> IMHO that's a minimum. If you go with HBase, you will want more.
>>>>>>> (Actually
>>>>>>> you will need more.) The next logical jump would be to 48 or 64GB.
>>>>>>> If we start to price out memory, depending on the vendor and your
>>>>>>> company's procurement, there really isn't much of a price difference
>>>>>>> between 32, 48, or 64 GB.
>>>>>>> Note that it also depends on the chips themselves. Also you need to
>>>>>>> see how many memory channels exist on the motherboard. You may need
>>>>>>> to buy in pairs or triplets. Your hardware vendor can help you. (Also
>>>>>>> you need to keep an eye on your hardware vendor. Sometimes they will
>>>>>>> give you higher