Glad to have you aboard. One thing to look at is the maximum number of map
and reduce slots you currently allow. Typically, we look at the CPU
architecture: if it is not HT (hyperthreaded), we use 1:1 (one MR slot per
physical core); if it is using HT, 1:1.5. A dual quad core without HT could
run 8 total MR slots, but since you have HBase you should leave it a couple
of cores. This means only using 6 MR slots. A dual quad core with HT has
16 logical cores, so you could use 12 MR slots, but since you have HBase
you want to leave a couple of cores. This means only using 9 or 10 slots for
MR. This can help with some of the pressure from using MR/hive/pig on the
As for separating MR and HBase: you could split your processes so that
TaskTrackers (TT) run on some nodes and RegionServers (RS) run on others,
but typically people set up two separate clusters.
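
The slot arithmetic from the first paragraph can be sketched in a few lines
of Python. This is only a rough illustration of the rule of thumb, not a
tuning tool: the function name mr_slots and the reserved_for_hbase parameter
are made up for this example, and the default of 2 reserved cores is just one
reading of "a couple".

```python
def mr_slots(physical_cores, hyperthreaded, reserved_for_hbase=2):
    """Rule-of-thumb MR slot count for one node.

    Without HT: 1 slot per physical core. With HT: 1.5 slots per
    physical core. Then hold back a few for HBase (assumed default: 2).
    """
    ratio = 1.5 if hyperthreaded else 1.0
    total = int(physical_cores * ratio)
    return max(total - reserved_for_hbase, 0)


# Dual quad core = 8 physical cores.
print(mr_slots(8, hyperthreaded=False))                       # 6
print(mr_slots(8, hyperthreaded=True))                        # 10
print(mr_slots(8, hyperthreaded=True, reserved_for_hbase=3))  # 9
```

On an MRv1 setup the resulting total would then be divided between
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml; how you split it
between map and reduce depends on your workload.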
On Thu, Jan 17, 2013 at 12:24 PM, Chalcy Raja <[EMAIL PROTECTED]
> Hi HBASE Gurus,
> I am Chalcy Raja and I joined the hbase group yesterday. I am already a
> member of hive and sqoop user groups. Looking forward to learning and
> sharing information about hbase here!
> I have a question: we have a cluster where we run hive jobs and also hbase.
> There are stability issues, like region servers just dying. We are looking
> into fine tuning. From what I have read about performance, and from what
> another user told me, the advice is to separate mapreduce from hbase. How
> do I do that? If that means running tasktrackers on some nodes and hbase
> region servers on others, then we will run into data locality issues and I
> believe it will perform poorly.
> Definitely I am not the only one running into this issue. Any thoughts on
> how to resolve this issue?
Customer Operations Engineer, Cloudera