-
HBase vs Hadoop memory configuration.Jean-Marc Spaggiari 2013-01-27, 14:28
Hi,
I saw in another message that Hadoop only needs 1GB...

Today, I have configured my nodes with 45% of memory for HBase and 45% for Hadoop. The last 10% is for the OS.

Should I change that to 1GB for Hadoop, 10% for the OS and the rest for HBase? Even if running MR jobs?

Thanks,

JM
-
Re: HBase vs Hadoop memory configuration.Kevin O'dell 2013-01-27, 15:03
Hey JM,
I suspect they are referring to the DN process only. It is important in these discussions to talk about individual component memory usage. In my experience most HBase clusters only need 1 - 2 GB of heap space for the DN process. I am not a MapReduce expert, but typically the actual TT process only needs 1GB of memory; you then control everything else through max slots and child heap.

What is your current block count per DN?

--
Kevin O'Dell
Customer Operations Engineer, Cloudera
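To make the point about max slots and child heap concrete, here is a minimal sketch of the worst-case heap that MapReduce could claim on one node. The function name and the sample values (4 map slots, 2 reduce slots, 512 MB per child) are assumptions for illustration and do not come from this thread.

```shell
# Hypothetical helper: upper bound on MapReduce heap for one node, in MB.
# Arguments: $1 = TT heap (MB), $2 = map slots, $3 = reduce slots,
#            $4 = child heap (MB).
mr_memory_mb() {
  echo $(( $1 + ($2 + $3) * $4 ))
}

# Assumed example: 1 GB TaskTracker, 4 map + 2 reduce slots, 512 MB per child.
mr_memory_mb 1024 4 2 512   # prints 4096
```

In Hadoop 1.x the slot counts would come from mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, and the child heap from the -Xmx value in mapred.child.java.opts. The figure covers heap only; each JVM also needs some non-heap memory on top.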
-
Re: HBase vs Hadoop memory configuration.Jean-Marc Spaggiari 2013-01-27, 15:43
Hi Kevin,
What do you mean by "current block count per DN"? I kept the standard settings.

fsck is telling me that I have 10893 total blocks. Since I have 8 nodes, that gives me about 1361 blocks per node.

Is that what you are asking?

JM
-
Re: HBase vs Hadoop memory configuration.Kevin O'dell 2013-01-27, 16:16
JM,
That is probably correct. You can check the NN UI and confirm that number, but it doesn't seem too far off for an HBase cluster. You will be fine with just 1GB of heap for the DN with a block count that low. Typically you don't need to raise the heap until you are looking at a couple hundred thousand blocks per DN.
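One caveat on the arithmetic: as far as I know, fsck reports unique blocks, while each DataNode stores block replicas. The sketch below redoes the per-node estimate with a replication factor, assuming the HDFS default of 3 (the thread does not state the actual value). The function name is hypothetical.

```shell
# Hypothetical helper: average block replicas held by each DataNode.
# Arguments: $1 = total blocks from fsck, $2 = replication factor,
#            $3 = number of DataNodes. Integer division rounds down.
replicas_per_dn() {
  echo $(( $1 * $2 / $3 ))
}

replicas_per_dn 10893 1 8   # replication ignored: prints 1361
replicas_per_dn 10893 3 8   # assumed replication of 3: prints 4084
```

Either way the count stays far below the couple hundred thousand blocks per DN mentioned above.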
-
Re: HBase vs Hadoop memory configuration.Jean-Marc Spaggiari 2013-01-27, 16:33
From the UI:
15790 files and directories, 11292 blocks = 27082 total. Heap Size is 179.12 MB / 910.25 MB (19%)

I'm setting the memory in the hadoop-env.sh file using:
export HADOOP_HEAPSIZE=1024

I think that's fine for the datanodes, but does it also mean that each task tracker, job tracker and name node will take 1GB? So 2GB to 4GB on each server? (1 NN+JT+DN+TT and 7 DN+TT) Or will it be 1GB in total?

And if we say 1GB for the DN, how much should we reserve for the other daemons? I want to make sure I give the maximum I can give to HBase without starving Hadoop...

JM
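For context on the question above: each Hadoop daemon runs in its own JVM, and HADOOP_HEAPSIZE sets the maximum heap of each one rather than a shared total. The sketch below adds up that ceiling for the two server layouts described in the message. The function name is hypothetical.

```shell
# Hypothetical helper: combined maximum heap of the Hadoop daemons on one
# server, in MB. Arguments: $1 = HADOOP_HEAPSIZE (MB), $2 = daemon count.
max_heap_total_mb() {
  echo $(( $1 * $2 ))
}

max_heap_total_mb 1024 4   # NameNode + JobTracker + DataNode + TaskTracker: prints 4096
max_heap_total_mb 1024 2   # DataNode + TaskTracker: prints 2048
```

These are upper bounds, not actual usage: the UI figure above shows a daemon using about 179 MB of its heap. Task child JVMs are separate processes and are not included.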
-
Re: HBase vs Hadoop memory configuration.Kevin O'dell 2013-01-28, 14:45
JM,
You would control those through hadoop-env.sh, using HADOOP_JOBTRACKER_OPTS and HADOOP_TASKTRACKER_OPTS and setting -Xmx to the desired heap.
-
Re: HBase vs Hadoop memory configuration.karunakar 2013-01-29, 01:46
Hi Jean,
AFAIK, the namenode can handle about 1 million blocks per 1GB of namenode heap. How much data that represents depends on the configured block size: dfs.block.size * 1 million blocks = 128 TB of data [assuming dfs.block.size is set to 128 MB].

Setting HADOOP_HEAPSIZE, for example export HADOOP_HEAPSIZE=2048 (the value is in MB), changes the heap for all the daemons. Rather than using that, use the configurations below for individual daemons.

You can give the namenode, datanode, jobtracker and tasktracker a 2 GB heap each by adding the following lines to hadoop-env.sh:

export HADOOP_NAMENODE_OPTS="-Xmx2g"
export HADOOP_DATANODE_OPTS="-Xmx2g"
export HADOOP_JOBTRACKER_OPTS="-Xmx2g"
export HADOOP_TASKTRACKER_OPTS="-Xmx2g"

Ex: If you have a 16 GB server focused mainly on HBase, running datanode, tasktracker and regionserver on one node, then give 4 GB to the datanode, 2-3 GB to the tasktracker [including the child JVMs] and 6-8 GB to the regionserver.

Thanks,
karunakar.
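As a quick sanity check on the 16 GB example, the sketch below subtracts the suggested allocations from the total to show what is left for the OS. The function name is hypothetical; the numbers are the low and high ends of the ranges given above.

```shell
# Hypothetical helper: memory left after the listed allocations, in GB.
# Arguments: $1 = total RAM (GB), then one value per allocation (GB).
# The body runs in a subshell so its variables do not leak to the caller.
remaining_gb() (
  left=$1
  shift
  for gb in "$@"; do
    left=$(( left - gb ))
  done
  echo "$left"
)

remaining_gb 16 4 2 6   # low end of each range: prints 4
remaining_gb 16 4 3 8   # high end of each range: prints 1
```

At the high end only 1 GB remains, which is less than the 10% (1.6 GB on this server) reserved for the OS earlier in the thread.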
-
Re: HBase vs Hadoop memory configuration.Jean-Marc Spaggiari 2013-01-29, 20:26
Thanks all for this information.
I have tried to adjust my settings to make sure the memory is used efficiently.

JM