-
hbase insert performance test (from hbasemaster and regionservers)
Faruk Berksöz 2012-05-21, 19:04
Dear All, we have 4 nodes in our cluster (1 NN+DN, 3 DN). The Hadoop distribution is cdh3u3. Every node has a 2 TB disk and 8 GB of memory. We are running some insert performance tests on HBase. I inserted 250,000 records from the HBase master (without threads), which takes 5-7 sec. But when I insert the same data (250,000 records) from any regionserver, it takes longer: 21 sec. Is that normal?
Any response would be appreciated.

My Java code looks like this:

    .........
    .........
    .........
    long start = System.currentTimeMillis();
    //DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
    //Date sdate = new Date();

    //LOG.info(" Start-Time :" + dateFormat.format(sdate));
    for (int i = 0; i < eachsize; i++) {
        put = new Put(String.format("%016x", random.nextLong()).getBytes());

        put.setWriteToWAL(false);
        //set column and their values
        addColumnAndValues();
        table.put(put);
    }
    elapsedTimeMillis = System.currentTimeMillis() - start;
    elapsedTimeSec = elapsedTimeMillis / 1000F;
    elapsedTimeSecInMemory = elapsedTimeSec;
    LOG.info(" Elapsed-Time (sec) inMemory :" + elapsedTimeSec);

    table.flushCommits();
    //disk write elapsed time
    elapsedTimeMillis = System.currentTimeMillis() - start;
    elapsedTimeSec = elapsedTimeMillis / 1000F;
    elapsedTimeSecDiskWrite = elapsedTimeSec;
    LOG.info(" Elapsed-Time (sec) Disk Write:" + elapsedTimeSec);
    .........
    .........
My hbase-site.xml looks like this:

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master.bigdata.com:54310/hbase</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>master.bigdata.com,slave1.bigdata.com,slave2.bigdata.com,slave3.bigdata.com</value>
      </property>
      <property>
        <name>hbase.zookeeper.dns.interface</name>
        <value>eth0</value>
      </property>
      <property>
        <name>hbase.zookeeper.dns.nameserver</name>
        <value>10.10.10.1</value>
      </property>

      <property>
        <name>hbase.regionserver.handler.count</name>
        <value>20</value>
      </property>

      <property>
        <name>hbase.client.write.buffer</name>
        <value>5097152</value>
      </property>
    </configuration>
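
One thing worth checking in the timing code above: the "inMemory" figure may not be purely in-memory. The following is a minimal sketch, not HBase code, and it rests on two assumptions: that autoflush is disabled on the table (the explicit flushCommits() call suggests so), and that the client sends its buffer to the servers whenever the buffered size exceeds hbase.client.write.buffer. The 200-byte size per Put is a hypothetical figure chosen only for illustration, and the class and method names are made up.

```java
public class WriteBufferSketch {

    /**
     * Simulates a client-side write buffer and counts how often it would be
     * flushed automatically inside the put loop, before any explicit flush.
     * bytesPerPut is a hypothetical, fixed size per buffered Put.
     */
    static int countAutomaticFlushes(int puts, long bytesPerPut, long bufferBytes) {
        long buffered = 0;
        int flushes = 0;
        for (int i = 0; i < puts; i++) {
            buffered += bytesPerPut;
            // Assumed rule: flush once the buffered size exceeds the limit.
            if (buffered > bufferBytes) {
                flushes++;
                buffered = 0;
            }
        }
        return flushes;
    }

    public static void main(String[] args) {
        long bufferBytes = 5097152L; // hbase.client.write.buffer from the post
        int puts = 250000;           // record count from the post
        long bytesPerPut = 200L;     // hypothetical size, not from the post

        int flushes = countAutomaticFlushes(puts, bytesPerPut, bufferBytes);
        System.out.println("Automatic flushes inside the loop: " + flushes);
    }
}
```

Under these assumptions the loop itself would trigger several network flushes, so the time logged before table.flushCommits() would already include round trips to the region servers, and only the last partial buffer would be sent by the explicit flush.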
-
Re: hbase insert performance test (from hbasemaster and regionservers)
Michael Segel 2012-05-21, 22:30
Hi,
Seems we just had someone talk about this just the other day...
1) 8GB of memory isn't enough to run both M/R and HBase. Ok, yes you can run it, however don't expect it to perform well.
2) You never want a user to run their own code from the cluster itself. Use an *edge* node.
There's more, but you get the idea.
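
To put the timings from the original post on a common scale, the sketch below converts them to records per second. The inputs (250,000 records; 5 to 7 sec from the master; 21 sec from a regionserver) are the figures reported in the post; the class and method names are made up for illustration.

```java
public class InsertThroughput {

    /** Returns the average insert rate in records per second. */
    static double recordsPerSecond(long records, double seconds) {
        return records / seconds;
    }

    public static void main(String[] args) {
        long records = 250000L;
        // Timings reported in the original post.
        double[] seconds = {5.0, 7.0, 21.0};
        for (double s : seconds) {
            System.out.printf("%.0f sec -> %d records/sec%n",
                    s, Math.round(recordsPerSecond(records, s)));
        }
    }
}
```

This works out to roughly 36,000 to 50,000 records per second from the master versus about 12,000 from a regionserver, a gap of about three to four times.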
-
Re: hbase insert performance test (from hbasemaster and regionservers)
Tom Brown 2012-05-21, 22:44
Michael,
This is good info. I wish you'd post what the "more" is, though.
--Tom
-
Re: hbase insert performance test (from hbasemaster and regionservers)
Michael Segel 2012-05-21, 22:48
I would love to, but day job is still keeping me busy. :-)
I think you can google up the other threads on this... Andrew wrote up a bit more.
Sorry,
-Mike
-
Re: hbase insert performance test (from hbasemaster and regionservers)
Faruk Berksöz 2012-05-22, 20:21
Hello Michael ,
Thank you for the hint.