Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> Strange behavior on scan while writing


Copy link to this message
-
Re: Strange behavior on scan while writing
On Wed, Oct 5, 2011 at 12:35 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
> I tried it a bunch of times, even trying it on a table that was splitting
> all the time, and I wasn't able to generate hit that situation.

I've tested it again in another computer in local, with a just
downloaded package:

1. download from
http://apache.rediris.es//hbase/hbase-0.90.4/hbase-0.90.4.tar.gz
2. unpackaged
3. in one terminal ran "bin/hbase master start"
4. created the "count_testtable" file
5. in another terminal ran "OLDV=0; OLDT=0; while true; do
NEWV=`./hbase shell count_testtable | head -1 | cut -d' ' -f1`;
NEWT=`date +%s`; echo $NEWV " -> " $(((NEWV - OLDV) / (NEWT - OLDT)))
"msg/s"; OLDV=$NEWV; OLDT=$NEWT; done"
6. in another terminal ran "nice -n 19 ./hbase
org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=20000
randomWrite 3" followed by "nice -n 19 ./hbase
org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=20000
randomWrite 3"

Result:

0  ->  0 msg/s
37927  ->  7585 msg/s
37927  ->  0 msg/s
37927  ->  0 msg/s
37195  ->  -146 msg/s
60000  ->  4561 msg/s
60000  ->  0 msg/s

Previous tests where done in a debian/lenny with a custom kernel. The
last one was done in an ubuntu/natty with stock kernel.

As you can see I had to play with the values of rows/threads and had
to renice the writings, because they were starving the scan loop.

> One way was able to do it is by truncating the table, the count was abruptly
> hitting the end of the table since it became empty all of a sudden :)
>
> J-D
>
> On Mon, Oct 3, 2011 at 3:52 AM, Placido Revilla <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > we are experiencing a strange behavior in some tests we are currently
> > performing. What we are seeing is that scans on a table that is being
> > written to at the same time sometimes end prematurely, with no error. This
> > seems to be heavily dependent on the write pattern.
> >
> > We've been able to reproduce the issue with the standard hbase tools:
> >
> > in a terminal run:
> >
> > OLDV=0; OLDT=0; while true; do NEWV=`hbase shell count_testtable | head -1
> > | cut -d' ' -f1`; NEWT=`date +%s`; echo $NEWV " -> " $(((NEWV - OLDV) /
> > (NEWT - OLDT))) "msg/s"; OLDV=$NEWV; OLDT=$NEWT; done
> >
> > where the contents of the file count_testtable are:
> >
> > count 'TestTable', INTERVAL => 100000000, CACHE => 10000
> > exit
> >
> > This counts the rows in the TestTable repeatedly showing the number of rows
> > and the delta rows per second inserted. In the hbase shell count is
> > implemented as a full scan with a filter on the row key.
> >
> > Meanwhile, in another terminal do:
> >
> > hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=20000
> > randomWrite 5
> >
> > and when over:
> >
> > hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=20000
> > sequentialWrite 5
> >
> > On the scan terminal we are seeing results similar to:
> >
> > 0  ->  0 msg/s
> > 45552  ->  5694 msg/s <=== randomWrite starts
> > 63284  ->  2955 msg/s
> > 63284  ->  0 msg/s  <=== randomWrite ends
> > 58829  ->  -636 msg/s <=== sequentialWrite starts
> > 88764  ->  3741 msg/s
> > 100000  ->  802 msg/s
> > 100000  ->  0 msg/s <=== sequentialWrite ends
> >
> > As you can see in the fifth row the count is lower than expected (resulting
> > on a negative inserts/sec).
> >
> > You may need to try a couple of times or tweak the number of rows to insert
> > to see the problem.
> >
> > hbase version: 0.90.4 (tried on a standalone and a full distributed
> > deployment).
> >
> > We think there must be an error somewhere or something we don't understand
> > is slipping by us.
> >
> > Thanks.
> >
> > --
> > [image: Tuenti] <http://www.tuenti.com/>
> > Plácido Revilla
> > Senior Backend Engineer
> > [EMAIL PROTECTED]
> > TUENTI TECHNOLOGIES S.L. ©
> > PZA. DE LAS CORTES 2, 4ª PLANTA | 28014 MADRID
> >
> >

--
Plácido Revilla
Senior Backend Engineer
[EMAIL PROTECTED]
TUENTI TECHNOLOGIES S.L. ©
PZA. DE LAS CORTES 2, 4ª PLANTA | 28014 MADRID