HBase, mail # user - Strange behavior on scan while writing


Strange behavior on scan while writing
Placido Revilla 2011-10-03, 10:52
Hi,

We are seeing strange behavior in some tests we are currently running:
scans on a table that is being written to at the same time sometimes end
prematurely, with no error. This seems to depend heavily on the write
pattern.

We've been able to reproduce the issue with the standard hbase tools:

In a terminal, run:

OLDV=0; OLDT=0
while true; do
  NEWV=`hbase shell count_testtable | head -1 | cut -d' ' -f1`; NEWT=`date +%s`
  echo $NEWV " -> " $(((NEWV - OLDV) / (NEWT - OLDT))) "msg/s"
  OLDV=$NEWV; OLDT=$NEWT
done
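
The loop above divides by the elapsed time in whole seconds, so two counts that complete within the same second would divide by zero. A minimal sketch of the same rate computation with that case guarded follows; the function name `rate` and the sample arguments are hypothetical and not part of any hbase tool.

```shell
# rate NEWV OLDV NEWT OLDT
# Prints the integer rows-per-second rate between two counts.
# Prints 0 when no whole second has elapsed, instead of dividing by zero.
rate() {
  dv=$(( $1 - $2 ))   # rows added (or apparently lost) between counts
  dt=$(( $3 - $4 ))   # elapsed whole seconds between counts
  if [ "$dt" -le 0 ]; then
    echo 0
  else
    echo $(( dv / dt ))
  fi
}

rate 1000 0 10 0   # 1000 rows in 10 seconds
rate 500 500 5 5   # same second, guarded
```

In the loop, the echo line could call `rate $NEWV $OLDV $NEWT $OLDT` in place of the inline arithmetic.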

where the contents of the file count_testtable are:

count 'TestTable', INTERVAL => 100000000, CACHE => 10000
exit

This repeatedly counts the rows in TestTable, printing the row count and
the insert rate in rows per second since the previous count. In the hbase
shell, count is implemented as a full scan with a filter on the row key.

Meanwhile, in another terminal do:

hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=20000 randomWrite 5

and when it finishes:

hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=20000 sequentialWrite 5

On the scan terminal we are seeing results similar to:

0  ->  0 msg/s
45552  ->  5694 msg/s <=== randomWrite starts
63284  ->  2955 msg/s
63284  ->  0 msg/s  <=== randomWrite ends
58829  ->  -636 msg/s <=== sequentialWrite starts
88764  ->  3741 msg/s
100000  ->  802 msg/s
100000  ->  0 msg/s <=== sequentialWrite ends

As you can see, in the fifth row the count drops from 63284 to 58829 even
though the test only writes rows, which results in a negative insert rate.
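
To spot such a drop without reading the output by eye, the counts can be checked for monotonicity. The following is a rough sketch, assuming the output format shown above (count in the first column); the helper name `find_drops` is hypothetical.

```shell
# Flag any count that is lower than the previous one. The test only
# performs writes, so a drop means a scan returned fewer rows than before.
find_drops() {
  awk '$1 ~ /^[0-9]+$/ {
    n = $1 + 0                       # force numeric comparison
    if (seen && n < prev) print "line " NR ": " prev " -> " n
    prev = n; seen = 1
  }'
}

# Sample output copied from the scan terminal, without the annotations.
find_drops <<'EOF'
0  ->  0 msg/s
45552  ->  5694 msg/s
63284  ->  2955 msg/s
63284  ->  0 msg/s
58829  ->  -636 msg/s
88764  ->  3741 msg/s
100000  ->  802 msg/s
100000  ->  0 msg/s
EOF
# prints: line 5: 63284 -> 58829
```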

You may need to try a couple of times or tweak the number of rows to insert
to see the problem.

hbase version: 0.90.4 (tried on both a standalone and a fully distributed
deployment).

We think there must be an error somewhere or something we don't understand
is slipping by us.

Thanks.

--
Plácido Revilla
Senior Backend Engineer
TUENTI TECHNOLOGIES S.L. ©
PZA. DE LAS CORTES 2, 4ª PLANTA | 28014 MADRID