HBase >> mail # dev >> Concurrent use of RegionScanner.next


Concurrent use of RegionScanner.next
Looking through HRegion.RegionScannerImpl, I see various synchronized next(...) methods; the same goes for StoreScanner.
Scanners are created for Get operations, and these scanners are guaranteed to be used from a single thread only, so in that case all of the synchronization is pointless.
The client's next(...) operation is a bit more interesting.

Is anybody using scanners concurrently, that is, calling next(...) on the *same* scanner from multiple threads at once?

If not, we could enforce single-threaded use once at the RegionServer level and leave the entire path down to HFileScannerVx unsynchronized.
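
To make the idea concrete, here is a minimal sketch of "enforce once at the top, leave everything below unsynchronized". It is not HBase code: RowScanner, UnsyncListScanner and SingleCallerGuard are hypothetical names made up for illustration, and the guard rejects an overlapping call rather than blocking, which is only one possible policy.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the scanner stack; not the real HBase interface.
interface RowScanner {
    /** Appends the next row to out; returns false if no row was available. */
    boolean next(List<String> out);
}

// Inner scanner with no synchronization at all (plays the role of the
// lower layers of the scan path in this sketch).
final class UnsyncListScanner implements RowScanner {
    private final Iterator<String> rows;

    UnsyncListScanner(List<String> rows) {
        this.rows = rows.iterator();
    }

    @Override
    public boolean next(List<String> out) {
        if (!rows.hasNext()) {
            return false;
        }
        out.add(rows.next());
        return true;
    }
}

// Guard applied once, at the outermost level. An overlapping call to next()
// fails fast instead of corrupting the unsynchronized state underneath.
final class SingleCallerGuard implements RowScanner {
    private final RowScanner delegate;
    private final AtomicBoolean inUse = new AtomicBoolean(false);

    SingleCallerGuard(RowScanner delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean next(List<String> out) {
        // Only one caller may be inside next() at any time.
        if (!inUse.compareAndSet(false, true)) {
            throw new IllegalStateException("scanner is already in use");
        }
        try {
            return delegate.next(out);
        } finally {
            // The atomic write also publishes the delegate's state to
            // whichever thread acquires the guard next.
            inUse.set(false);
        }
    }
}
```

Wrapping is a one-liner, e.g. new SingleCallerGuard(new UnsyncListScanner(rows)). If callers should wait instead of failing, a single synchronized block or lock at this outer level would serve the same purpose; the point is that the cost is paid once per call rather than at every layer.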

Would this work? Any counterexamples?

A quick test suggests that a scan could run about 10-20% faster when everything is in the cache.

-- Lars
lars hofhansl 2012-12-09, 05:32
Stack 2012-12-10, 19:17
lars hofhansl 2012-12-11, 05:01