|
|
-
Understanding HBase log output regarding memstore flush
Asaf Mesika 2012-06-28, 17:17
Hi,
I'm trying to figure out some discrepancies I'm witnessing in the HBase Region Server log file.
It states that a flush was requested, and then a memstore flush is started. It says the flush size, after snapshotting is 139105600 (~132.7m). In the log message below, the file size of the file the memstore was flushed too is not the same size (132.7m):
[Quote] Added hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915, entries=214652, sequenceid=1068343, filesize=*35.5m* [/Quote]
I'm curios to this difference in size, since I eventually would like to know the HDFS write throughput HBase is experiencing while flushing memstore to disk. *Logs quote*
2012-06-25 16:23:11,905 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960. 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960., current region memstore size 132.7m 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960., commencing wait for mvcc, flushsize=139105600 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores 2012-06-25 16:23:11,938 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file:hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915with permission:rwxrwxrwx 2012-06-25 16:23:11,960 DEBUG org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false] 2012-06-25 16:23:11,960 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915: CompoundBloomFilterWriter 2012-06-25 16:23:12,218 DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested 2012-06-25 16:23:13,043 INFO org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO DeleteFamily was added to HFile ( hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915 ) 2012-06-25 16:23:13,043 INFO org.apache.hadoop.hbase.regionserver.Store: Flushed , sequenceid=1068343, memsize=132.7m, into tmp file hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915 2012-06-25 16:23:13,050 DEBUG org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915 to hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915 2012-06-25 16:23:13,068 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915, entries=214652, sequenceid=1068343, filesize=35.5m 2012-06-25 16:23:13,071 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~132.7m/139105600, currentsize=9.2m/9603424 for region food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960. in 1165ms, sequenceid=1068343, compaction requested=true
+
Asaf Mesika 2012-06-28, 17:17
-
Re: Understanding HBase log output regarding memstore flush
Matt Corgan 2012-06-28, 17:33
Hi Asaf,
I believe HDFS will see the 35.5MB worth of data. The 132.7MB is the size of the data in the memstore with the overhead of the ConcurrentSkipListMap which is a pointer-heavy data structure.
Are you using compression? If so then the 35.5 is the compressed size, and you should see a metric on the regionserver's rs-status page with the storefileUncompressedSizeMB.
Let's say you are using compression and your storefileUncompressedSizeMB=70MB, then:
* start with 132.7MB in the memstore * the memstore flusher will rewrite the data into blocks (totaling 70MB), compress them, and write 35.5MB of data to HDFS * when reading the blocks back, they will be uncompressed and stored in the block cache with total size of 70MB
Matt On Thu, Jun 28, 2012 at 10:17 AM, Asaf Mesika <[EMAIL PROTECTED]> wrote:
> Hi, > > I'm trying to figure out some discrepancies I'm witnessing in the HBase > Region Server log file. > > It states that a flush was requested, and then a memstore flush is started. > It says the flush size, after snapshotting is 139105600 (~132.7m). > In the log message below, the file size of the file the memstore was > flushed too is not the same size (132.7m): > > [Quote] > Added > > hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915, > entries=214652, sequenceid=1068343, filesize=*35.5m* > [/Quote] > > I'm curios to this difference in size, since I eventually would like to > know the HDFS write throughput HBase is experiencing while flushing > memstore to disk. > > > *Logs quote* > > 2012-06-25 16:23:11,905 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Flush requested on > food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960. > 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Started memstore flush for > food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960., current region > memstore size 132.7m > 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Finished snapshotting > food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960., commencing wait > for mvcc, flushsize=139105600 > 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Finished snapshotting, commencing flushing stores > 2012-06-25 16:23:11,938 DEBUG org.apache.hadoop.hbase.util.FSUtils: > Creating > > file:hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915with > permission:rwxrwxrwx > 2012-06-25 16:23:11,960 DEBUG > org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with > CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] > [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] > [cacheEvictOnClose=false] [cacheCompressed=false] > 2012-06-25 16:23:11,960 INFO > org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter > type for > > hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915: > CompoundBloomFilterWriter > 2012-06-25 16:23:12,218 DEBUG > org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested > 2012-06-25 16:23:13,043 INFO > org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO > DeleteFamily was added to HFile ( > > hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915 > ) > 2012-06-25 16:23:13,043 INFO org.apache.hadoop.hbase.regionserver.Store: > Flushed , sequenceid=1068343, memsize=132.7m, into tmp file > > hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915 > 2012-06-25 16:23:13,050 DEBUG org.apache.hadoop.hbase.regionserver.Store: > Renaming flushed file at > > hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915 > to > > hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915
+
Matt Corgan 2012-06-28, 17:33
|
|