HBase >> mail # user >> Storing images in Hbase


Re: Storing images in Hbase
Jack, out of curiosity, how many people manage the hbase related servers?

Does it require constant monitoring, or is it fairly hands-off now?  (Or a bit
of both: the early days were about getting things right/learning, and now it's
purring along.)
On Wed, Jan 23, 2013 at 11:53 PM, Jack Levin <[EMAIL PROTECTED]> wrote:

> It's best to keep some RAM for caching of the filesystem; besides, we
> also run a datanode, which takes heap as well.
> Now, please keep in mind that even if you specify a heap of, say, 5GB, if
> your server opens threads to communicate with other systems via RPC
> (which hbase does a lot), you will actually use HEAP +
> Nthreads * thread_kb_size (the per-thread stack size).  There is a good
> Sun Microsystems document about it. (I don't have the link handy.)
>
> -Jack
>
>
>
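A rough sketch of the HEAP + Nthreads * thread_kb_size estimate above. The 5000 MB heap comes from the -Xms5000m setting quoted in this thread; the thread count and the 1024 KB stack size are assumptions picked for illustration, not numbers from Jack's servers. It ignores other off-heap memory, so treat the result as a ballpark only.

```python
def estimated_footprint_mb(heap_mb, n_threads, stack_kb):
    """Approximate JVM footprint in MB: heap plus one native stack per thread."""
    return heap_mb + n_threads * stack_kb / 1024


# heap_mb=5000 is from -Xms5000m in this thread.
# n_threads=1000 and stack_kb=1024 are hypothetical values for illustration.
footprint = estimated_footprint_mb(heap_mb=5000, n_threads=1000, stack_kb=1024)
print(f"estimated footprint: {footprint:.0f} MB")
```

With these assumed inputs the thread stacks add about 1 GB on top of the heap, which is one reason not to hand the whole 8G machine to the regionserver heap.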
> On Mon, Jan 21, 2013 at 5:10 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> > Thanks for the useful information. I wonder why you use only a 5G heap when
> > you have an 8G machine? Is there a reason not to use all of it? (The
> > DataNode typically takes 1G of RAM.)
> >
> > On Sun, Jan 20, 2013 at 11:49 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
> >
> >> I forgot to mention that I also have this setup:
> >>
> >> <property>
> >>   <name>hbase.hregion.memstore.flush.size</name>
> >>   <value>33554432</value>
> >>   <description>Flush more often. Default: 67108864</description>
> >> </property>
> >>
> >> This parameter applies per region, so if any of my
> >> 400 (currently) regions on a regionserver has 32MB+ in its memstore,
> >> hbase will flush it to disk.
> >>
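To put the per-region flush threshold in perspective, here is a small arithmetic sketch. The flush size and region count come from this message, the max heap from the metrics quoted below, and the 60% write cache figure from Jack's earlier message further down the thread. The variable names are made up for illustration.

```python
FLUSH_SIZE_BYTES = 33554432   # hbase.hregion.memstore.flush.size from this message
REGIONS = 400                 # approximate regions per regionserver
MAX_HEAP_MB = 4987            # maxHeap from the regionserver metrics
WRITE_CACHE_PERCENT = 60      # "write cache is 60% of heap" from earlier in the thread

# Per-region threshold in MB.
flush_mb = FLUSH_SIZE_BYTES / (1024 * 1024)

# What the memstores would hold if every region sat right at its threshold.
all_regions_full_mb = flush_mb * REGIONS

# Heap share available to all memstores together.
write_cache_mb = MAX_HEAP_MB * WRITE_CACHE_PERCENT / 100

print(f"per-region flush threshold: {flush_mb:.0f} MB")
print(f"all regions at threshold:   {all_regions_full_mb:.0f} MB")
print(f"overall write cache:        {write_cache_mb:.1f} MB")
```

The per-region threshold alone would allow far more memstore data than the heap can hold, so the overall write cache limit is what bounds total memstore usage when many regions take writes at once.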
> >>
> >> Here are some metrics from a regionserver:
> >>
> >> requests=2, regions=370, stores=370, storefiles=1390,
> >> storefileIndexSize=304, memstoreSize=2233, compactionQueueSize=0,
> >> flushQueueSize=0, usedHeap=3516, maxHeap=4987,
> >> blockCacheSize=790656256, blockCacheFree=255245888,
> >> blockCacheCount=2436, blockCacheHitCount=218015828,
> >> blockCacheMissCount=13514652, blockCacheEvictedCount=2561516,
> >> blockCacheHitRatio=94, blockCacheHitCachingRatio=98
> >>
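A quick sanity check of the metrics above, as a sketch. It assumes maxHeap and memstoreSize are reported in MB and the blockCache sizes in bytes, which is consistent with the 2G memstore and 5G heap figures in this message. The metrics line is copied from above; the parsing code is only an illustration.

```python
metrics_line = (
    "requests=2, regions=370, stores=370, storefiles=1390, "
    "storefileIndexSize=304, memstoreSize=2233, compactionQueueSize=0, "
    "flushQueueSize=0, usedHeap=3516, maxHeap=4987, "
    "blockCacheSize=790656256, blockCacheFree=255245888, "
    "blockCacheCount=2436, blockCacheHitCount=218015828, "
    "blockCacheMissCount=13514652, blockCacheEvictedCount=2561516, "
    "blockCacheHitRatio=94, blockCacheHitCachingRatio=98"
)

# Turn "name=value, name=value, ..." into a dict of integers.
metrics = {}
for pair in metrics_line.split(", "):
    name, value = pair.split("=")
    metrics[name] = int(value)

# Block cache hit ratio recomputed from the raw hit and miss counters.
hits = metrics["blockCacheHitCount"]
misses = metrics["blockCacheMissCount"]
hit_ratio = hits / (hits + misses)

# Block cache capacity (used + free, assumed bytes) as a share of max heap (assumed MB).
cache_total_mb = (metrics["blockCacheSize"] + metrics["blockCacheFree"]) / 2**20
cache_share = cache_total_mb / metrics["maxHeap"]

# Memstore usage as a share of max heap.
memstore_share = metrics["memstoreSize"] / metrics["maxHeap"]

print(f"hit ratio:      {hit_ratio:.2%}")
print(f"cache share:    {cache_share:.2%}")
print(f"memstore share: {memstore_share:.2%}")
```

Under those unit assumptions, the recomputed hit ratio agrees with the reported blockCacheHitRatio=94, and the block cache capacity comes out at about 20% of the heap.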
> >> Note that the memstore is only about 2G; this particular regionserver's
> >> heap is set to 5G.
> >>
> >> And last but not least, it's very important to have a good GC setup:
> >>
> >> export HBASE_OPTS="$HBASE_OPTS -verbose:gc -Xms5000m \
> >> -XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails \
> >> -XX:+PrintGCDateStamps \
> >> -XX:+HeapDumpOnOutOfMemoryError -Xloggc:$HBASE_HOME/logs/gc-hbase.log \
> >> -XX:MaxTenuringThreshold=15 -XX:SurvivorRatio=8 \
> >> -XX:+UseParNewGC \
> >> -XX:NewSize=128m -XX:MaxNewSize=128m \
> >> -XX:-UseAdaptiveSizePolicy \
> >> -XX:+CMSParallelRemarkEnabled \
> >> -XX:-TraceClassUnloading"
> >>
> >> -Jack
> >>
> >> On Thu, Jan 17, 2013 at 3:29 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> >> > Hey Jack,
> >> >
> >> > Thanks for the useful information. By flush size being 15%, do you mean
> >> > the memstore flush size? 15% would mean close to 1G. Have you seen any
> >> > issues with flushes taking too long?
> >> >
> >> > Thanks
> >> > Varun
> >> >
> >> > On Sun, Jan 13, 2013 at 8:17 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> That's right, Memstore size, not flush size, is increased.  Filesize is
> >> >> 10G. Overall write cache is 60% of heap and read cache is 20%.  Flush
> >> >> size is 15%.  64 maxlogs at 128MB. One namenode server, one secondary
> >> >> that can be promoted.  On the way to hbase, images are written to a
> >> >> queue, so that we can take Hbase down for maintenance and still do
> >> >> inserts later.  ImageShack has ‘perma cache’ servers that allow writes
> >> >> and serving of data even when hbase is down for hours; consider it a
> >> >> 4th replica 😉 outside of hadoop.
> >> >>
> >> >> Jack
> >> >>
> >> >> *From:* Mohit Anchlia <[EMAIL PROTECTED]>
> >> >> *Sent:* January 13, 2013 7:48 AM
> >> >> *To:* [EMAIL PROTECTED]
> >> >> *Subject:* Re: Storing images in Hbase
> >> >>
> >> >> Thanks Jack for sharing this information. This definitely makes sense