HBase >> mail # user >> Re: Storing images in Hbase


Michael Segel 2013-01-11, 15:00
Mohammad Tariq 2013-01-11, 15:27
Mohit Anchlia 2013-01-11, 17:40
Jack Levin 2013-01-11, 17:47
Jack Levin 2013-01-11, 17:51
Mohit Anchlia 2013-01-13, 15:47
kavishahuja 2013-01-05, 10:11
谢良 2013-01-06, 03:58
Mohit Anchlia 2013-01-06, 05:45
谢良 2013-01-06, 06:14
Damien Hardy 2013-01-06, 09:35
Yusup Ashrap 2013-01-06, 11:58
Andrew Purtell 2013-01-06, 20:12
Asaf Mesika 2013-01-06, 20:28
Andrew Purtell 2013-01-06, 20:49
Andrew Purtell 2013-01-06, 20:52
Mohit Anchlia 2013-01-06, 21:09
Amandeep Khurana 2013-01-06, 20:33
Marcos Ortiz 2013-01-11, 18:01
Jack Levin 2013-01-13, 16:17
Varun Sharma 2013-01-17, 23:29
Jack Levin 2013-01-20, 19:49
Varun Sharma 2013-01-22, 01:10
Varun Sharma 2013-01-22, 01:12
Jack Levin 2013-01-24, 04:53
S Ahmed 2013-01-24, 22:13
Jack Levin 2013-01-25, 07:41
S Ahmed 2013-01-27, 02:00
Jack Levin 2013-01-27, 02:56
yiyu jia 2013-01-27, 15:37
Jack Levin 2013-01-27, 16:56
yiyu jia 2013-01-27, 21:58
Jack Levin 2013-01-28, 04:06
Jack Levin 2013-01-28, 04:16
Andrew Purtell 2013-01-28, 18:58
yiyu jia 2013-01-28, 20:23
Andrew Purtell 2013-01-28, 21:13
yiyu jia 2013-01-28, 21:44
Andrew Purtell 2013-01-28, 21:49
Adrien Mogenet 2013-01-28, 10:01
Re: Storing images in Hbase
I've never tried it. HBase worked out nicely for this task; caching
and all is a bonus for files.

-jack

On Mon, Jan 28, 2013 at 2:01 AM, Adrien Mogenet
<[EMAIL PROTECTED]> wrote:
> Could HCatalog be an option ?
> On Jan 26, 2013 21:56, "Jack Levin" <[EMAIL PROTECTED]> wrote:
>>
>> AFAIK, namenode would not like tracking 20 billion small files :)
>>
>> -jack
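
A rough back-of-the-envelope sketch of the namenode concern, in Python. It assumes the commonly quoted rule of thumb of about 150 bytes of namenode heap per namespace object (file, directory, or block) and one block per small file. Both figures are assumptions for illustration, not numbers from this thread.

```python
# Rough estimate of the NameNode heap needed to track many small files.
# ASSUMPTION: ~150 bytes of heap per namespace object (rule of thumb, not measured).
BYTES_PER_OBJECT = 150


def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Each file costs one file object plus one object per block."""
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


# 20 billion small files, one block each (the figure mentioned in the thread).
heap = namenode_heap_bytes(20_000_000_000)
print(f"{heap / 1e12:.1f} TB of namenode heap")
```

Under these assumptions the namespace alone would need terabytes of heap on a single namenode, which is the point of the remark above.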
>>
>> On Sat, Jan 26, 2013 at 6:00 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
>> > That's pretty amazing.
>> >
>> > What I am confused about is: why did you go with HBase and not just straight
>> > into HDFS?
>> >
>> >
>> >
>> >
>> > On Fri, Jan 25, 2013 at 2:41 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> >
>> >> Two people including myself; it's fairly hands-off. It took about 3 months to
>> >> tune it right; however, we did have multiple years of experience with
>> >> datanodes and Hadoop in general, so that was a good boost.
>> >>
>> >> We have 4 HBase clusters today, the image store being the largest.
>> >> On Jan 24, 2013 2:14 PM, "S Ahmed" <[EMAIL PROTECTED]> wrote:
>> >>
>> >> > Jack, out of curiosity, how many people manage the HBase-related servers?
>> >> >
>> >> > Does it require constant monitoring, or is it fairly hands-off now? (Or a bit
>> >> > of both: the early days were about getting things right/learning, and now it's
>> >> > purring along.)
>> >> >
>> >> >
>> >> > On Wed, Jan 23, 2013 at 11:53 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> >> >
>> >> > > It's best to keep some RAM for filesystem caching; besides, we
>> >> > > also run a datanode, which takes heap as well.
>> >> > > Now, please keep in mind that even if you specify a heap of, say, 5GB, if
>> >> > > your server opens threads to communicate with other systems via RPC
>> >> > > (which HBase does a lot), you will actually use HEAP +
>> >> > > Nthreads * thread_kb_size. There is a good Sun Microsystems document
>> >> > > about it. (I don't have the link handy.)
>> >> > >
>> >> > > -Jack
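
The HEAP + Nthreads * thread size estimate above can be sketched in a few lines of Python. The 1024 KB per-thread stack is an assumption (a common default on 64-bit HotSpot JVMs, adjustable with -Xss), and the thread count of 500 is purely hypothetical.

```python
# Estimate JVM process memory: Java heap plus native thread stacks.
# ASSUMPTION: 1024 KB of stack per thread; check -Xss on your own JVM.
def jvm_footprint_mb(heap_mb, n_threads, stack_kb=1024):
    """Return heap plus total thread-stack memory, in MB."""
    return heap_mb + n_threads * stack_kb / 1024


# A 5000 MB heap with a hypothetical 500 RPC/handler threads.
print(jvm_footprint_mb(5000, 500))
```

This ignores other off-heap consumers (for example direct buffers and JVM metadata), so the real footprint is likely somewhat higher than this sum.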
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Mon, Jan 21, 2013 at 5:10 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
>> >> > > > Thanks for the useful information. I wonder why you use only a 5G heap when
>> >> > > > you have an 8G machine? Is there a reason not to use all of it (the
>> >> > > > DataNode typically takes 1G of RAM)?
>> >> > > >
>> >> > > > On Sun, Jan 20, 2013 at 11:49 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> >> > > >
>> >> > > >> I forgot to mention that I also have this setup:
>> >> > > >>
>> >> > > >> <property>
>> >> > > >>   <name>hbase.hregion.memstore.flush.size</name>
>> >> > > >>   <value>33554432</value>
>> >> > > >>   <description>Flush more often. Default: 67108864</description>
>> >> > > >> </property>
>> >> > > >>
>> >> > > >> This parameter applies per region, so if any of my
>> >> > > >> 400 (currently) regions on a regionserver has 32MB (33554432 bytes)
>> >> > > >> or more in its memstore, HBase will flush it to disk.
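
To make the per-region threshold concrete, here is a small Python sketch using the flush size from the config above and the region count from the message. It only does the arithmetic; it does not model HBase's flush logic.

```python
# Per-region flush threshold and the theoretical worst case across regions.
FLUSH_SIZE_BYTES = 33554432  # hbase.hregion.memstore.flush.size from the config above
REGIONS = 400                # region count mentioned in the message

per_region_mib = FLUSH_SIZE_BYTES / 2**20
worst_case_gib = REGIONS * FLUSH_SIZE_BYTES / 2**30

print(f"flush threshold per region: {per_region_mib:.0f} MiB")
print(f"all {REGIONS} regions just under the threshold: {worst_case_gib:.1f} GiB")
```

The worst case is larger than a 5G heap, which is why the per-region setting cannot be the only guard: HBase also enforces a regionserver-wide memstore limit that forces flushes before that point.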
>> >> > > >>
>> >> > > >>
>> >> > > >> Here are some metrics from a regionserver:
>> >> > > >>
>> >> > > >> requests=2, regions=370, stores=370, storefiles=1390,
>> >> > > >> storefileIndexSize=304, memstoreSize=2233, compactionQueueSize=0,
>> >> > > >> flushQueueSize=0, usedHeap=3516, maxHeap=4987,
>> >> > > >> blockCacheSize=790656256, blockCacheFree=255245888,
>> >> > > >> blockCacheCount=2436, blockCacheHitCount=218015828,
>> >> > > >> blockCacheMissCount=13514652, blockCacheEvictedCount=2561516,
>> >> > > >> blockCacheHitRatio=94, blockCacheHitCachingRatio=98
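
A short Python sketch for reading a metrics line like the one above. The parser is a hypothetical helper, not part of HBase, and it assumes usedHeap and maxHeap are reported in MB while the block cache sizes are in bytes.

```python
def parse_metrics(line):
    """Turn 'key=value, key=value, ...' into a dict of ints."""
    pairs = (item.split("=") for item in line.split(",") if "=" in item)
    return {key.strip(): int(value) for key, value in pairs}


# The regionserver metrics quoted in the message.
METRICS = """requests=2, regions=370, stores=370, storefiles=1390,
storefileIndexSize=304, memstoreSize=2233, compactionQueueSize=0,
flushQueueSize=0, usedHeap=3516, maxHeap=4987,
blockCacheSize=790656256, blockCacheFree=255245888,
blockCacheCount=2436, blockCacheHitCount=218015828,
blockCacheMissCount=13514652, blockCacheEvictedCount=2561516,
blockCacheHitRatio=94, blockCacheHitCachingRatio=98"""

m = parse_metrics(METRICS)
hits, misses = m["blockCacheHitCount"], m["blockCacheMissCount"]

hit_ratio = 100 * hits // (hits + misses)                       # percent, rounded down
cache_total_mib = (m["blockCacheSize"] + m["blockCacheFree"]) / 2**20
heap_used_pct = 100 * m["usedHeap"] // m["maxHeap"]             # percent, rounded down

print(f"block cache hit ratio: {hit_ratio}%")
print(f"block cache capacity: {cache_total_mib:.0f} MiB")
print(f"heap in use: {heap_used_pct}%")
```

The hit ratio recomputed from the raw counters agrees with the reported blockCacheHitRatio, and the block cache capacity works out to roughly a fifth of maxHeap.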
>> >> > > >>
>> >> > > >> Note that the memstore is only 2G; this particular regionserver's HEAP is
>> >> > > >> set to 5G.
>> >> > > >>
>> >> > > >> And last but not least, it's very important to have a good GC setup:
>> >> > > >>
>> >> > > >> export HBASE_OPTS="$HBASE_OPTS -verbose:gc -Xms5000m
>> >> > > >> -XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails
>> >> > > >> -XX:+PrintGCDateStamps
>> >> > > >> -XX:+HeapDumpOnOutOfMemoryError -Xloggc:$HBASE_HOME/logs/gc-hbase.log