HBase >> mail # user >> 0.92 Max Row Size


Re: 0.92 Max Row Size
Thrift has been upgraded to 0.8 in trunk; 0.92 still uses 0.7.

Can you provide the JIRA number that deals with the memory leak?

Thanks

On Jan 21, 2012, at 5:34 AM, Wayne <[EMAIL PROTECTED]> wrote:

> Sorry, but it would be too hard for us to provide enough info in a Jira
> to reproduce this accurately. Our read problem is through Thrift and has
> everything to do with the row being too big to bring back in its
> entirety (a 13-million-column row times out 1/3 of the time). Filters in
> 0.92 and Thrift should help us there. I just closed
> https://issues.apache.org/jira/browse/HBASE-4187 as filters now support
> offset/limit patterns for the get. Of course, we would all prefer a
> streaming model to avoid these issues and having to build our
> own pseudo-streaming model. Is Thrift still the best option for
> high-performance Python-based reads? From Hadoop World it seems some
> people are pushing Thrift and others are pushing Avro. Does 0.92
> bundle/work with Thrift 0.8, and are the memory leaks fixed in 0.8?
>
> As for the write bottleneck, it has a lot to do with memory and other
> low-level config issues. I would hope that the automated tests of HBase can
> eventually include patterns for large column counts. For HBase to
> truly be a column-based storage system, it needs to scale columns into the
> 100s of millions and beyond. This is the pattern we have the hardest time
> modeling in HBase because there is an unknown "limit" here we have to watch
> out for. There is a known limit that a row must be stored within one and
> only one region, but that should not be a problem. One single large region
> storing one large row should still "work".
>
> Thanks.
>
> On Fri, Jan 20, 2012 at 3:45 PM, Stack <[EMAIL PROTECTED]> wrote:
>
>> On Fri, Jan 20, 2012 at 11:43 AM, Wayne <[EMAIL PROTECTED]> wrote:
>>
>>> Does 0.92 support a significant increase in row size over 0.90.x? With
>>> 0.90.4 we have seen writes start choking at 30 million cols/row and reads
>>> start choking at 10 million cols/row. Can we assume these numbers will go
>>> up with 0.92 and, if so, by how much?
>>>
>>>
>> Any chance of a JIRA on the issues you see, Wayne, when writes/reads choke?
>> Thanks,
>>
>> St.Ack
>> P.S. I don't know of any comparison. We have a new file format in 0.92.0 and
>> both read/write paths have been amended so it could be different; not sure
>> if better or worse.
>>
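
The offset/limit pattern Wayne mentions for reading a very wide row in
pieces can be sketched roughly as below. This is only an illustration, not
code from the thread: `iter_row_columns`, `fetch_page` and
`make_fake_fetcher` are hypothetical names, and the in-memory fetcher stands
in for whatever Thrift get (with an offset/limit style column filter) a real
client would issue.

```python
from typing import Callable, Iterator, List, Tuple

# A column is modeled here as a (qualifier, value) pair; this is an assumption
# made for the sketch, not the shape of any particular client library's result.
Column = Tuple[str, bytes]


def iter_row_columns(
    fetch_page: Callable[[int, int], List[Column]],
    page_size: int = 10000,
) -> Iterator[Column]:
    """Yield the columns of one wide row, fetching them page by page.

    fetch_page(offset, limit) is a hypothetical callable that returns at most
    `limit` columns of the row, starting at column index `offset`.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A short (or empty) page means the end of the row was reached.
        if len(page) < page_size:
            return
        offset += len(page)


def make_fake_fetcher(columns: List[Column]) -> Callable[[int, int], List[Column]]:
    """In-memory stand-in for a server-side offset/limit read (illustration only)."""
    def fetch_page(offset: int, limit: int) -> List[Column]:
        return columns[offset:offset + limit]
    return fetch_page


if __name__ == "__main__":
    row = [("col%08d" % i, b"value") for i in range(25)]
    total = sum(1 for _ in iter_row_columns(make_fake_fetcher(row), page_size=10))
    print(total)
```

Each request stays small enough to return before a timeout, at the cost of
several round trips per row. Depending on how the server applies the offset,
later pages may get more expensive as the offset grows, so the page size is
worth tuning against real data.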