Wayne 2012-01-20, 19:43
Stack 2012-01-20, 20:45
Wayne 2012-01-21, 13:34
Re: 0.92 Max Row Size
yuzhihong@... 2012-01-21, 14:29
Thrift has been upgraded to 0.8 in trunk. 0.92 still uses 0.7.
Can you provide the Jira number that deals with the memory leak?
On Jan 21, 2012, at 5:34 AM, Wayne <[EMAIL PROTECTED]> wrote:
> Sorry but it would be too hard for us to be able to provide enough info in
> a Jira to accurately reproduce. Our read problem is through thrift and has
> everything to do with the row just being too big to bring back in its
> entirety (13 million col row times out 1/3 of the time). Filters in .92 and
> thrift should help us there. I just closed
> https://issues.apache.org/jira/browse/HBASE-4187 as filters now support
> offset, limit patterns for the get. Of course we would all prefer a
> streaming model to avoid any of these issues and having to build our
> own pseudo streaming model. Is Thrift still the best option for high
> performance python based reads? From Hadoop World it seems some people are
> pushing thrift and others are pushing Avro. Does .92 bundle/work with
> Thrift .8 and are the memory leaks fixed in .8?
> As far as the write bottleneck goes, it has a lot to do with memory and other
> low-level config issues. I would hope that the automated tests of hbase can
> eventually include patterns for large col counts. In order for hbase to
> truly be a col based storage system it needs to scale cols into the 100s of
> millions and beyond. This is the pattern we have the hardest time modeling
> in hbase because there is an unknown "limit" here we have to watch out for.
> There is a known limit that a row must be stored within one and only one
> region, but that should not be a problem. One single large region storing
> one large row should still "work".
> On Fri, Jan 20, 2012 at 3:45 PM, Stack <[EMAIL PROTECTED]> wrote:
>> On Fri, Jan 20, 2012 at 11:43 AM, Wayne <[EMAIL PROTECTED]> wrote:
>>> Does 0.92 support a significant increase in row size over 0.90.x? With
>>> 0.90.4 we have seen writes start choking at 30 million cols/row and reads
>>> start choking at 10 million cols/row. Can we assume these numbers will go
>>> up with .92 and if yes how much?
>> Any chance of a JIRA on the issues you see, Wayne, when writes/reads choke?
>> P.S. I don't know of any comparison. We have new fileformat in 0.92.0 and
>> both read/write paths have been amended so it could be different; not sure
>> if better or worse.
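
For reference, the "pseudo streaming" idea mentioned above (reading one very wide row in column pages using an offset/limit filter instead of pulling the whole row in one call) could be sketched in Python roughly as follows. This is only a sketch under assumptions: `fetch_page` is a hypothetical callable that you would wire up to your own Thrift client, and `iter_row_columns` and `pagination_filter` are illustrative names, not part of any HBase or Thrift API. The filter string follows the `ColumnPaginationFilter(limit, offset)` form of the Thrift filter language; whether your server version accepts it should be verified.

```python
from typing import Callable, Iterator, List, Tuple

# One cell of the wide row: (column qualifier, value).
Cell = Tuple[bytes, bytes]


def pagination_filter(limit: int, offset: int) -> str:
    """Build a filter string in the HBase filter-language form.

    Assumption: the server parses "ColumnPaginationFilter(limit, offset)".
    """
    return "ColumnPaginationFilter(%d, %d)" % (limit, offset)


def iter_row_columns(
    fetch_page: Callable[[int, int], List[Cell]],
    page_size: int = 10000,
) -> Iterator[Cell]:
    """Yield every cell of one row, fetching page_size columns per call.

    fetch_page(offset, limit) is hypothetical: it should return at most
    `limit` cells starting at column `offset`, for example by issuing a
    Thrift read with pagination_filter(limit, offset) applied.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for cell in page:
            yield cell
        # A short (or empty) page means the end of the row was reached.
        if len(page) < page_size:
            return
        offset += len(page)


if __name__ == "__main__":
    # In-memory stand-in for a wide row, so the sketch runs without HBase.
    row = [(b"col%05d" % i, b"value") for i in range(25)]

    def fake_fetch(offset: int, limit: int) -> List[Cell]:
        return row[offset:offset + limit]

    total = sum(1 for _ in iter_row_columns(fake_fetch, page_size=10))
    print(total, pagination_filter(10, 20))
```

Each call then returns a bounded number of columns, so a single oversized response is avoided; the trade-off is more round trips, and the pages are not a consistent snapshot if the row is being written concurrently.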
Wayne 2012-01-23, 14:42
Stack 2012-01-21, 22:22