HBase user mailing list: HBase Region Server crash if column size become to big


John 2013-09-11, 11:07
Jean-Marc Spaggiari 2013-09-11, 11:34
John 2013-09-11, 11:42
Ted Yu 2013-09-11, 12:16
John 2013-09-11, 12:38
Bing Jiang 2013-09-11, 13:41
John 2013-09-11, 14:46
John 2013-09-11, 14:47
Michael Segel 2013-09-11, 14:53
Kevin Odell 2013-09-11, 15:02
John 2013-09-11, 15:08
Kevin Odell 2013-09-11, 15:15
Dhaval Shah 2013-09-11, 15:15
Kevin Odell 2013-09-11, 15:20
John 2013-09-11, 15:26
Dhaval Shah 2013-09-11, 15:33
Kevin Odell 2013-09-11, 15:30
Re: HBase Region Server crash if column size become to big
Well, there's a column width, and then there's the row's width.

Unless I'm mistaken... rows can't span regions.  Right?

Note: the OP didn't say he got the error trying to add a column, but had issues in retrieving a column....

I personally never tried to break HBase on such an edge use case... (I like to avoid the problem in the first place....)

Has anyone tested this specific limit?
On Sep 11, 2013, at 10:02 AM, "Kevin O'dell" <[EMAIL PROTECTED]> wrote:

> I have not seen the exact error, but if I recall correctly jobs will fail
> if the column is larger than 10MB and we have not raised the default
> setting (which I don't have in front of me)?
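Assuming the unnamed default Kevin refers to is the client-side KeyValue size limit, hbase.client.keyvalue.maxsize (10485760 bytes, i.e. 10 MB, in this era's defaults), a minimal sketch of raising it could look like the following; the property choice and the 64 MB value are assumptions, not something stated in the thread:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class RaiseCellSizeLimit {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            // Assumption: hbase.client.keyvalue.maxsize is the 10 MB default being referred to;
            // raise it to 64 MB so larger single cells are accepted by the client.
            conf.setLong("hbase.client.keyvalue.maxsize", 64L * 1024L * 1024L);
            // ... open an HTable with this conf and write as usual ...
        }
    }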
>
>
> On Wed, Sep 11, 2013 at 10:53 AM, Michael Segel
> <[EMAIL PROTECTED]>wrote:
>
>> Just out of curiosity...
>>
>> How wide are the columns?
>>
>> What's the region size?
>>
>> Does anyone know the error message you'll get if your row is wider than a
>> region?
>>
>>
>> On Sep 11, 2013, at 9:47 AM, John <[EMAIL PROTECTED]> wrote:
>>
>>> sorry, I meant 570000 columns, not rows
>>>
>>>
>>> 2013/9/11 John <[EMAIL PROTECTED]>
>>>
>>>> thanks for all the answers! The only entry I got in the
>>>> "hbase-cmf-hbase1-REGIONSERVER-mydomain.org.log.out" log file after
>>>> executing the get command in the hbase shell is this:
>>>>
>>>> 2013-09-11 16:38:56,175 WARN org.apache.hadoop.ipc.HBaseServer:
>>>> (operationTooLarge): {"processingtimems":3196,"client":"192.168.0.1:50629","timeRange":[0,9223372036854775807],"starttimems":1378910332920,"responsesize":108211303,"class":"HRegionServer","table":"P_SO","cacheBlocks":true,"families":{"myCf":["ALL"]},"row":"myRow","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>>>>
>>>> After this the RegionServer is down, nothing more. BTW I found out that
>>>> the row should have ~570000 rows. The size should be around ~70 MB
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> 2013/9/11 Bing Jiang <[EMAIL PROTECTED]>
>>>>
>>>>> hi john.
>>>>> I think it is a fresh question. Could you print the log from the
>>>>> crashed regionserver?
>>>>> On Sep 11, 2013 8:38 PM, "John" <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Okay, I will take a look at the ColumnPaginationFilter.
>>>>>>
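For reference, a minimal sketch of paging through a very wide row with ColumnPaginationFilter, written against the 0.94-era Java client; the table ("P_SO"), row ("myRow"), family ("myCf") and page size are taken from the warning above or invented, so treat them as placeholders rather than the OP's actual schema:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PagedGet {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "P_SO");   // table name from the warning above
            int pageSize = 10000;                      // columns fetched per RPC, tune as needed
            int offset = 0;
            while (true) {
                Get get = new Get(Bytes.toBytes("myRow"));
                get.addFamily(Bytes.toBytes("myCf"));
                // return at most pageSize columns, starting at column index 'offset'
                get.setFilter(new ColumnPaginationFilter(pageSize, offset));
                Result result = table.get(get);
                if (result.isEmpty()) {
                    break;                             // no more columns in this row
                }
                // ... process result.raw() here ...
                offset += pageSize;
            }
            table.close();
        }
    }

Fetching the row in pages like this keeps each response small, instead of one ~100 MB RPC like the get logged above.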
>>>>>> I tried to reproduce the error. I created a new table and added one new
>>>>>> row with 250 000 columns, but everything works fine if I execute a get
>>>>>> against the table. The only difference from my original program was that
>>>>>> I added the data directly through the hbase java api and not with the
>>>>>> map reduce bulk load. Maybe that can be the reason?
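A rough sketch of the kind of direct-API write described above (one table, one row, many columns); the table name, column family, qualifiers, and auto-flush handling are illustrative assumptions, not the OP's actual code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WideRowWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "mytestTable");
            table.setAutoFlush(false);                 // buffer puts client-side
            byte[] row = Bytes.toBytes("sampleRowKey");
            byte[] cf = Bytes.toBytes("mycf");
            for (int i = 0; i < 250000; i++) {
                Put put = new Put(row);
                put.add(cf, Bytes.toBytes("col" + i), Bytes.toBytes("value" + i));
                table.put(put);
            }
            table.flushCommits();                      // push the buffered puts to the region server
            table.close();
        }
    }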
>>>>>>
>>>>>> I wonder a little bit about the hdfs structure if I compare both methods
>>>>>> (hbase api/bulk load). If I add the data through the hbase api there is no
>>>>>> file in /hbase/MyTable/5faaf42997925e2f637d8d38c420862f/MyColumnFamily/*,
>>>>>> but if I use the bulk load method there is a file for every time I executed
>>>>>> a new bulk load:
>>>>>>
>>>>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
>>>>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
>>>>>> Found 2 items
>>>>>> -rw-r--r--   1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
>>>>>> -rw-r--r--   1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8
>>>>>>
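One possible explanation for the empty column-family directory (an assumption, not something stated in the thread): puts made through the Java API stay in the region server's MemStore until a flush, so no HFiles show up on HDFS yet, while a completed bulk load always leaves HFiles behind. Forcing a flush, for example via HBaseAdmin, should make files appear for the API-written table as well:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class FlushTable {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            admin.flush("mytestTable");   // write the table's MemStore out as HFiles under .../mycf/
            admin.close();
        }
    }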
>>>>>> If I execute a get operation in the hbase shell against the "MyTable"
>>>>>> table, I get the result:
>>>>>>
>>>>>> hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
>>>>>> ... <-- all results
>>>>>> 250000 row(s) in 38.4440 seconds
>>>>>>
>>>>>> but if I try to get the results for my "bulkLoadTable" I got this (+

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com
Dhaval Shah 2013-09-11, 15:38
Michael Segel 2013-09-11, 18:12
Dhaval Shah 2013-09-11, 18:13
Ted Yu 2013-09-11, 13:19
Jean-Marc Spaggiari 2013-09-11, 11:48
John 2013-09-11, 15:58
Bryan Beaudreault 2013-09-11, 16:15
John 2013-09-11, 17:03