Accumulo user mailing list: tservers running out of heap space


Anthony Fox 2012-11-29, 16:14
Keith Turner 2012-11-29, 17:09
Anthony Fox 2012-11-29, 17:20
Keith Turner 2012-11-29, 18:49
Anthony Fox 2012-11-29, 19:09
Keith Turner 2012-11-29, 19:22
Anthony Fox 2012-11-29, 19:24
Anthony Fox 2012-11-29, 20:50
Anthony Fox 2012-12-05, 15:55
Eric Newton 2012-12-05, 17:10
Re: tservers running out of heap space
On Thu, Nov 29, 2012 at 12:20 PM, Anthony Fox <[EMAIL PROTECTED]> wrote:

> Compacting down to a single file is not feasible - there's about 70G in
> 255 tablets across 15 tablet servers.  Is there another way to tune the
> compressor pool or another mechanism to verify that this is the issue?
In 1.4 you can compact a range.   If you can get your query to fail in a
range of the table, then you can compact that range and see if it helps.
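A minimal sketch of compacting just a row range with the 1.4 Java API (the instance
name, zookeepers, credentials, table name, and row bounds below are placeholders, not
values from this thread):

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.hadoop.io.Text;

    public class CompactRange {
      public static void main(String[] args) throws Exception {
        // Placeholder instance/credentials -- substitute your own.
        Connector conn = new ZooKeeperInstance("myInstance", "zk1:2181")
            .getConnector("root", "secret".getBytes());

        // Compact only the rows in [queryStart, queryEnd] of "mytable",
        // flushing in-memory data first and blocking until the compaction finishes.
        conn.tableOperations().compact("mytable",
            new Text("queryStart"), new Text("queryEnd"),
            true /* flush */, true /* wait */);
      }
    }

The shell's compact command can do the same thing with its begin/end row options.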
>
>
> On Thu, Nov 29, 2012 at 12:09 PM, Keith Turner <[EMAIL PROTECTED]> wrote:
>
>>
>>
>> On Thu, Nov 29, 2012 at 11:14 AM, Anthony Fox <[EMAIL PROTECTED]>wrote:
>>
>>> I am experiencing some issues running multiple parallel scans against
>>> Accumulo.  Running single scans works just fine, but when I ramp up the
>>> number of simultaneous clients, my tablet servers die due to running out
>>> of heap space.  I've tried raising max heap to 4G, which should be more
>>> than enough, but I still see this error.  I've tried with
>>> table.cache.block.enable=false, table.cache.index.enable=false, and
>>> table.scan.cache.enable=false, as well as all combinations of caching
>>> enabled.
>>>
>>> My scans involve a custom intersecting iterator that maintains no more
>>> state than the top key and value.  The scans also do a bit of aggregation
>>> on column qualifiers but the result is small and the number of returned
>>> entries is only in the dozens.  The size of each returned value is only
>>> around 500 bytes.
>>>
>>> Any ideas why this may be happening or where to look for further info?
>>>
>>
>> One known issue is Hadoop's compressor pool.  If you have a tablet with 8
>> files and you query 10 terms, you will allocate 80 decompressors.  Each
>> decompressor uses 128K.  If you have 10 concurrent queries, 10 terms, and
>> 10 files, then you will allocate 1000 decompressors.  These decompressors
>> come from a pool that never shrinks, so if you allocate 1000 at the same
>> time, they will stay around.
>>
>> Try compacting your table down to one file and rerun your query just to
>> see if that helps.  If it does, then that's an important clue.
>>
>>
>>
>>>
>>> Thanks,
>>> Anthony
>>>
>>
>>
>
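Working out the numbers in the compressor-pool explanation quoted above: 10 concurrent
queries x 10 terms x 10 files = 1,000 decompressors, and at 128K each that is roughly
1,000 x 128 KB, or about 125 MB, which the pool keeps holding since it never shrinks.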
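For reference, the per-table cache settings mentioned in the quoted question can also be
flipped programmatically; a minimal sketch using the 1.4 Java API (instance, credentials,
and table name are placeholders):

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;

    public class ToggleCaches {
      public static void main(String[] args) throws Exception {
        // Placeholder instance/credentials -- substitute your own.
        Connector conn = new ZooKeeperInstance("myInstance", "zk1:2181")
            .getConnector("root", "secret".getBytes());

        // Disable the data block cache and the index cache for one table.
        conn.tableOperations().setProperty("mytable", "table.cache.block.enable", "false");
        conn.tableOperations().setProperty("mytable", "table.cache.index.enable", "false");
      }
    }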