Accumulo >> mail # user >> RE: EXTERNAL: Re: Failing Tablet Servers


Cardon, Tejay E 2012-09-20, 21:26
Jim Klucar 2012-09-20, 22:44
Cardon, Tejay E 2012-09-20, 22:50
Jim Klucar 2012-09-20, 22:56
Adam Fuchs 2012-09-20, 23:21
Cardon, Tejay E 2012-09-21, 14:12
John Vines 2012-09-21, 14:25
Cardon, Tejay E 2012-09-21, 14:35
Re: EXTERNAL: Re: Failing Tablet Servers
tejay,

Here's a good article about Java using native memory.

http://www.ibm.com/developerworks/linux/library/j-nativememory-linux/index.html#how

On Fri, Sep 21, 2012 at 10:35 AM, Cardon, Tejay E
<[EMAIL PROTECTED]> wrote:
> Gotcha.  So if I’m using Java maps, then my tserver_opts needs to be
> tserver.memory.maps + extra for the rest of the tserver, because the memory
> map will be taken from the overall memory allocated to the tserver.  But if
> I’m using native maps, then I need far less tserver memory because the map
> memory is not deducted from the tserver.  Is that correct?
>
>
>
> Thanks,
> tejay
>
>
>
> From: John Vines [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 21, 2012 8:26 AM
> To: [EMAIL PROTECTED]
> Subject: Re: EXTERNAL: Re: Failing Tablet Servers
>
>
>
> memory.maps is what defines the size of the in memory map. When using native
> maps, that space does not come out of the heap size. But when using
> non-native maps, it comes out of the heap space.
>
> I think the issue Eric is trying to get at is the fickleness of the Java
> garbage collector. When you give a process that much heap, there is that much
> more data it can hold before it needs to garbage collect. However, that
> also means that when it does garbage collect, it's collecting a LOT more, which
> can result in poor performance.
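
One way to check how much time a JVM is losing to garbage collection is to read its GC counters through the standard `java.lang.management` beans. The sketch below is only an illustration: the class name `GcSnapshot` is made up, and it reports on the JVM it runs in, so pointing it at a tablet server would need remote JMX or GC logging (for example the `-verbose:gc` JVM flag), which is assumed here and not something the thread describes.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: summarizes GC activity of the JVM it runs in. */
final class GcSnapshot {

    /** Approximate total milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** GC time divided by JVM uptime; a value that keeps climbing suggests GC thrashing. */
    static double gcFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        // Print one line per collector, then the overall share of uptime spent in GC.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMillis=" + gc.getCollectionTime());
        }
        System.out.println("GC share of uptime: " + gcFraction());
    }
}
```

Sampling these numbers twice, a few seconds apart, shows whether collections are frequent and long, which is the pattern described above for very large heaps.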
>
> John
>
>
>
> On Fri, Sep 21, 2012 at 10:12 AM, Cardon, Tejay E <[EMAIL PROTECTED]>
> wrote:
>
> Jim, Eric, and Adam,
>
> Thanks.  It sounds like you’re all saying the same thing.  Originally I was
> doing each key/value as its own mutation, and it was blowing up much faster
> (probably due to the volume/overhead of the mutation objects themselves).
> I’ll try refactoring to break them up into something in between.  My keys
> are small (<25 bytes), and my values are empty, but I’ll aim for ~1,000
> key/values per mutation and see how that works out for me.
>
>
>
> Eric,
>
> I was under the impression that the memory.maps setting was not very
> important when using native maps.  Apparently I’m mistaken there.  What does
> this setting control when native maps are in use?  And, in general, what’s
> the proper balance between tserver_opts and tserver.memory.maps?
>
>
>
> With regard to “Finished gathering information from 24 servers in 27.45
> seconds”: do you have any recommendations for how to chase down the
> bottleneck?  I’m pretty sure I’m having GC issues, but I’m not sure what is
> causing them on the server side.  I’m sending a fairly small number of very
> large mutation objects, which I’d expect to be a moderate problem for the
> GC, but not a huge one.
>
>
>
> Thanks again to everyone for being so responsive and helpful.
>
>
>
> Tejay Cardon
>
>
>
>
>
> From: Eric Newton [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 21, 2012 8:03 AM
>
>
> To: [EMAIL PROTECTED]
> Subject: EXTERNAL: Re: Failing Tablet Servers
>
>
>
> A few items noted from your logs:
>
>
>
> tserver.memory.maps.max = 1G
>
>
>
> If you are giving your processes 10G, you might want to make the map larger,
> say 6G, and then reduce the JVM by 6G.
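
The arithmetic behind that suggestion can be written out as a small sketch. It assumes, as explained elsewhere in this thread, that a native in-memory map sits outside the Java heap while a Java map sits inside it. The class and method names (`TserverMemoryBudget`, `heapBytes`) are hypothetical and are not part of Accumulo.

```java
/** Hypothetical back-of-the-envelope memory split for one tablet server process. */
final class TserverMemoryBudget {
    static final long GIB = 1024L * 1024L * 1024L;

    /**
     * Returns the JVM heap size (in bytes) that keeps the whole process within totalBytes.
     * With native maps the map is allocated outside the heap, so it is subtracted.
     * With Java maps the map lives inside the heap, so the heap gets the full budget.
     */
    static long heapBytes(long totalBytes, long mapBytes, boolean nativeMaps) {
        if (mapBytes < 0 || mapBytes > totalBytes) {
            throw new IllegalArgumentException("map size must be between 0 and the total budget");
        }
        return nativeMaps ? totalBytes - mapBytes : totalBytes;
    }

    public static void main(String[] args) {
        long total = 10 * GIB; // overall memory given to the process, as in the message
        long map = 6 * GIB;    // suggested in-memory map size
        System.out.println("native maps: heap GiB = " + heapBytes(total, map, true) / GIB);
        System.out.println("Java maps:   heap GiB = " + heapBytes(total, map, false) / GIB);
    }
}
```

With the numbers from the message, a 10G budget and a 6G native map leave a 4G heap, which is the "reduce the JVM by 6G" step.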
>
>
>
> Write-Ahead Log recovery complete for rz<;zw== (8 mutations applied, 8000000
> entries created)
>
>
>
> You are creating rows with 1M columns.  This is ok, but you might want to
> write them out more incrementally.
>
>
>
> WARN : Running low on memory
>
>
>
> That's pretty self-explanatory.  I'm guessing that the very large mutations
> are causing the tablet servers to run out of memory before they are held
> waiting for minor compactions.
>
>
>
> Finished gathering information from 24 servers in 27.45 seconds
>
>
>
> Something is running slow, probably due to GC thrashing.
>
>
>
> WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]
>
>
>
> And there's a server crashing, probably due to an OOM condition.
>
>
>
> Send smaller mutations.  Maybe keep it to 200K column updates.  You can
> still have 1M wide rows, just send 5 mutations.
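
A minimal sketch of that batching idea, using only the Java standard library: the column updates for one row are split into consecutive chunks, and each chunk would then be turned into one `Mutation` for the same row and handed to a `BatchWriter`. That Accumulo wiring is left out here so the example stays self-contained, and the helper name `MutationChunker` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits one wide row's column updates into smaller batches. */
final class MutationChunker {

    /** Returns consecutive sublists of columns, each holding at most maxPerMutation entries. */
    static <T> List<List<T>> chunk(List<T> columns, int maxPerMutation) {
        if (maxPerMutation <= 0) {
            throw new IllegalArgumentException("maxPerMutation must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < columns.size(); start += maxPerMutation) {
            int end = Math.min(start + maxPerMutation, columns.size());
            // subList is a view over the original list, so no column data is copied.
            chunks.add(columns.subList(start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> columns = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            columns.add(i); // stand-in for one column update
        }
        // Each chunk would become one mutation for the same row.
        System.out.println("mutations needed: " + chunk(columns, 200_000).size());
    }
}
```

The row stays 1M columns wide; only the size of each mutation sent to the tablet server changes.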
>
>
>
> -Eric
>
>
>
> On Thu, Sep 20, 2012 at 5:05 PM, Cardon, Tejay E <[EMAIL PROTECTED]>
Eric Newton 2012-09-21, 14:32
Cardon, Tejay E 2012-09-21, 14:50