|
Andrey Stepachev
2010-12-30, 08:01
Andrey Stepachev
2010-12-30, 08:13
Todd Lipcon
2010-12-30, 15:38
Andrey Stepachev
2010-12-30, 20:33
Andrey Stepachev
2011-01-11, 14:57
Friso van Vollenhoven
2011-01-11, 15:13
Andrey Stepachev
2011-01-11, 15:24
Xavier Stevens
2011-01-11, 16:43
Andrey Stepachev
2011-01-11, 20:55
Andrey Stepachev
2011-01-12, 06:46
Friso van Vollenhoven
2011-01-12, 09:09
Andrey Stepachev
2011-01-12, 09:59
Friso van Vollenhoven
2011-01-12, 10:51
Andrey Stepachev
2011-01-12, 11:05
Friso van Vollenhoven
2011-01-12, 11:09
Tatsuya Kawano
2011-01-12, 11:44
Friso van Vollenhoven
2011-01-12, 11:58
Stack
2011-01-12, 19:08
Todd Lipcon
2011-01-12, 20:14
Friso van Vollenhoven
2011-01-12, 21:20
Todd Lipcon
2011-01-12, 22:12
Todd Lipcon
2011-01-12, 22:50
Tatsuya Kawano
2011-01-12, 23:25
Todd Lipcon
2011-01-12, 23:30
Tatsuya Kawano
2011-01-13, 01:01
Todd Lipcon
2011-01-13, 06:42
Friso van Vollenhoven
2011-01-13, 08:12
Friso van Vollenhoven
2011-01-13, 08:17
Friso van Vollenhoven
2011-01-13, 08:25
Todd Lipcon
2011-01-13, 16:13
|
-
Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2010-12-30, 08:01
Hi All.

After a heavy load into HBase (single node, non-distributed test system), my HBase Java process grew to 4 GB. On a 6 GB machine that left no room for anything else (disk cache and so on).

Does anybody know what is going on, and how to solve it? What heap size is set on your hosts, and how much RSS does the HBase process actually use?

I haven't seen this before; Tomcat and my other Java apps don't use significantly more memory than -Xmx.

Connection name: pid: 23476 org.apache.hadoop.hbase.master.HMaster start
Virtual Machine: Java HotSpot(TM) 64-Bit Server VM version 17.1-b03
Vendor: Sun Microsystems Inc.
Name: 23476@mars
Uptime: 12 hours 4 minutes
Process CPU time: 5 hours 45 minutes
JIT compiler: HotSpot 64-Bit Server Compiler
Total compile time: 19,223 seconds
------------------------------
Current heap size: 703 903 kbytes
Maximum heap size: 2 030 976 kbytes
Committed memory: 2 030 976 kbytes
Pending finalization: 0 objects
Garbage collector: Name = 'ParNew', Collections = 9 990, Total time spent = 5 minutes
Garbage collector: Name = 'ConcurrentMarkSweep', Collections = 20, Total time spent = 35,754 seconds
------------------------------
Operating System: Linux 2.6.34.7-0.5-xen
Architecture: amd64
Number of processors: 8
Committed virtual memory: 4 403 512 kbytes
Total physical memory: 6 815 744 kbytes
Free physical memory: 82 720 kbytes
Total swap space: 8 393 924 kbytes
Free swap space: 8 050 880 kbytes
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2010-12-30, 08:13
Additional information:

ps shows that my HBase process uses up to 4 GB of RSS.

$ ps --sort=-rss -eopid,rss | head | grep HMaster
PID RSS
23476 3824892
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Todd Lipcon 2010-12-30, 15:38
Hi Andrey,

Any chance you're using hadoop-lzo with CDH3b3? There was a leak in earlier versions of hadoop-lzo that showed up under CDH3b3. You should upgrade to the newest.

If that's not it, let me know, will keep thinking.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2010-12-30, 20:33
No, I'm not using LZO on this host. Only Cloudera Hadoop 0.20.2+320 and HBase 0.89.20100830.

Searching Google gives only hints that the JIT or something else in the JVM can eat memory, but nothing concrete.

pmap shows that some memory blocks grow in size, but I can't work out what they are.

000000004010a000 29564K 28996K 28996K 28996K 0K rwxp [heap]
00007f0d88000000 65492K 132K 132K 132K 0K rwxp [anon]
00007f0d8bff5000 44K 0K 0K 0K 0K ---p [anon]
00007f0d8c000000 51760K 132K 132K 132K 0K rwxp [anon]
00007f0d8f28c000 13776K 0K 0K 0K 0K ---p [anon]
00007f0d90000000 65536K 65536K 65536K 65536K 0K rwxp [anon]
00007f0d9c000000 65488K 65488K 65488K 65488K 0K rwxp [anon]
00007f0dc14af000 170372K 170372K 170372K 170372K 0K rwxp [anon]
00007f0dcbb10000 1877632K 1874624K 1874624K 1874624K 0K rwxp [anon]
00007f0e3e4b0000 33496K 20028K 20028K 20028K 0K rwxp [anon]
00007f0e40566000 52520K 0K 0K 0K 0K rwxp [anon]
... many rows with small numbers skipped
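To put a number on how much of the resident set sits in anonymous mappings, the pmap listing can be totalled with a small filter. The following is only a sketch: the function name `sum_anon_rss_kb` is made up for illustration, and it assumes the column layout of the listing quoted in this thread (third field is the resident size with a `K` suffix, last field is the mapping name). Other pmap variants print different columns.

```shell
# Sum the resident size (in KB) of anonymous mappings in pmap-style output on stdin.
# Assumption: field 3 is the resident size and the last field is the mapping
# name, as in the listing quoted in this thread; adjust for other pmap formats.
sum_anon_rss_kb() {
  awk '$NF == "[anon]" { sub(/K$/, "", $3); total += $3 } END { print total + 0 }'
}
```

Piping the same pmap invocation that produced the listing into `sum_anon_rss_kb` before and after a heavy write load would show whether the growth is in the `[anon]` regions rather than in the Java heap.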
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2011-01-11, 14:57
After starting HBase under JRockit I found the same memory leak.

After the launch:

Every 2,0s: date && ps --sort=-rss -eopid,rss,vsz,pcpu | head    Tue Jan 11 16:49:31 2011

Tue Jan 11 16:49:31 MSK 2011
PID RSS VSZ %CPU
7863 2547760 5576744 78.7

JRockit dump:

Total mapped 5576740KB (reserved=2676404KB)
- Java heap 2048000KB (reserved=1472176KB)
- GC tables 68512KB
- Thread stacks 37236KB (#threads=111)
- Compiled code 1048576KB (used=2599KB)
- Internal 1224KB
- OS 549688KB
- Other 1800976KB
- Classblocks 1280KB (malloced=1110KB #3285)
- Java class data 20224KB (malloced=20002KB #15134 in 3285 classes)
- Native memory tracking 1024KB (malloced=325KB +10KB #20)

After running the MapReduce job, which generates a high write load (~1 hour):

Every 2,0s: date && ps --sort=-rss -eopid,rss,vsz,pcpu | head    Tue Jan 11 17:08:56 2011

Tue Jan 11 17:08:56 MSK 2011
PID RSS VSZ %CPU
7863 4072396 5459572 100

JRockit dump (most of it is not important; I explain why below):

http://paste.ubuntu.com/552820/

7863:
Total mapped 5742628KB +165888KB (reserved=1144000KB -1532404KB)
- Java heap 2048000KB (reserved=0KB -1472176KB)
- GC tables 68512KB
- Thread stacks 38028KB +792KB (#threads=114 +3)
- Compiled code 1048576KB (used=3376KB +776KB)
- Internal 1480KB +256KB
- OS 517944KB -31744KB
- Other 1996792KB +195816KB
- Classblocks 1280KB (malloced=1156KB +45KB #3421 +136)
- Java class data 20992KB +768KB (malloced=20843KB +840KB #15774 +640 in 3421 classes)
- Native memory tracking 1024KB (malloced=325KB +10KB #20)

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
OS *java r x 0x0000000000400000.( 76KB)
OS *java rw 0x0000000000612000 ( 4KB)
OS *[heap] rw 0x0000000000613000.( 478712KB)
INT Poll r 0x000000007fffe000 ( 4KB)
INT Membar rw 0x000000007ffff000.( 4KB)
MSP Classblocks (1/2) rw 0x0000000082ec0000 ( 768KB)
MSP Classblocks (2/2) rw 0x0000000082f80000 ( 512KB)
HEAP Java heap rw 0x0000000083000000.(2048000KB)
rw 0x00007f2574000000 ( 65500KB)
0x00007f2577ff7000.( 36KB)
rw 0x00007f2584000000 ( 65492KB)
0x00007f2587ff5000.( 44KB)
rw 0x00007f258c000000 ( 65500KB)
0x00007f258fff7000 ( 36KB)
rw 0x00007f2590000000 ( 65500KB)
0x00007f2593ff7000 ( 36KB)
rw 0x00007f2594000000 ( 65500KB)
0x00007f2597ff7000 ( 36KB)
rw 0x00007f2598000000 ( 131036KB)
0x00007f259fff7000 ( 36KB)
rw 0x00007f25a0000000 ( 65528KB)
0x00007f25a3ffe000 ( 8KB)
rw 0x00007f25a4000000 ( 65496KB)
0x00007f25a7ff6000 ( 40KB)
rw 0x00007f25a8000000 ( 65496KB)
0x00007f25abff6000 ( 40KB)
rw 0x00007f25ac000000 ( 65504KB)

So the difference is in pieces of memory like this:

rw 0x00007f2590000000 (65500KB)
0x00007f2593ff7000 (36KB)

It looks like HLog allocates this memory (I suspect HLog because the block size is very similar).

If we count these blocks we get the amount of lost memory:

65M * 32 + 132M = 2212M

So it looks like HLog allocates too much memory, and the question is: how to restrict it?
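The block count in this message can be checked with shell arithmetic. This sketch only replays the figures given in the post (32 blocks of roughly 65 MB plus one of roughly 132 MB); the variable names are illustrative.

```shell
# Rough estimate using the figures from the post:
# 32 blocks of ~65 MB each plus one block of ~132 MB.
blocks=32
block_mb=65
extra_mb=132
estimate_mb=$((blocks * block_mb + extra_mb))
echo "estimated memory in these blocks: ${estimate_mb} MB"
```

The result, 2212 MB, matches the total stated in the message.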
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Friso van Vollenhoven 2011-01-11, 15:13
Are you using LZO by any chance? If so, which version?

Friso
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2011-01-11, 15:24
No, I don't use LZO. I even tried removing all native support (i.e. all .so files from the class path) and using Java gzip, but nothing changed.
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Xavier Stevens 2011-01-11, 16:43
Are you using a newer Linux kernel with the new and "improved" memory allocator?

If so, try setting this in hadoop-env.sh:

export MALLOC_ARENA_MAX=<number of cores you want to use>

Maybe start by setting it to 4. You can thank Todd Lipcon if this works for you.

Cheers,

-Xavier
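As a concrete illustration of the suggestion in this message, the line in hadoop-env.sh could look like the sketch below. The value 4 is just the starting point proposed in the message, not a tuned recommendation.

```shell
# Fragment for hadoop-env.sh, as suggested in this message.
# Caps the number of malloc arenas the allocator may create;
# 4 is only the suggested starting value.
export MALLOC_ARENA_MAX=4
```

The variable has to be in the environment of the JVM when it starts, so the daemons need a restart after the change.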
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2011-01-11, 20:55
I tried setting MALLOC_ARENA_MAX=2, but I still see the same issue as in the LZO problem thread. All those 65M blocks are still here, and the JVM continues to eat memory under heavy write load. And yes, I use an "improved" kernel: Linux 2.6.34.7-0.5.
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Andrey Stepachev 2011-01-12, 06:46
My bad. Everything works. Thanks to Todd Lipcon :)
-
Re: Java Committed Virtual Memory significantly larger than Heap Memory
Friso van Vollenhoven 2011-01-12, 09:09
Just to clarify: you fixed it by setting MALLOC_ARENA_MAX=? in hbase-env.sh?

Did you also use -XX:MaxDirectMemorySize=256m?

It would be nice to check that this is a different issue than the leakage with LZO...

Thanks,
Friso
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryAndrey Stepachev 2011-01-12, 09:59
With MALLOC_ARENA_MAX=2.

I tried -XX:MaxDirectMemorySize=256m earlier, but it had no effect (not even OOM exceptions).

Still, it looks like I have exactly the same issue. I have many 64MB anonymous memory blocks (sometimes they are 132MB), and under heavy load the RSS of the JVM process grows rapidly.
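One way to look for the large anonymous blocks described above is to scan /proc/<pid>/maps on Linux. The following is only a minimal sketch under assumptions: the class name AnonMappings, the method largeAnonKb and the 32 MB threshold are made up for illustration and do not come from this thread. It treats a mapping as anonymous when the maps line has no pathname column.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Lists large anonymous mappings from /proc/<pid>/maps lines (Linux only). */
class AnonMappings {

    /**
     * Returns the sizes, in KB, of anonymous mappings that are at least minKb large.
     * A maps line looks like "start-end perms offset dev inode [pathname]".
     */
    static List<Long> largeAnonKb(List<String> mapsLines, long minKb) {
        List<Long> sizes = new ArrayList<>();
        for (String line : mapsLines) {
            String[] fields = line.trim().split("\\s+");
            // Skip malformed lines and mappings with a pathname (files, [heap], [stack]).
            if (fields.length != 5) continue;
            String[] range = fields[0].split("-");
            if (range.length != 2) continue;
            long start = Long.parseUnsignedLong(range[0], 16);
            long end = Long.parseUnsignedLong(range[1], 16);
            long kb = (end - start) / 1024;
            if (kb >= minKb) sizes.add(kb);
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // Pass the pid of the JVM to inspect; defaults to this process.
        String pid = args.length > 0 ? args[0] : "self";
        List<String> lines = Files.readAllLines(Paths.get("/proc", pid, "maps"));
        List<Long> sizes = largeAnonKb(lines, 32 * 1024); // 32 MB, an arbitrary threshold
        System.out.println(sizes.size() + " anonymous mappings >= 32 MB (KB): " + sizes);
    }
}
```

Running it against the HBase pid before and after a heavy write load would show whether the number of such regions keeps growing.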
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-12, 10:51
Thanks.
I went back to hbase 0.89 with 0.1 LZO, which works fine and does not show this issue.

I tried with a newer HBase and LZO version, also with the MALLOC... setting but without max direct memory set, so I was wondering whether you need a combination of the two to fix things (apparently not).

Now I am wondering whether I did something wrong setting the env var. It should just be picked up when it's in hbase-env.sh, right?

Friso
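Whether the variable was actually picked up can be checked from inside a JVM that was started by the same scripts. This is a small sketch, assuming that hbase-env.sh exports the variable before the JVM is launched; the class name ArenaEnvCheck and its describe method are hypothetical helpers, not part of HBase.

```java
import java.util.Map;

/** Reports whether MALLOC_ARENA_MAX is visible in a process environment. */
class ArenaEnvCheck {

    /** Describes the setting found in the given environment map. */
    static String describe(Map<String, String> env) {
        String value = env.get("MALLOC_ARENA_MAX");
        if (value == null) {
            return "MALLOC_ARENA_MAX is not set";
        }
        return "MALLOC_ARENA_MAX=" + value;
    }

    public static void main(String[] args) {
        // System.getenv() returns the environment the JVM was started with.
        System.out.println(describe(System.getenv()));
    }
}
```

If the variable is missing here, the allocator never saw it either, which would explain a setting that appears to have no effect.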
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryAndrey Stepachev 2011-01-12, 11:05
No, I use only the malloc env var. I set it (as suggested before) in hbase-env.sh, and it looks like the process eats less memory (in my case 4.7G vs 3.3G with a 2G heap).
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-12, 11:09
Once I have a moment to play with our dev cluster, I will give this another go.
Thanks,
Friso
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTatsuya Kawano 2011-01-12, 11:44
Hi,

Have you tried the ASF version of hadoop-core? (The one distributed with HBase 0.90RC.)

It doesn't call reinit(), so I'm hoping it will just work fine with the latest hadoop-lzo and other compressors.

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-12, 11:58
No, I haven't. But the Hadoop (mapreduce) LZO compression is not the problem. Compressing the map output using LZO works just fine. The problem is HBase LZO compression. The region server process is the one with the memory leak...
Friso
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryStack 2011-01-12, 19:08
2011/1/12 Friso van Vollenhoven <[EMAIL PROTECTED]>:
> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the problem. Compressing the map output using LZO works just fine. The problem is HBase LZO compression. The region server process is the one with the memory leak...
>

(Sorry for the dumb question, Friso.) But is HBase leaking because we make use of the Compression API in a manner that produces leaks?

Thanks,
St.Ack
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTodd Lipcon 2011-01-12, 20:14
Hey all,
I will be looking into this today :)

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-12, 21:20
Hi,
My guess is indeed that it has to do with using the reinit() method on compressors and making them long-lived instead of throwaway, together with the LZO implementation of reinit(), which magically causes NIO buffer objects not to be finalized and, as a result, not to release their native allocations. It's just a theory and I haven't had the time to properly verify this (unfortunately, I spend most of my time writing application code), but Todd said he will be looking into it further. I browsed the LZO code to see what was going on there, but with my limited knowledge of the HBase code it would be bold to say that this is for sure the case.

It would be my first direction of investigation. I would add some logging to the LZO code where new direct byte buffers are created, to log how often that happens and what size they are, and then redo the workload that shows the leak. Together with some profiling you should be able to see how long it takes for these to get finalized.

Cheers,
Friso
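The logging idea described above could look roughly like the following. This is a sketch only: DirectBufferLog is a hypothetical helper, not code from hadoop-lzo, and it assumes the allocation sites can be changed to call it instead of calling ByteBuffer.allocateDirect directly.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper that counts and logs direct buffer allocations. */
class DirectBufferLog {
    private static final AtomicLong COUNT = new AtomicLong();
    private static final AtomicLong BYTES = new AtomicLong();

    /** Allocates a direct buffer and logs the running totals to stderr. */
    static ByteBuffer allocate(int capacity) {
        ByteBuffer buf = ByteBuffer.allocateDirect(capacity);
        long n = COUNT.incrementAndGet();
        long total = BYTES.addAndGet(capacity);
        System.err.println("direct buffer #" + n + ": " + capacity
                + " bytes (" + total + " bytes requested so far)");
        return buf;
    }

    static long count() { return COUNT.get(); }

    static long bytes() { return BYTES.get(); }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            allocate(64 * 1024);
        }
        System.out.println(count() + " allocations, " + bytes() + " bytes");
    }
}
```

The counters only record what was requested, not what was released, so they show how often buffers are created but not when their native memory is freed. For that, the profiling mentioned above is still needed.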
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTodd Lipcon 2011-01-12, 22:12
Yea, you're definitely on the right track. Have you considered systems programming, Friso? :)

Hopefully should have a candidate patch to LZO later today.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTodd Lipcon 2011-01-12, 22:50
Can someone who is having this issue try checking out the following git branch and rebuilding LZO?

https://github.com/toddlipcon/hadoop-lzo/tree/realloc

This definitely stems one leak of a 64KB directbuffer on every reinit.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
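The contents of that branch are not shown in this thread, so the following is not the actual patch. It is only a sketch of one common way to avoid allocating a fresh direct buffer on every reinit: keep the existing buffer when the requested size is unchanged. The class ReusableBufferCompressor and its methods are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical compressor-like class; not the actual hadoop-lzo code. */
class ReusableBufferCompressor {
    private ByteBuffer buffer;
    private int allocations; // number of direct buffers created so far

    ReusableBufferCompressor(int bufferSize) {
        init(bufferSize);
    }

    /** Prepares the object for reuse; allocates only when the size changed. */
    void reinit(int bufferSize) {
        init(bufferSize);
    }

    private void init(int bufferSize) {
        if (buffer == null || buffer.capacity() != bufferSize) {
            buffer = ByteBuffer.allocateDirect(bufferSize);
            allocations++;
        } else {
            buffer.clear(); // reset position and limit, keep the native memory
        }
    }

    int allocations() { return allocations; }

    public static void main(String[] args) {
        ReusableBufferCompressor c = new ReusableBufferCompressor(64 * 1024);
        for (int i = 0; i < 1000; i++) {
            c.reinit(64 * 1024);
        }
        System.out.println("direct buffers allocated: " + c.allocations());
    }
}
```

With this pattern, a thousand reinit calls at the same size reuse the single buffer created by the constructor, so no native memory waits on finalization.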
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTatsuya Kawano 2011-01-12, 23:25
Hi Friso and everyone,
OK. We don't have to spend time juggling hadoop-core jars anymore, since Todd is working hard on improving hadoop-lzo's behavior.

I think your assumption is correct, but what I was trying to say is that HBase hasn't changed the way it uses Hadoop compressors since the HBase 0.20 release, while Hadoop added reinit() in 0.21. I verified that ASF Hadoop 0.21 and CDH3b3 have reinit(), and that ASF Hadoop 0.20.2 (including its append branch) and CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I thought HBase 0.90 would work fine on ASF Hadoop 0.20.2, because neither of them has reinit().

HBase tries to create an output compression stream for each compression block, and one HFile flush will contain roughly 1000 compression blocks. I think reinit() could get called 1000 times in one flush, and if hadoop-lzo allocates a 64MB block on reinit() (HBase's compression blocks are about 64KB, though), it will become pretty much what you're observing now.

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTodd Lipcon 2011-01-12, 23:30
On Wed, Jan 12, 2011 at 3:25 PM, Tatsuya Kawano <[EMAIL PROTECTED]> wrote:

> Hi Friso and everyone,
>
> OK. We don't have to spend time to juggle hadoop-core jars anymore since
> Todd is working hard on enhancing hadoop-lzo behavior.
>
> I think your assumption is correct, but what I was trying to say was HBase
> doesn't change the way to use Hadoop compressors since HBase 0.20 release
> while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and
> CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch) and
> CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I
> thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of
> them don't have reinit().

Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you have a CDH3b3 cluster for one of the other features included, you need to use a 3b3 client jar as well, which includes the reinit stuff.

> HBase tries to create an output compression stream on each compression
> block, and one HFile flush will contain roughly 1000 compression blocks. I
> think reinit() could get called 1000 times on one flush, and if hadoop-lzo
> allocates 64MB block on reinit() (HBase's compression blocks is about 64KB
> though), it will become pretty much something you're observing now.

Yep - though I think it's only leaking a 64K buffer for each in 0.4.8. And in some circumstances (like all the rigged tests I've attempted to do) these get cleaned up nicely by the JVM. It seems only in pretty large heaps in real workloads does the leak actually end up running away.

-Todd

Todd Lipcon
Software Engineer, Cloudera
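To put the two figures in this exchange side by side, here is a rough back-of-the-envelope sketch in Java. It assumes, as the messages above do, about 1000 compression blocks per HFile flush and one leaked buffer per reinit(); the class and method names are made up for illustration and are not part of HBase or hadoop-lzo.

```java
// Hypothetical helper, not part of HBase or hadoop-lzo: compares the two
// leak sizes discussed in the thread under the "1000 blocks per flush" estimate.
final class FlushLeakEstimate {
    static final long KIB = 1024L;
    static final long MIB = 1024L * KIB;

    /** Native bytes left behind by one flush if every reinit() leaks one buffer. */
    static long leakedBytesPerFlush(long blocksPerFlush, long leakedBytesPerReinit) {
        return blocksPerFlush * leakedBytesPerReinit;
    }

    public static void main(String[] args) {
        long blocks = 1000; // "roughly 1000 compression blocks" per flush (estimate from the thread)
        long with64K = leakedBytesPerFlush(blocks, 64 * KIB);
        long with64M = leakedBytesPerFlush(blocks, 64 * MIB);
        System.out.println("64KB leaked per reinit(): " + with64K + " bytes per flush");
        System.out.println("64MB leaked per reinit(): " + with64M + " bytes per flush");
    }
}
```

With a 64KB leak per reinit() that works out to about 62.5 MiB of native memory per flush, and with a 64MB leak it would be about 62.5 GiB per flush, so the smaller figure fits a process that grows by gigabytes over hours of heavy load rather than in a single flush.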
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTatsuya Kawano 2011-01-13, 01:01
Hi Todd,
> Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you
> have a CDH3b3 cluster for one of the other features included, you need to
> use a 3b3 client jar as well,

Yeah, I saw the number "+737" after the version number. Thanks for clarifying it. (And sorry for the bad suggestion.)

> And
> in some circumstances (like all the rigged tests I've attempted to do) these
> get cleaned up nicely by the JVM. It seems only in pretty large heaps in
> real workloads does the leak actually end up running away.

This issue should be circumstance-dependent, as we don't have direct control over deallocating those buffers. We need them GCed, but they never occupy the Java heap, so they put no pressure on it that would encourage the GC to run.

-Tatsuya
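The point that direct buffers live outside the Java heap can be observed from inside the JVM. The following is only a sketch, and it assumes a Java 7 or later runtime, where the standard `BufferPoolMXBean` exposes a pool named "direct"; that API did not exist on the Java 6 JVMs discussed in this thread. The class name `DirectPoolProbe` is made up for illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical probe (assumes Java 7+): reports native memory held by direct buffers.
final class DirectPoolProbe {
    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if the pool is not exposed. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Same size as the buffer said to leak on each reinit().
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        long after = directBytesUsed();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        // Touch the buffer so it stays reachable until after the measurement.
        System.out.println("buffer capacity:    " + buf.capacity() + " bytes");
    }
}
```

Polling this value next to the heap usage during a load test is one way to see native memory growing while the heap stays well below -Xmx.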
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTodd Lipcon 2011-01-13, 06:42
On Wed, Jan 12, 2011 at 5:01 PM, Tatsuya Kawano <[EMAIL PROTECTED]> wrote:

> > And
> > in some circumstances (like all the rigged tests I've attempted to do) these
> > get cleaned up nicely by the JVM. It seems only in pretty large heaps in
> > real workloads does the leak actually end up running away.
>
> This issue should be circumstance dependent as we don't have direct control
> on deallocating those buffers. We need them GCed but they never occupy the
> Java heap to encourage the GC to run.

Thanks to reflection and use of undocumented APIs, you can actually free() a direct buffer - check out the patch referenced earlier in this thread.

Of course it probably doesn't work on other JVMs... oh well.

-Todd

Todd Lipcon
Software Engineer, Cloudera
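For readers wondering what freeing a direct buffer through reflection can look like, here is a minimal sketch. It is not the code from the patch mentioned above; it is an assumption-laden illustration that tries two JVM-internal routes: `sun.misc.Unsafe.invokeCleaner` (present on Java 9 and later) and the older `cleaner().clean()` calls on Sun/OpenJDK direct buffers. Both are undocumented internals that may be missing or blocked on a given JVM, so the method simply reports whether it succeeded. The class name `DirectBufferFreeSketch` is hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical sketch, not the hadoop-lzo patch: eagerly releases a direct
// buffer's native memory via JVM internals instead of waiting for the GC.
final class DirectBufferFreeSketch {
    /**
     * Returns true if one of the JVM-specific routes released the buffer.
     * The buffer must never be read or written after a successful call.
     */
    static boolean tryFree(ByteBuffer buf) {
        if (buf == null || !buf.isDirect()) {
            return false; // heap buffers are reclaimed by the normal GC
        }
        try {
            // Route 1 (Java 9+): sun.misc.Unsafe.invokeCleaner(ByteBuffer).
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buf);
            return true;
        } catch (Exception | LinkageError e) {
            // Not available on this JVM; fall through to the older route.
        }
        try {
            // Route 2 (older Sun/OpenJDK): directBuffer.cleaner().clean().
            Method cleanerMethod = buf.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buf);
            if (cleaner == null) {
                return false; // e.g. a slice or duplicate that does not own the memory
            }
            Method clean = cleaner.getClass().getMethod("clean");
            clean.setAccessible(true);
            clean.invoke(cleaner);
            return true;
        } catch (Exception | LinkageError e) {
            return false; // internals not reachable on this JVM
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        System.out.println("freed eagerly: " + tryFree(buf));
    }
}
```

The design choice is to fail soft: if neither route works, the buffer is still released eventually when the garbage collector gets around to it, which is the behavior the thread describes as unreliable under large heaps.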
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-13, 08:12
Hey Todd,
Hopefully I can get to this sometime next week. Our NN got corrupted, so we are rebuilding the prod cluster, which means dev is backing our apps for now and I have no environment to give it a go. Stay tuned...

>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)

Well, at least then you get to do your own memory management most of the time...

Friso
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-13, 08:17
Inline...
> Hi Friso and everyone,
>
> OK. We don't have to spend time to juggle hadoop-core jars anymore since Todd is working hard on enhancing hadoop-lzo behavior.
>
> I think your assumption is correct, but what I was trying to say was HBase doesn't change the way to use Hadoop compressors since HBase 0.20 release while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch) and CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of them don't have reinit().

Ah, so my mistake was thinking that using reinit() is something HBase-specific, when it just depends on the Hadoop jar that you drop in the lib folder. It's just that I never saw these problems in mappers and reducers, only in the RS.

@Stack, to answer your question once more: I don't think it's a problem with the way HBase uses the compressors. It's a problem with the (LZO) compressor implementation in combination with the usage pattern that you get when using HBase with particular types of workloads.
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryFriso van Vollenhoven 2011-01-13, 08:25
Hey Todd,
I saw the patch. On what JVM (versions) have you tested this?

(Probably the wrong list for this, but: is there an officially supported JVM version for CDH3?)

Thanks,
Friso
-
Re: Java Commited Virtual Memory significally larged then Heap MemoryTodd Lipcon 2011-01-13, 16:13
On Thu, Jan 13, 2011 at 12:25 AM, Friso van Vollenhoven <[EMAIL PROTECTED]> wrote:

> Hey Todd,
>
> I saw the patch. On what JVM (versions) have you tested this?

I tested on Sun JVM 1.6u22, but the undocumented calls I used have definitely been around for a long time, so it ought to work on any Sun or OpenJDK as far as I know.

> (Probably the wrong list for this, but: is there a officially supported JVM
> version for CDH3?)

We recommend the Sun 1.6 >=u16 but not u18

-Todd

Todd Lipcon
Software Engineer, Cloudera
|