Re: Latest Flume test report and problem
Denny Ye 2012-07-25, 09:48
hi Alex,
I am preparing the attachment for you. The long pauses may be the critical problem for us. Do you agree? I look forward to your response, thanks!

-Regards
Denny Ye

2012/7/25 alo.alt <[EMAIL PROTECTED]>

> Hey Denny,
>
> thanks for the report.
>
> Can you please try to rerun with:
>
> JAVA_OPTS="-Xms200m -Xmx200m -Xmn32m -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -Xss128k
> -XX:+UseMembar -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -Xloggc:/var/log/flume/gc.log"
>
> Please attach the gc.log afterwards.
>
> cheers,
> Alex
>
> On 25.07.2012 10:35, Denny Ye wrote:
> > hi all,
> >
> > Last week I tested Flume with ScribeSource
> > (https://issues.apache.org/jira/browse/FLUME-1382) and the HDFS Sink.
> > The detailed conditions and deployment are listed below. Too many
> > 'Full GC' runs hurt throughput, and a large number of events are
> > promoted into the old generation. I have tried some tuning methods,
> > without much effect.
> >
> > Could someone give me feedback or tips for reducing the GC problem?
> > Thanks for your attention.
> >
> > PS: Using Mike's report template at
> > https://cwiki.apache.org/FLUME/flume-ng-performance-measurements.html
> >
> > *Flume Performance Test 2012-07-25*
> >
> > *Overview*
> >
> > The Flume agent was run on its own physical machine in a single JVM. A
> > separate client machine generated load against the Flume box in
> > List<LogEntry> format. Flume stored data onto a 4-node HDFS cluster
> > configured on its own separate hardware. No virtual machines were used in
> > this test.
> >
> > *Hardware specs*
> >
> > CPU: Intel Xeon L5640 2 x quad-core @ 2.27 GHz (12 physical cores)
> > Memory: 16 GB
> > OS: CentOS release 5.3 (Final)
> >
> > *Flume configuration*
> >
> > JAVA Version: 1.6.0_20 (Java HotSpot 64-Bit Server VM)
> > JAVA OPTS: -Xms1024m -Xmx4096m -XX:PermSize=256m -XX:NewRatio=1
> > -XX:SurvivorRatio=5 -XX:InitialTenuringThreshold=15
> > -XX:MaxTenuringThreshold=31 -XX:PretenureSizeThreshold=4096
> > Num. agents: 1
> > Num. parallel flows: 5
> > Source: ScribeSource
> > Channel: MemoryChannel
> > Sink: HDFSEventSink
> > Selector: RandomSelector
> >
> > *Config-file*
> >
> > # list sources, channels, sinks for the agent
> > agent.sources = seqGenSrc
> > agent.channels = mc1 mc2 mc3 mc4 mc5
> > agent.sinks = hdfsSin1 hdfsSin2 hdfsSin3 hdfsSin4 hdfsSin5
> >
> > # define sources
> > agent.sources.seqGenSrc.type = org.apache.flume.source.scribe.ScribeSource
> > agent.sources.seqGenSrc.selector.type = io.flume.RandomSelector
> >
> > # define sinks
> > agent.sinks.hdfsSin1.type = hdfs
> > agent.sinks.hdfsSin1.hdfs.path = /flume_test/data1/
> > agent.sinks.hdfsSin1.hdfs.rollInterval = 300
> > agent.sinks.hdfsSin1.hdfs.rollSize = 0
> > agent.sinks.hdfsSin1.hdfs.rollCount = 1000000
> > agent.sinks.hdfsSin1.hdfs.batchSize = 10000
> > agent.sinks.hdfsSin1.hdfs.fileType = DataStream
> > agent.sinks.hdfsSin1.hdfs.txnEventMax = 1000
> > # ... define sink #2 #3 #4 #5 ...
> >
> > # define channels
> > agent.channels.mc1.type = memory
> > agent.channels.mc1.capacity = 1000000
> > agent.channels.mc1.transactionCapacity = 1000
> > # ... define channel #2 #3 #4 #5 ...
> >
> > # specify the channel each sink and source should use
> > agent.sources.seqGenSrc.channels = mc1 mc2 mc3 mc4 mc5
> > agent.sinks.hdfsSin1.channel = mc1
> > # ... specify sink #2 #3 #4 #5 ...
> >
> > *Hadoop configuration*
> >
> > The HDFS sinks were connected to a 4-node Hadoop cluster running CDH3u1.
> > Each HDFS sink wrote its data to a different path.
> >
> > *Visualization of test setup*
> >
> > https://lh3.googleusercontent.com/dGumq1pu1Wr3Bj8WJmRHOoLWmUlGqxC4wW7_XCNO9R1wuh15LRXaKKxGoccpjBXtgqcdSVW-vtg
> >
> > There are 10 Scribe clients, and each client sends 20 million LogEntry
> > objects to ScribeSource.
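
For reference, the generation sizes implied by the JAVA OPTS in the report (-Xms1024m -Xmx4096m -XX:NewRatio=1 -XX:SurvivorRatio=5) can be estimated with a little arithmetic. The sketch below is only an approximation. It assumes the usual HotSpot meaning of the two ratios (NewRatio = old / young, SurvivorRatio = eden / one survivor space) and ignores the alignment and resizing the JVM applies, so real sizes will differ somewhat. The function name generation_sizes is made up for this illustration and is not part of Flume or the JVM.

```python
def generation_sizes(heap_mb, new_ratio, survivor_ratio):
    """Estimate HotSpot generation sizes in MB for a given heap size.

    Assumes NewRatio = old / young and
    SurvivorRatio = eden / one survivor space (there are two survivor spaces).
    """
    young = heap_mb / (new_ratio + 1)
    old = heap_mb - young
    survivor = young / (survivor_ratio + 2)
    eden = young - 2 * survivor
    return {"young": young, "old": old, "eden": eden, "survivor": survivor}


if __name__ == "__main__":
    # Initial and maximum heap sizes taken from the JAVA OPTS in the report.
    for label, heap_mb in (("-Xms1024m", 1024), ("-Xmx4096m", 4096)):
        sizes = generation_sizes(heap_mb, new_ratio=1, survivor_ratio=5)
        print(label, {name: round(mb, 1) for name, mb in sizes.items()})
```

Under these assumptions, a fully grown 4096 MB heap would have a young generation of about 2048 MB (roughly 1463 MB of eden plus two survivor spaces of about 293 MB each) and an old generation of about 2048 MB.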