Shara Shi 2012-08-27, 09:26
Denny Ye 2012-08-27, 12:04
Shara Shi 2012-08-28, 01:59
Denny Ye 2012-08-28, 03:02
Shara Shi 2012-08-28, 03:19
Re: Reply: Reply: HDFS SINK Performance
Mohit Anchlia 2012-08-28, 04:48
Do you get better performance when you write directly to the cluster? Can
you run some tests that write to the cluster directly and compare?

On Mon, Aug 27, 2012 at 8:19 PM, Shara Shi <[EMAIL PROTECTED]> wrote:

> Hi Denny,
>
> It is 20 MB/min; I have confirmed it.
>
> I sent data with avro-client from my local machine to the Flume agent, and
> I really got 20 MB/min.
>
> So I am trying to find out why.
>
> Regards,
> Shara
>
> From: Denny Ye [mailto:[EMAIL PROTECTED]]
> Sent: 2012-08-28 11:02
> To: [EMAIL PROTECTED]
> Subject: Re: Reply: HDFS SINK Performance
>
> 20MB/min or 20MB/sec?
>
> I suspect the figure may have been stated incorrectly. Can you confirm it?
>
> -Regards
> Denny Ye
>
> 2012/8/28 Shara Shi <[EMAIL PROTECTED]>
>
> Hi Denny,
>
> A throughput of 45 MB/sec would be fine for me.
> But I only got 20 MB per minute.
> What's wrong with my configuration?
>
> Regards,
> Shara
>
> From: Denny Ye [mailto:[EMAIL PROTECTED]]
> Sent: 2012-08-27 20:05
> To: [EMAIL PROTECTED]
> Subject: Re: HDFS SINK Performance
>
> Hi Shara,
>
> You are using MemoryChannel as the repository. I tested it and got
> 45 MB/sec without full GC, using locally updated code. Is this your goal,
> or do you need higher throughput?
>
> -Regards
> Denny Ye
>
> 2012/8/27 Shara Shi <[EMAIL PROTECTED]>
>
> Hi All,
>
> No matter how I tune the parameters of the HDFS sink, I can't get more
> than 20 MB per minute.
> Is that normal? It seems odd to me.
> How can I improve it?
>
> Regards,
> Ruihong Shi
> ==========================================
>
> # or more contributor license agreements. See the NOTICE file
> # distributed with this work for additional information
> # regarding copyright ownership. The ASF licenses this file
> # to you under the Apache License, Version 2.0 (the
> # "License"); you may not use this file except in compliance
> # with the License. You may obtain a copy of the License at
> #
> # http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing,
> # software distributed under the License is distributed on an
> # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> # KIND, either express or implied. See the License for the
> # specific language governing permissions and limitations
> # under the License.
>
> # Define a memory channel called ch1 on collector1
> collector2.channels.ch2.type = memory
> collector2.channels.ch2.capacity=500000
> collector2.channels.ch2.keep-alive=1
>
> # Define an Avro source called avro-source1 on agent1 and tell it
> # to bind to 0.0.0.0:41414. Connect it to channel ch1.
> collector2.sources.avro-source1.channels = ch2
> collector2.sources.avro-source1.type = avro
> collector2.sources.avro-source1.bind = 0.0.0.0
> collector2.sources.avro-source1.port = 41415
> collector2.sources.avro-soruce1.threads = 10
>
> # Define a hdfs sink
> collector2.sinks.hdfs.channel = ch2
> collector2.sinks.hdfs.type= hdfs
> collector2.sinks.hdfs.hdfs.path=hdfs://namenode:8020/user/root/flume/webdata/exec/%Y/%m/%d/%H
> collector2.sinks.hdfs.batchsize=50000
> collector2.sinks.hdfs.runner.type=polling
> collector2.sinks.hdfs.runner.polling.interval = 1
> collector2.sinks.hdfs.hdfs.rollInterval = 120
> collector2.sinks.hdfs.hdfs.rollSize =0
> collector2.sinks.hdfs.hdfs.rollCount = 300000
> collector2.sinks.hdfs.hdfs.fileType=DataStream
> collector2.sinks.hdfs.hdfs.round =true
> collector2.sinks.hdfs.hdfs.roundValue = 10
> collector2.sinks.hdfs.hdfs.roundUnit = minute
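
A note on the configuration quoted above, offered as something to check rather than a diagnosis confirmed in this thread. The key `collector2.sources.avro-soruce1.threads` misspells the source name, so it appears to configure a source that does not exist. The key `collector2.sinks.hdfs.batchsize` also differs from the name the HDFS sink documents, `hdfs.batchSize`, so the sink may be running with its default batch size. A sketch of the corrected keys follows. The numeric values and the `transactionCapacity` line are illustrative assumptions, not settings that anyone in the thread tested, and their behavior may vary by Flume version.

```properties
# Sketch only: values are illustrative assumptions, not tested settings.

# Source name spelled consistently so the thread count applies to avro-source1.
collector2.sources.avro-source1.threads = 10

# Memory channel: transactionCapacity (assumed here) bounds how many events
# one transaction can carry, so it should be at least the sink's batch size.
collector2.channels.ch2.capacity = 500000
collector2.channels.ch2.transactionCapacity = 1000

# HDFS sink: the documented key is hdfs.batchSize (events written per flush).
collector2.sinks.hdfs.hdfs.batchSize = 1000
```

Comparing throughput before and after such a change, alongside the direct-to-cluster test suggested above, would help show whether the sink settings or the cluster itself is the bottleneck.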
Shara Shi 2012-08-28, 05:08
Patrick Wendell 2012-08-28, 05:11
Shara Shi 2012-08-28, 05:42
Brock Noland 2012-08-28, 11:47