Flume >> mail # user >> Constant Traffic on port 35872


James Stewart 2013-01-17, 01:06
Mike Percy 2013-01-17, 01:18
James Stewart 2013-01-17, 01:53
Mike Percy 2013-01-17, 03:37
James Stewart 2013-01-17, 04:02
Alexander Alten-Lorenz 2013-01-17, 07:00

Re: Constant Traffic on port 35872
You may want to check the property flume.reporter.poller.period in
flume-conf.xml or flume-site.xml; if it is not defined, the default
value is 2000 (ms).
If you cannot find the property in flume-conf.xml, add it to
flume-site.xml and set its value to 60000 (60 sec).
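
For reference, a rough sketch of what the flume-site.xml entry could look like. This assumes the usual Hadoop-style property layout that flume-conf.xml uses; the description text is my own wording, and I have not verified that the change actually reduces the traffic on 0.9.4.

```xml
<?xml version="1.0"?>
<!-- flume-site.xml: site-specific overrides of the defaults in flume-conf.xml -->
<configuration>
  <property>
    <name>flume.reporter.poller.period</name>
    <!-- Value is in milliseconds: 60000 ms = 60 sec (default is 2000) -->
    <value>60000</value>
    <!-- Description is illustrative wording, not copied from the shipped config -->
    <description>Interval between report polls, in milliseconds.</description>
  </property>
</configuration>
```

The node most likely needs a restart to pick up the new value.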

I am not sure which version of flume-og you are using, but in the one I
am using (v0.9.4), the related class seems to have been deprecated.

- JS

On 1/17/13 4:00 PM, Alexander Alten-Lorenz wrote:
> That depends on the architecture, since the nodes are configured through the master's web UI. The master regularly calls the in-memory config and spreads it around. This is needed for HA, for example.
> Flume 1.x and up has a different architecture.
>
> - Alex
>
> On Jan 17, 2013, at 5:02 AM, James Stewart <[EMAIL PROTECTED]> wrote:
>
>> Yeah, I’ve just realised that it’s *exactly* the same data that is returned when you connect to http://my.flume.node:35862 (for monitoring etc.). Even the order in which the metrics are sent is the same.
>>
>> So it seems that the node is generating this configuration data and pumping it back to the master every 1-2 seconds. This produces ~40-80Kb/sec of largely unnecessary traffic per node, which soon adds up over a WAN.
>>
>> I can understand why this config data would be sent back to the master occasionally but I don’t understand why it does so every 1-2 seconds, ignoring flume.config.heartbeat.period.
>>
>> From: Mike Percy [mailto:[EMAIL PROTECTED]]
>> Sent: Thursday, 17 January 2013 2:37 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: Constant Traffic on port 35872
>>
>> I doubt it's the Thrift RPC layer. It's most likely the app.
>>
>> On Wed, Jan 16, 2013 at 5:53 PM, James Stewart <[EMAIL PROTECTED]> wrote:
>> I thought it was only ‘heartbeats’ that were supposed to go via port 35872, so I reduced flume.config.heartbeat.period on the nodes to 60 sec. According to the master it’s only seeing heartbeats every 60 sec now, and yet I still get constantly spammed with data on port 35872 from every node.
>>
>> It does look like metric collection or config reporting of some kind, like it’s reporting the configuration of the sources/sinks and even data about the JVM:
>>
>> ............rt.starttime....Thu Jan 17 11:47:07 EST 2013...     rt.vmname...!Java HotSpot(TM) 64-Bit Server VM....name...(pn-opsynxsr0202.aus.optiver.com.jvm-Info....rt.vmversion....16.3-b01....rt.vmvendor....Sun Microsystems Inc.
>> [EMAIL PROTECTED]...........
>>
>> But it’s just the same data over and over again every second. This traffic is travelling across a WAN and with a lot of nodes it’s a significant enough amount of data to be a problem.
>>
>> I don’t know much about Java, but could this be something to do with Thrift?
>>
>>
>> From: Mike Percy [mailto:[EMAIL PROTECTED]]
>> Sent: Thursday, 17 January 2013 12:19 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: Constant Traffic on port 35872
>>
>> I know next to nothing about Flume OG but if I had to guess I'd say it's either a heartbeat or metrics collection. Why do you want it to stop?
>>
>> On Wed, Jan 16, 2013 at 5:06 PM, James Stewart <[EMAIL PROTECTED]> wrote:
>> Hello all,
>>
>> I’m using flume 0.9.4 – before anybody mentions it, we aren’t in a position to upgrade at the moment due to custom decorators + sinks.
>>
>> I’m seeing constant traffic from my various flume nodes back to my master on port 35872. Even after increasing my timeout period to 60 sec and disabling all custom sources/sinks/decorators, I am still constantly receiving packets from all of my nodes back to my master. I have included a dump of the tcp packets below – I receive this same traffic from every node every 1-2 sec.
Jeong-shik Jang / [EMAIL PROTECTED]
Gruter, Inc., R&D Team Leader
www.gruter.com
Enjoy Connecting
James Stewart 2013-01-17, 21:46