Re: Cross-site Kafka installation
Pablo Barrera González 2013-01-22, 20:27
Hi Joel

Thanks for the hints. It turned out to be a configuration error at the
operating system level.

We are using Debian Linux. Kafka uses the setsockopt call with SO_SNDBUF to
set the buffer size (socket.send.buffer). The operating system then sets the
real buffer size to min(socket.send.buffer, net.core.wmem_max).
net.core.wmem_max was set to a really small value (131071), which is the
Debian Linux default. However, if you delegate the buffer size to the
operating system (basically, you don't call setsockopt with SO_SNDBUF), the
operating system handles the buffer size using the configuration in
net.ipv4.tcp_wmem. The default values on our hosts were 4096, 16384 and
4194304, meaning that a buffer can automatically grow up to 4MB (if there is
enough memory available).
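
In case it helps anyone hitting the same thing, here is a minimal Java
sketch of the two behaviours described above (not Kafka code; the host and
port are just placeholders for any reachable TCP endpoint):

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class SendBufferCheck {
        public static void main(String[] args) throws Exception {
            // Explicit SO_SNDBUF, like Kafka's socket.send.buffer: on Linux
            // the kernel clamps the request to net.core.wmem_max (131071 on
            // our Debian hosts), so asking for 4MB has no effect until
            // wmem_max itself is raised.
            Socket explicit = new Socket();
            explicit.setSendBufferSize(4 * 1024 * 1024);
            explicit.connect(new InetSocketAddress("example.org", 80));
            System.out.println("explicit SO_SNDBUF: " + explicit.getSendBufferSize());
            explicit.close();

            // No setsockopt at all: the kernel starts from the
            // net.ipv4.tcp_wmem default (16384 here) and can autotune the
            // buffer up to the maximum (4194304 here) when memory allows.
            Socket autotuned = new Socket();
            autotuned.connect(new InetSocketAddress("example.org", 80));
            System.out.println("OS-managed SO_SNDBUF: " + autotuned.getSendBufferSize());
            autotuned.close();
        }
    }

getSendBufferSize() only reports what the kernel granted at that point; in
the OS-managed case the buffer can still grow later, which is exactly what
we were missing with the explicit, clamped value.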

Having such a small window was the main reason for the low performance.
With the new configuration we increased performance by an order of
magnitude.

The interesting thing is that, at least on Linux, if you don't call
setsockopt the default buffer works really well. I didn't notice any
difference between calling setsockopt or not (after fixing the
configuration of the machine). So why call setsockopt at all?

Anyway, I see value in having a configuration option to turn the call to
setsockopt on or off. I will send a patch.
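
Something along these lines is what I have in mind (just a sketch; treating
a non-positive configured size as "leave it to the OS" is my own
illustration here, not the final shape of the patch):

    import java.net.Socket;
    import java.net.SocketException;

    public final class SocketBufferConfig {
        // Apply SO_SNDBUF only when a positive size is configured; a value
        // <= 0 skips setsockopt entirely so the kernel can autotune the
        // buffer via net.ipv4.tcp_wmem.
        public static void applySendBuffer(Socket socket, int configuredBytes)
                throws SocketException {
            if (configuredBytes > 0) {
                socket.setSendBufferSize(configuredBytes);
            }
        }
    }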

Regards,

Pablo

PS: We are using 0.7.1 and Linux 2.6.32.
2013/1/22 Joel Koshy <[EMAIL PROTECTED]>
>
> We do mirroring across data-centers (but in the same continent). You should
> basically set a high fetch size and socket buffer size in such scenarios.
>
> In general, you should set a high value for the socket buffer size on the
> consumer configuration (socket.buffersize) and the source cluster's broker
> configuration (socket.send.buffer).
>
> Assuming you are using the high-level consumer, the fetch size (fetch.size)
> should be higher than the consumer's socket buffer size. Note that the
> socket buffer size configurations are a hint to the underlying platform's
> networking code. If you enable trace logging, you can check the actual
> receive buffer size and determine whether the setting in the OS networking
> layer also needs to be adjusted. Likewise, you will need to use higher
> connection/session timeouts for zookeeper and set your offset commit
> intervals to be fairly large.
>
> Thanks,
>
> Joel
>
>
> On Mon, Jan 21, 2013 at 2:04 PM, Pablo Barrera González <
> [EMAIL PROTECTED]> wrote:
>
> > Hello
> >
> > In my enterprise we are deploying a cross-site installation of Kafka. One
> > of the Kafka clusters is located in the USA and one consumer is in Europe.
> > Does anybody have experience in such an environment? Any comments on the
> > configuration and best practices?
> >
> > Thanks in advance
> >
> > Pablo
> >