Re: Sending huge binary files via Kafka?
Magnus,

This sounds like an interesting idea. But in the current state of Kafka
this would mean that I would have to extend the Kafka Producer and
Consumer classes on my own to support that kind of message/file transfer,
wouldn't it?
This is a bit too much effort for me at the moment.
But of course it would be nice if Kafka Producers or Consumers
supported zero-copy file transfer natively.
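
To illustrate the zero-copy idea discussed here, the following is a minimal Python sketch. It is not Kafka client code. It uses the standard library's socket.sendfile(), which relies on os.sendfile() where the platform provides it and otherwise falls back to ordinary send() calls. The helper name send_file_zero_copy, the demo payload, and the local socket pair standing in for a network connection are assumptions made for illustration only.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the file at `path` over `sock`; return the number of bytes sent.

    socket.sendfile() uses os.sendfile() where available, so the payload is
    not read into a Python-level buffer first. On other platforms it falls
    back to regular send() calls.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo() -> bytes:
    # Hypothetical stand-in for a binary file sitting on disk.
    payload = b"binary-file-contents" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name

    # A connected local socket pair stands in for a producer/broker link.
    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(sender, path)
        sender.close()  # signals end-of-stream to the receiving side
        received = bytearray()
        while chunk := receiver.recv(65536):
            received.extend(chunk)
        assert sent == len(payload)
        return bytes(received)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

The sender never holds the whole file in memory, which is the property that would matter for multi-GB files. The receiving side in this sketch still reads into a buffer, so it only demonstrates the sending half of the idea.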

At the moment I'm thinking more about sending a message to the consumer
with a URL of the huge binary file, and letting the consumer fetch the
file from that URL directly. That way we would use Kafka only for sending
a notification that a new file exists at the source. The real file
transfer would bypass the Kafka queue.
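
A rough sketch of this notification approach in Python, using only the standard library, could look as follows. The actual Kafka send and receive calls are deliberately left out. The message schema (url, size_bytes, sha256) and the function names build_notification and handle_notification are hypothetical, and the checksum field is an added assumption rather than part of the proposal above.

```python
import hashlib
import json
import pathlib
import tempfile
import urllib.request


def build_notification(url: str, size_bytes: int, sha256_hex: str) -> bytes:
    """Producer side: encode a small 'new file available' message.

    The field names are a hypothetical schema, not a Kafka convention.
    """
    return json.dumps(
        {"url": url, "size_bytes": size_bytes, "sha256": sha256_hex}
    ).encode("utf-8")


def handle_notification(message: bytes, dest_path: str) -> str:
    """Consumer side: fetch the file named in the message, streaming to disk.

    Reads in 1 MiB chunks so memory use does not grow with the file size.
    Returns the SHA-256 hex digest; raises ValueError on a mismatch.
    """
    info = json.loads(message.decode("utf-8"))
    digest = hashlib.sha256()
    with urllib.request.urlopen(info["url"]) as response, \
            open(dest_path, "wb") as out:
        while chunk := response.read(1024 * 1024):
            digest.update(chunk)
            out.write(chunk)
    if digest.hexdigest() != info["sha256"]:
        raise ValueError("checksum mismatch for " + info["url"])
    return digest.hexdigest()


def demo() -> bool:
    # A local file:// URL stands in for the real source of the binary file.
    payload = b"\x00\x01video-bytes" * 1000
    with tempfile.TemporaryDirectory() as tmp:
        source = pathlib.Path(tmp) / "source.bin"
        source.write_bytes(payload)
        message = build_notification(
            source.as_uri(), len(payload), hashlib.sha256(payload).hexdigest()
        )
        # The notification stays far below the 1000000-byte default limit.
        assert len(message) < 1000000
        dest = pathlib.Path(tmp) / "copy.bin"
        handle_notification(message, str(dest))
        return dest.read_bytes() == payload
```

In a real setup the bytes returned by build_notification would be the payload of the Kafka message, and handle_notification would run inside the consumer, writing to RiakCS instead of a local file.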

Andreas Maier

On 05.09.13 12:28, "Magnus Edenhill" <[EMAIL PROTECTED]> wrote:

>It would be possible to modify an existing client implementation to use
>sendfile(2) to pass message contents from/to the filesystem rather than
>(pre-)allocating send & receive buffers, thus providing zero-copy file
>transfer over Kafka.
>I believe this is how the broker is implemented.
>
>Regards,
>Magnus
>
>
>2013/9/4 Neha Narkhede <[EMAIL PROTECTED]>
>
>> The message size limit is imposed to protect the brokers and consumers
>> from running out of memory. The consumer does not have support for
>> streaming a message and has to allocate memory to be able to read the
>> largest message.
>> You could try compressing the files, but I'm not sure that will save
>> enough space to make it feasible for Kafka usage.
>>
>> Thanks,
>> Neha
>> On Sep 4, 2013 5:29 AM, "Maier, Dr. Andreas" <[EMAIL PROTECTED]> wrote:
>>
>> > Hello,
>> >
>> > I have a proposal for an architecture on my desk, where people want
>> > to store huge binary files (like images and videos up to a size of
>> > several GB) in RiakCS. But the connection to RiakCS is supposed to
>> > work through Apache Kafka, so there will be a Kafka producer fetching
>> > the files from the source and sending them to a Kafka-RiakCS consumer.
>> > Now when I look into the Kafka configuration options
>> > (http://kafka.apache.org/08/configuration.html)
>> > I see that message.max.bytes is 1000000 by default, which would be
>> > much too small for huge binary files like videos.
>> > So my questions are:
>> > Can this size be increased to also support messages with a size
>> > of several GB? Has anyone already tried this? Are Kafka brokers,
>> > consumers and producers able to handle such a message size?
>> > Will setting such a huge limit on the message size have any impact
>> > on the performance of transporting smaller messages?
>> > Or should we rather let our Kafka producers bypass Kafka when
>> > they encounter such huge binary files at the source, and
>> > let them store these files directly in RiakCS?
>> >
>> > Best Regards,
>> >
>> > Andreas Maier
>> >
>> >
>>