Kafka >> mail # user >> python/avro example producer


Re: python/avro example producer
I'll try to give an unbiased answer :)

I believe my library has more comprehensive coverage of the APIs for
0.7.x (e.g., MultiProduce, MultiFetch), as well as more test coverage
(including integration tests). However, it's newer and has had less
real-world usage, so there may be undiscovered bugs.

I also have plans to start implementing the 0.8 protocol in the next few
weeks.

-David

On 12/19/12 10:53 AM, Joseph Crotty wrote:
> Thanks for the insights. Another developer where I work started on a Python
> producer, previously attached, a few weeks ago initially using your
> kafka-python lib, but for some reason switched to pykafka. Is one preferred
> over the other?
>
>
> On Wed, Dec 19, 2012 at 8:36 AM, David Arthur <[EMAIL PROTECTED]> wrote:
>
>> The fundamental unit of Kafka is a Message. A Message contains a few bytes
>> of metadata (a magic number, a crc32 checksum, some attributes) and a
>> payload of bytes. For the most part these details are hidden from the
>> end-user, so all you have to concern yourself with is sending the actual
>> data (payload). In Java the payload is simply a byte array; in Python it's
>> just a string.
>>
>> I'd suggest reading through the Quick Start
>> (http://kafka.apache.org/quickstart.html) and Design
>> (http://kafka.apache.org/design.html)
>> if you're really interested in how things work.
>>
>> As for sending data with my Python producer, just check out the README on
>> the project page:
>> https://github.com/mumrah/kafka-python#send-a-message-to-a-topic
>>
>> Cheers
>>
>>
>> On 12/19/12 10:21 AM, Joseph Crotty wrote:
>>
>>> What exactly does a "payload" mean? Sorry, fairly new to Kafka. Is there
>>> a payload method that needs to be called by the python producer?
>>>
>>> Thanks for any insights. Attached some sample code if you have time to
>>> lead us to the water! Probably something simple we are missing.
>>>
>>> Joe
>>>
>>>
>>> On Wed, Dec 19, 2012 at 7:46 AM, David Arthur <[EMAIL PROTECTED]> wrote:
>>>
>>>      Do you mean a Python producer that sends Avro payloads?
>>>
>>>      There are a couple of Python clients floating around, including
>>>      mine: https://github.com/mumrah/kafka-python
>>>
>>>      The Avro package is on PyPI
>>>      (http://pypi.python.org/pypi/avro/1.7.3), with official docs and
>>>      a getting-started guide for Python on the Avro project page
>>>      (http://avro.apache.org/docs/1.7.3/gettingstartedpython.html)
>>>
>>>      Good luck!
>>>
>>>
>>>      On 12/18/12 10:38 PM, Joseph Crotty wrote:
>>>
>>>          Anyone have a python/avro producer that slurps up records from
>>>          a flat file (i.e., mix of string and binary data) and publishes
>>>          to Kafka they would be willing to share?
>>>
>>>          Starting to think this might be a whole lot faster to do in
>>>          Java, but maybe someone has a Python solution already in hand.
>>>
>>>          Joe
>>>
>>>
>>>
>>>
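
For anyone wondering what "payload" means in practice, here is a rough sketch of the Message layout described in the thread: a magic number, some attributes, a crc32 checksum, and the payload bytes. The exact field order and sizes (4-byte length, 1-byte magic, 1-byte attributes, 4-byte crc32 of the payload) are my assumption about the 0.7.x wire format, and encode_message / decode_message are made-up names, not functions from kafka-python or pykafka. A client library normally does this framing for you. Note that under Python 3 the payload would be bytes rather than a str.

```python
import struct
import zlib

# Assumed: a magic value of 1 means an attributes byte follows the magic byte.
MAGIC = 1


def encode_message(payload: bytes, attributes: int = 0) -> bytes:
    """Frame a payload as: length | magic | attributes | crc32 | payload.

    Hypothetical helper for illustration only; the layout is an assumption.
    """
    crc = zlib.crc32(payload) & 0xFFFFFFFF  # checksum covers the payload only
    body = struct.pack(">BBI", MAGIC, attributes, crc) + payload
    # The length prefix counts everything after itself.
    return struct.pack(">I", len(body)) + body


def decode_message(data: bytes) -> bytes:
    """Reverse encode_message and verify the checksum; returns the payload."""
    (length,) = struct.unpack_from(">I", data, 0)
    magic, attributes, crc = struct.unpack_from(">BBI", data, 4)
    payload = data[10:4 + length]  # 4-byte length + 6-byte header = offset 10
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        raise ValueError("crc32 mismatch")
    return payload


if __name__ == "__main__":
    framed = encode_message(b"hello kafka")
    print(len(framed), decode_message(framed))
```

The point is that the producer only ever hands over the payload; an Avro-encoded record would simply be the bytes passed in as payload here.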
 