Flume >> mail # dev >> Sending avro data from other languages
Re: Sending avro data from other languages
Sure, I was just leaving it alone as Danny had tagged you, assuming
there was some reason.

On 08/06/2012 10:31 AM, Hari Shreedharan wrote:
> Juhani,
>
> Since you are familiar with scribe, it would be great if you could review
> and commit the scribe source.
>
> I took a look, but I am not familiar enough with the way scribe works or
> its protocol to do a proper review.
>
>
> Thanks
> Hari
>
> On Sunday, August 5, 2012, Juhani Connolly wrote:
>
>> That would be awesome. I was going to write a thrift source, but Danny's
>> contribution of the scribe source covers our use case really well (our
>> current data source feeds data in the scribe format, and I was in the
>> process of writing a flume transport for it when I realized the avro
>> shortcomings).
>>
>> On 08/03/2012 09:49 PM, Brock Noland wrote:
>>
>> Yeah, I agree. FWIW, I am hoping that in a few weeks I will have a little
>> more spare time, and I was planning on writing the Avro patches to ensure
>> languages such as Python, C#, etc. could write messages to Flume.
>>
>> On Fri, Aug 3, 2012 at 1:30 AM, Juhani Connolly <
>> [EMAIL PROTECTED]> wrote:
>>
>>   On paper it certainly seems like a good solution; it's just unfortunate
>> that some "supported" languages can't actually interface to it. I
>> understand that thrift can be quite a nuisance to deal with at times.
>>
>>
>> On 08/02/2012 11:01 PM, Brock Noland wrote:
>>
>>   I cannot answer what made us move to Avro. However, I prefer Avro because
>> you don't have to build the thrift compiler and you aren't required to do
>> code generation.
>>
>> On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <
>> [EMAIL PROTECTED]> wrote:
>>
>>    It looks to me like this was because of the transceiver I was using.
>>
>> Unfortunately it seems like avro doesn't have a python implementation of a
>> transceiver that fits the format expected by netty/avro (in fact it only has
>> one transceiver... HTTPTransceiver).
>>
>> To address this, I'm thinking of putting together a thrift source (the
>> legacy source doesn't seem to be usable as it returns nothing, and lacks
>> batching). Does this seem like a reasonable solution for making it possible
>> to send data to flume from other languages (and allowing backoff on
>> failure)? Historically, what made us move away from thrift to avro?
>>
>>
>> On 07/30/2012 05:34 PM, Juhani Connolly wrote:
>>
>>    I'm playing around with making a standalone tail client in python (so
>> that I can access inode data) that tracks position in a file and then sends
>> it across avro to an avro sink.
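
[Editorial sketch] The message does not show the tail client itself, so the following is only a rough illustration of how inode data and a saved offset could be combined to track position in a file. The class name TailState and its behaviour on rotation or truncation (restart from offset 0) are assumptions, not the author's implementation.

```python
import os


class TailState:
    """Remembers which file (by inode) was being read and how far."""

    def __init__(self, path):
        self.path = path
        self.inode = None
        self.offset = 0

    def read_new_lines(self):
        """Return the complete lines appended since the previous call."""
        st = os.stat(self.path)
        # A different inode suggests the file was rotated or replaced;
        # a size below the saved offset suggests it was truncated.
        if st.st_ino != self.inode or st.st_size < self.offset:
            self.inode = st.st_ino
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Consume only up to the last newline, so a partially written
        # line is left for the next call.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].splitlines()
```

Persisting inode and offset between runs, and handing the returned lines to whatever transport is used, are left out of the sketch.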
>>
>> However I'm having issues with the avro part of this and wondering if
>> anyone more familiar with it could help.
>>
>> I took the flume.avdl file and converted it using "java -jar
>> ~/Downloads/avro-tools-1.6.3.jar idl flume.avdl flume.avpr"
>>
>>
>> I then ran it through a simple test program to see if it's sending the
>> data correctly, and it sends from the python client fine, but the sink end
>> OOMs, presumably because the wire format is wrong:
>>
>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] OPEN
>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /172.28.19.112:41414
>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /172.22.114.32:55671
>> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from downstream.
>> java.lang.OutOfMemoryError: Java heap space
>>            at java.util.ArrayList.<init>(ArrayList.java:112)
>>            at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:154)
>>            at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:131)
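
[Editorial sketch] The trace ends in ArrayList.<init>, called from decodePackHeader, which is consistent with the decoder reading a very large element count from the first bytes it received. The sketch below shows how that could happen under two assumptions that are not confirmed in this thread: that the pack header is two big-endian 32-bit integers (serial, then frame count), and that the Python HTTPTransceiver's request began with the bytes "POST / HTTP/1.1".

```python
import struct

# ASSUMPTION: the first bytes the Python HTTP client put on the wire.
http_request_start = b"POST / HTTP/1.1\r\n"

# ASSUMPTION: the netty decoder treats the first 8 bytes as
# (int32 serial, int32 frame count), both big-endian.
serial, frame_count = struct.unpack(">ii", http_request_start[:8])

print("serial      =", serial)       # bytes b"POST" read as an integer
print("frame_count =", frame_count)  # bytes b" / H" read as an integer
```

Under these assumptions the decoder would try to size a list for roughly 540 million frames, which could plausibly exhaust the heap before any payload is read.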