Flume, mail # dev - Sending avro data from other languages


Re: Sending avro data from other languages
Juhani Connolly 2012-08-06, 01:21
That would be awesome. I was going to write a thrift source, but Danny's
contribution of the scribe source covers our use case really well (since
our current data source feeds data in the scribe format). I was in the
process of writing a flume transport for it when I realized the avro
shortcomings.

On 08/03/2012 09:49 PM, Brock Noland wrote:
> Yeah I agree. FWIW, I am hoping in a few weeks I will have a little more
> spare time and I was planning on writing the Avro patches to ensure
> languages such as Python, C#, etc could write messages to Flume.
>
> On Fri, Aug 3, 2012 at 1:30 AM, Juhani Connolly <[EMAIL PROTECTED]> wrote:
>
>> On paper it certainly seems like a good solution; it's just unfortunate
>> that some "supported" languages can't actually interface to it. I
>> understand that thrift can be quite a nuisance to deal with at times.
>>
>>
>> On 08/02/2012 11:01 PM, Brock Noland wrote:
>>
>>> I cannot answer what made us move to Avro. However, I prefer Avro because
>>> you don't have to build the thrift compiler and you aren't required to do
>>> code generation.
>>>
>>> On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <[EMAIL PROTECTED]> wrote:
>>>
>>>> It looks to me like this was because of the transceiver I was using.
>>>> Unfortunately it seems like avro doesn't have a python implementation of a
>>>> transceiver that fits the format expected by netty/avro (in fact it only has
>>>> one transceiver... HTTPTransceiver).
>>>>
>>>> To address this, I'm thinking of putting together a thrift source (the
>>>> legacy source doesn't seem to be usable as it returns nothing, and lacks
>>>> batching). Does this seem like a reasonable solution to making it possible
>>>> to send data to flume from other languages (and allowing backoff on
>>>> failure)? Historically, what made us move away from thrift to avro?
>>>>
>>>>
>>>> On 07/30/2012 05:34 PM, Juhani Connolly wrote:
>>>>
>>>>> I'm playing around with making a standalone tail client in python (so that
>>>>> I can access inode data) that tracks position in a file and then sends it
>>>>> across avro to an avro sink.
>>>>>
>>>>> However I'm having issues with the avro part of this and wondering if
>>>>> anyone more familiar with it could help.
>>>>>
>>>>> I took the flume.avdl file and converted it using "java -jar
>>>>> ~/Downloads/avro-tools-1.6.3.jar idl flume.avdl flume.avpr"
>>>>>
>>>>>
>>>>> I then run it through a simple test program to see if it's sending the
>>>>> data correctly and it sends from the python client fine, but the sink end
>>>>> OOMs because presumably the wire format is wrong:
>>>>>
>>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] OPEN
>>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /172.28.19.112:41414
>>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /172.22.114.32:55671
>>>>> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from downstream.
>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>           at java.util.ArrayList.<init>(ArrayList.java:112)
>>>>>           at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:154)
>>>>>           at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:131)
>>>>>           at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:282)
>>>>>           at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
>>>>>           at org.jboss.netty.channel.Channels.fireMessageReceived(
>>>
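
The OutOfMemoryError in ArrayList.<init> under decodePackHeader is consistent with the transceiver mismatch described in the thread. The following is a rough sketch of why, not a confirmed diagnosis. It assumes the Avro Netty codec starts each message pack with two big-endian 32-bit integers (a serial number, then a frame count) followed by length-prefixed frames, and that the Python HTTPTransceiver opens the connection with an ordinary HTTP POST request line. The helper names netty_pack and read_pack_header are hypothetical and exist only for illustration.

```python
import struct


def netty_pack(serial, frames):
    """Frame byte strings in the layout the Netty codec is assumed to expect:
    int32 serial, int32 frame count, then (int32 length, bytes) per frame."""
    out = [struct.pack(">ii", serial, len(frames))]
    for frame in frames:
        out.append(struct.pack(">i", len(frame)))
        out.append(frame)
    return b"".join(out)


def read_pack_header(data):
    """Interpret the first 8 bytes as (serial, frame_count), big-endian."""
    return struct.unpack(">ii", data[:8])


# Assumed first bytes of an HTTP request as an HTTP-based transceiver would send.
http_start = b"POST / HTTP/1.1\r\n"

# A decoder expecting the Netty layout would read "POST" as the serial number
# and " / H" as the number of frames to allocate room for.
serial, frame_count = read_pack_header(http_start)
print("serial read from HTTP bytes:", serial)
print("frame count read from HTTP bytes:", frame_count)

# For comparison, a correctly framed pack carrying two small frames.
pack = netty_pack(1, [b"ab", b"c"])
print("header of a well-formed pack:", read_pack_header(pack))
```

Under these assumptions the frame count read from the HTTP bytes runs into the hundreds of millions, so a server that preallocates a list of that size would run out of heap before any real decoding happens. A Python client would need to emit something like netty_pack's layout (with Avro handshake and call payloads as the frames) to talk to the Netty-based source.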