Flume >> mail # dev >> Sending avro data from other languages


Juhani Connolly 2012-07-30, 08:34
Juhani Connolly 2012-08-02, 04:06
Brock Noland 2012-08-02, 14:01
Re: Sending avro data from other languages
On paper it certainly seems like a good solution; it's just unfortunate
that some "supported" languages can't actually interface with it. I
understand that thrift can be quite a nuisance to deal with at times.

On 08/02/2012 11:01 PM, Brock Noland wrote:
> I cannot answer what made us move to Avro. However, I prefer Avro because
> you don't have to build the thrift compiler and you aren't required to do
> code generation.
>
> On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <
> [EMAIL PROTECTED]> wrote:
>
>> It looks to me like this was because of the transceiver I was using.
>>
>> Unfortunately it seems like avro doesn't have a python implementation of a
>> transceiver that fits the format expected by netty/avro (in fact it only has
>> one transceiver... HTTPTransceiver).
>>
>> To address this, I'm thinking of putting together a thrift source (the
>> legacy source doesn't seem to be usable as it returns nothing, and lacks
>> batching). Does this seem like a reasonable solution to making it possible
>> to send data to flume from other languages (and allowing backoff on
>> failure)? Historically, what made us move away from thrift to avro?
>>
>>
>> On 07/30/2012 05:34 PM, Juhani Connolly wrote:
>>
>>> I'm playing around with making a standalone tail client in python (so that
>>> I can access inode data) that tracks position in a file and then sends it
>>> across avro to an avro sink.
>>>
>>> However I'm having issues with the avro part of this and wondering if
>>> anyone more familiar with it could help.
>>>
>>> I took the flume.avdl file and converted it using "java -jar
>>> ~/Downloads/avro-tools-1.6.3.jar idl flume.avdl flume.avpr"
>>>
>>> I then ran it through a simple test program to see if it's sending the
>>> data correctly, and it sends from the python client fine, but the sink end
>>> OOMs, presumably because the wire format is wrong:
>>>
>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] OPEN
>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /172.28.19.112:41414
>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /172.22.114.32:55671
>>> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from downstream.
>>> java.lang.OutOfMemoryError: Java heap space
>>>          at java.util.ArrayList.<init>(ArrayList.java:112)
>>>          at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:154)
>>>          at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:131)
>>>          at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:282)
>>>          at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
>>>          at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
>>>          at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
>>>          at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:351)
>>>          at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:282)
>>>          at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:202)
>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>          at java.lang.Thread.run(Thread.java:619)
>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] DISCONNECTED
>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] UNBOUND
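
Editor's note: the OOM in decodePackHeader in the quoted trace is consistent with the transceiver mismatch described in the thread. The sketch below is a rough illustration, not the actual client code. It assumes the Netty transport frames each message as a 4-byte big-endian serial, a 4-byte buffer count, and then length-prefixed buffers, and that the python HTTPTransceiver's request begins with "POST / HTTP/1.1". The function names are hypothetical, and the Avro handshake and call payload that would go inside the buffers are not covered.

```python
import struct


def encode_netty_frame(serial, buffers):
    """Frame buffers in the layout assumed for Avro's Netty transport:
    int32 serial, int32 buffer count, then (int32 length + bytes) per buffer.
    All integers are big-endian."""
    parts = [struct.pack(">ii", serial, len(buffers))]
    for buf in buffers:
        parts.append(struct.pack(">i", len(buf)))
        parts.append(buf)
    return b"".join(parts)


def decode_pack_header(data):
    """Read the first 8 bytes the way a frame decoder would:
    returns (serial, buffer_count)."""
    return struct.unpack(">ii", data[:8])


# What the Netty side would see if an HTTP request arrived instead of a frame.
# The request line here is an assumption about what the HTTP client sends.
http_start = b"POST / HTTP/1.1\r\n"
serial, buffer_count = decode_pack_header(http_start)
print(hex(serial), buffer_count)  # 0x504f5354 539959368
```

Under these assumptions the bytes "POST" are read as the serial and " / H" as the buffer count, so the server would try to preallocate a list with roughly 540 million slots, which would explain the heap exhaustion in the ArrayList constructor. A client that wants to talk to the Netty server directly would need to produce frames like the output of encode_netty_frame rather than HTTP requests.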
Brock Noland 2012-08-03, 12:49
Juhani Connolly 2012-08-06, 01:21
Hari Shreedharan 2012-08-06, 01:31
Juhani Connolly 2012-08-06, 04:26
Arvind Prabhakar 2012-08-02, 16:59
Brock Noland 2012-07-30, 13:31