Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume >> mail # dev >> Sending avro data from other languages


Copy link to this message
-
Re: Sending avro data from other languages
Another alternative to consider for cross-platform/language support would
be protocol buffers. That has relatively better tooling and integration
than other similar systems and is used by other projects as well.

Regards,
Arvind Prabhakar

On Thu, Aug 2, 2012 at 7:01 AM, Brock Noland <[EMAIL PROTECTED]> wrote:

> I cannot answer what made us move to Avro. However, I prefer Avro because
> you don't have to build the thrift compiler and you aren't required to do
> code generation.
>
> On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <
> [EMAIL PROTECTED]> wrote:
>
> > It looks to me like this was because of the transceiver I was using.
> >
> > Unfortunately it seems like avro doesn't have a python implementation of
> a
> > transceiver that fits the format expected by netty/avro(in fact it only
> has
> > one transceiver... HTTPTransceiver).
> >
> > To address this, I'm thinking of putting together a thrift source(the
> > legacy source doesn't seem to be usable as it returns nothing, and lacks
> > batching). Does this seem like a reasonable solution to making it
> possible
> > to send data to flume from other languages(and allowing backoff on
> > failure?). Historically, what made us move away from thrift to avro?
> >
> >
> > On 07/30/2012 05:34 PM, Juhani Connolly wrote:
> >
> >> I'm playing around with making a standalone tail client in python(so
> that
> >> I can access inode data) that tracks position in a file and then sends
> it
> >> across avro to an avro sink.
> >>
> >> However I'm having issues with the avro part of this and wondering if
> >> anyone more familiar with it could help.
> >>
> >> I took the flume.avdl file and converted it using "java -jar
> >> ~/Downloads/avro-tools-1.6.3.**jar idl flume.avdl flume.avpr"
> >>
> >> I then run it through a simple test program to see if its sending the
> >> data correctly and it sends from the python client fine, but the sink
> end
> >> OOM's because presumably the wire format is wrong:
> >>
> >> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /
> >> 172.22.114.32:55671 => /172.28.19.112:41414] OPEN
> >> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /
> >> 172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /
> 172.28.19.112:41414
> >> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /
> >> 172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /
> >> 172.22.114.32:55671
> >> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from
> >> downstream.
> >> java.lang.OutOfMemoryError: Java heap space
> >>         at java.util.ArrayList.<init>(**ArrayList.java:112)
> >>         at
> org.apache.avro.ipc.**NettyTransportCodec$**NettyFrameDecoder.
> >> **decodePackHeader(**NettyTransportCodec.java:154)
> >>         at org.apache.avro.ipc.**NettyTransportCodec$**
> >> NettyFrameDecoder.decode(**NettyTransportCodec.java:131)
> >>         at
> org.jboss.netty.handler.codec.**frame.FrameDecoder.callDecode(
> >> **FrameDecoder.java:282)
> >>         at org.jboss.netty.handler.codec.**frame.FrameDecoder.**
> >> messageReceived(FrameDecoder.**java:216)
> >>         at org.jboss.netty.channel.**Channels.fireMessageReceived(**
> >> Channels.java:274)
> >>         at org.jboss.netty.channel.**Channels.fireMessageReceived(**
> >> Channels.java:261)
> >>         at org.jboss.netty.channel.**socket.nio.NioWorker.read(**
> >> NioWorker.java:351)
> >>         at org.jboss.netty.channel.**socket.nio.NioWorker.**
> >> processSelectedKeys(NioWorker.**java:282)
> >>         at org.jboss.netty.channel.**socket.nio.NioWorker.run(**
> >> NioWorker.java:202)
> >>         at java.util.concurrent.**ThreadPoolExecutor$Worker.**
> >> runTask(ThreadPoolExecutor.**java:886)
> >>         at java.util.concurrent.**ThreadPoolExecutor$Worker.run(**
> >> ThreadPoolExecutor.java:908)
> >>         at java.lang.Thread.run(Thread.**java:619)
> >> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /
> >> 172.22.114.32:55671 :> /172.28.19.112:41414] DISCONNECTED
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB