Flume, mail # user - Using Python and Flume to store avro data


Bart Verwilst 2012-11-08, 18:45
Hari Shreedharan 2012-11-08, 18:51
Bart Verwilst 2012-11-08, 18:57
Hari Shreedharan 2012-11-08, 19:06
Bart Verwilst 2012-11-08, 21:02
Hari Shreedharan 2012-11-08, 21:12
Bart Verwilst 2012-11-08, 21:34
Hari Shreedharan 2012-11-08, 21:50
Bart Verwilst 2012-11-08, 22:49
Brock Noland 2012-11-09, 01:30
Juhani Connolly 2012-11-09, 01:46
Camp, Roy 2012-11-12, 19:52
Re: Using Python and Flume to store avro data
Andrew Jones 2012-11-13, 09:28
We also use Thrift to send from multiple languages, but have written a
custom source to accept the messages.

Writing a custom source was quite easy. Start by looking at the code
for ThriftLegacySource and AvroSource.

Andrew
On 12 November 2012 19:52, Camp, Roy <[EMAIL PROTECTED]> wrote:

> We use Thrift to send from Python, PHP & Java. Unfortunately, with
> Flume-NG you must use the legacyThrift source, which works well but does
> not send a confirmation/ack back to the app. We have found that failures
> usually result in a connection exception, which lets us reconnect and
> retry, so we have virtually no data loss. Everything downstream from that
> localhost Flume instance (once written to the file channel) is E2E safe.
>
> Roy
>
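The reconnect-and-retry behaviour Roy describes can be sketched roughly as follows. This is only an illustration over a plain TCP socket, not Roy's actual client and not a Thrift client: the class name `ReconnectingSender`, its parameters, and the backoff policy are all assumptions made for the example. Because the source sends no ack, a successful `sendall` here only means the bytes were handed to the local socket, not that Flume stored them.

```python
import socket
import time


class ReconnectingSender:
    """Send byte payloads over TCP, reconnecting and retrying on connection errors.

    Hypothetical helper for illustration; not part of Flume or Thrift.
    """

    def __init__(self, host, port, max_attempts=3, backoff_seconds=0.5,
                 connect=socket.create_connection):
        self.address = (host, port)
        self.max_attempts = max_attempts
        self.backoff_seconds = backoff_seconds
        self._connect = connect  # injectable so the logic can be tested offline
        self._sock = None

    def _close(self):
        # Drop the current connection so the next attempt opens a fresh one.
        if self._sock is not None:
            try:
                self._sock.close()
            except OSError:
                pass
            self._sock = None

    def send(self, payload):
        """Return the number of attempts used; raise after max_attempts failures."""
        last_error = None
        for attempt in range(1, self.max_attempts + 1):
            try:
                if self._sock is None:
                    self._sock = self._connect(self.address)
                self._sock.sendall(payload)
                return attempt
            except OSError as exc:
                # Connection-level failure: discard the socket and retry.
                last_error = exc
                self._close()
                if attempt < self.max_attempts:
                    time.sleep(self.backoff_seconds * attempt)
        raise ConnectionError(
            "giving up after %d attempts" % self.max_attempts
        ) from last_error
```

A caller would create one sender per worker, pointed at the local Flume agent, and call `send(payload)` for each serialized message. A retry after a partial write can deliver a message twice, so downstream consumers should tolerate duplicates.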
>
> -----Original Message-----
> From: Juhani Connolly [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, November 08, 2012 5:46 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Using Python and Flume to store avro data
>
> Hi Bart,
>
> We send data from Python to the scribe source and it works fine. We had
> everything set up in Scribe before, which made the switchover simple. If you
> don't mind the extra overhead of HTTP, go for that, but if you want to keep
> things to a minimum, using the scribe source can be viable.
>
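For the HTTP route Juhani mentions, a minimal Python sketch might look like the one below. It assumes a Flume HTTP source using its default JSON handler, which accepts a JSON array of events, each with string `headers` and a string `body`. The function names and the URL in the usage note are hypothetical, and binary Avro payloads would need a different handler or a text-safe encoding, which this sketch does not cover.

```python
import json
import urllib.request


def build_flume_payload(bodies, headers=None):
    """Encode string bodies as a JSON array of Flume events.

    Assumes the HTTP source's default JSON handler: each event is an object
    with a "headers" map (string values) and a string "body".
    """
    headers = headers or {}
    events = [{"headers": dict(headers), "body": body} for body in bodies]
    return json.dumps(events).encode("utf-8")


def post_events(url, bodies, headers=None, timeout=5):
    """POST a batch of events and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_flume_payload(bodies, headers),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With an agent listening on a hypothetical `http://localhost:5140`, a worker could call `post_events("http://localhost:5140", ["line one", "line two"], {"worker": "w1"})`. Sending several events per request spreads the HTTP overhead across the batch.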
> You can't send data to the Avro source because the Python support in Avro
> is missing the appropriate encoder (I can't remember what it was; I'd have
> to check over the code again).
>
> On 11/09/2012 03:45 AM, Bart Verwilst wrote:
> > Hi,
> >
> > I've been spending quite a few hours trying to push Avro data to Flume
> > so I can store it on HDFS, all of this with Python.
> > It seems to be impossible for now, since the only way
> > to push Avro data to Flume is through the deprecated Thrift binding,
> > which looks pretty cumbersome to get working.
> > I would like to know: what's the best way to import Avro data into
> > Flume with Python? Maybe Flume isn't the right tool and I should use
> > something else? My goal is to have multiple Python workers pushing
> > data to HDFS, which (by means of Flume in this case) consolidates
> > it all in one file there.
> >
> > Any thoughts?
> >
> > Thanks!
> >
> > Bart
> >
> >
>
>
Bart Verwilst 2012-11-16, 10:54