Flume >> mail # user >> Using Python and Flume to store avro data


Re: Using Python and Flume to store avro data


Hi Hari,

Just to be absolutely sure, you can write to avro files
by using this? If so, I will try out a snapshot of 1.3 tomorrow and
start playing with it. ;)

Kind regards,

Bart

Hari Shreedharan wrote on 08.11.2012 20:06:

> No, I am talking about:
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=bc1928bc2e23293cb20f4bc2693a3bc262f507b3 [2]
>
> This will be in the next release which will be out soon.
>

> Thanks,
> Hari
>
> --
> Hari Shreedharan
>
> On Thursday, November 8, 2012 at 10:57 AM, Bart Verwilst wrote:
>
>> Hi Hari,
>>

>> Are you talking about ipc.HTTPTransceiver
>> ( http://nullege.com/codes/search/avro.ipc.HTTPTransceiver [1] )?
>> This was the class I tried before I noticed it wasn't supported by
>> Flume-1.2 :)

>>
>> I assume the http/json source will also allow for avro to be received?
>>
>> Kind regards,
>>
>> Bart
>>
>> Hari Shreedharan wrote on 08.11.2012 19:51:
>>
>>> The next release of Flume, 1.3.0, adds support for an HTTP source,
>>> which will allow you to send data to Flume via HTTP/JSON (the
>>> representation of the data is pluggable, but a JSON representation is
>>> the default). You could use this to write data to Flume from Python,
>>> which I believe has good HTTP and JSON support.
>>>

>>> Thanks,
>>> Hari
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Thursday, November 8, 2012 at 10:45 AM, Bart Verwilst wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been spending quite a few hours trying to push avro data to
>>>> Flume so I can store it on HDFS, all of this with Python.
>>>> It seems like something that is impossible for now, since the only
>>>> way to push avro data to Flume is through the deprecated thrift
>>>> bindings, which look pretty cumbersome to get working.
>>>> I would like to know what's the best way to import avro data into
>>>> Flume with Python? Maybe Flume isn't the right tool and I should use
>>>> something else? My goal is to have multiple python workers pushing
>>>> data to HDFS, which (by means of Flume in this case) consolidates
>>>> all of this into one file there.
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks!

>>>>
>>>> Bart
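
Hari's suggestion of posting JSON to the Flume 1.3.0 HTTP source can be sketched roughly as follows. This is a minimal sketch, not something taken from the thread: it assumes the source uses its default JSON representation, which, as far as I know, expects a JSON array of events that each carry a "headers" map and a string "body". The URL, the port, the "source" header and the function names are hypothetical. Note that this sends JSON, not Avro; whether the data ends up as Avro files on HDFS depends on how the sink is configured, which the thread does not settle.

```python
import json
import urllib.request


def build_payload(records):
    """Encode records as a JSON array of Flume-style events.

    Assumption: each event is {"headers": {...}, "body": "<string>"}.
    The "source" header below is a hypothetical example value.
    """
    events = []
    for record in records:
        events.append({
            "headers": {"source": "python-worker"},
            # The body is sent as a string, so the record is serialized first.
            "body": json.dumps(record),
        })
    return json.dumps(events).encode("utf-8")


def post_events(url, records, timeout=10):
    """POST the encoded events to an HTTP endpoint and return the status code."""
    request = urllib.request.Request(
        url,
        data=build_payload(records),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical usage, assuming an agent listening on localhost:5140:
# post_events("http://localhost:5140", [{"user": "bart", "clicks": 3}])
```

Each Python worker could call post_events independently, leaving it to the Flume agent to consolidate the events on the HDFS side.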
 

Links:
------
[1] http://nullege.com/codes/search/avro.ipc.HTTPTransceiver
[2] https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=bc1928bc2e23293cb20f4bc2693a3bc262f507b3