Flume, mail # user - Using Python and Flume to store avro data


Re: Using Python and Flume to store avro data
Bart Verwilst 2012-11-08, 21:02


Hi Hari,

Just to be absolutely sure: can I write to Avro files by using this? If
so, I will try out a snapshot of 1.3 tomorrow and start playing with
it. ;)

Kind regards,

Bart

Hari Shreedharan wrote on 08.11.2012 20:06:

> No, I am talking about:
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=bc1928bc2e23293cb20f4bc2693a3bc262f507b3 [2]
>
> This will be in the next release, which will be out soon.
>
> Thanks,
> Hari
>
> --
> Hari Shreedharan
>
> On Thursday, November 8, 2012 at 10:57 AM, Bart Verwilst wrote:
>
>> Hi Hari,
>>
>> Are you talking about ipc.HTTPTransceiver
>> ( http://nullege.com/codes/search/avro.ipc.HTTPTransceiver [1] )? This was
>> the class I tried before I noticed it wasn't supported by Flume-1.2 :)
>>
>> I assume the HTTP/JSON source will also allow for avro to be received?
>>
>> Kind regards,
>>
>> Bart
>>
>> Hari Shreedharan wrote on 08.11.2012 19:51:
>>
>>> The next release, Flume-1.3.0, adds support for an HTTP source, which
>>> will allow you to send data to Flume via HTTP/JSON (the representation
>>> of the data is pluggable, but a JSON representation is the default).
>>> You could use this to write data to Flume from Python, which I believe
>>> has good HTTP and JSON support.
>>>
>>> Thanks,
>>> Hari
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Thursday, November 8, 2012 at 10:45 AM, Bart Verwilst wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been spending quite a few hours trying to push avro data to Flume
>>>> so I can store it on HDFS, all of this with Python.
>>>> It seems like something that is impossible for now, since the only way
>>>> to push avro data to Flume is by using the deprecated thrift bindings,
>>>> which look pretty cumbersome to get working.
>>>> I would like to know what's the best way to import avro data into Flume
>>>> with Python? Maybe Flume isn't the right tool and I should use something
>>>> else? My goal is to have multiple python workers pushing data to HDFS,
>>>> which (by means of Flume in this case) consolidates it all into 1 file
>>>> there.
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks!
>>>>
>>>> Bart
 

Links:
------
[1] http://nullege.com/codes/search/avro.ipc.HTTPTransceiver
[2] https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=bc1928bc2e23293cb20f4bc2693a3bc262f507b3
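
For anyone following the thread, here is a rough sketch of what sending events from a Python worker to the HTTP source Hari describes might look like. It is not taken from the thread. It assumes the source's default JSON handler accepts a JSON array of events, each with a "headers" map and a string "body". The function names, the header values, and the URL in the usage note are all made up for illustration. Whether the data ends up as Avro files on HDFS depends on the sink configuration, which this sketch does not cover.

```python
import json
import urllib.request


def build_flume_payload(records, headers=None):
    """Wrap each record as one event: a headers map plus a string body.

    Assumption: the HTTP source is using its default JSON handler, which
    expects a JSON array of {"headers": {...}, "body": "..."} objects.
    """
    events = []
    for record in records:
        events.append({
            "headers": dict(headers or {}),
            # The body must be a string, so the record is serialized here.
            "body": json.dumps(record, sort_keys=True),
        })
    return json.dumps(events)


def post_events(url, records, headers=None, timeout=10):
    """POST a batch of records to the (hypothetical) HTTP source URL."""
    data = build_flume_payload(records, headers).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A worker could then call something like post_events("http://flume-host:5140", [{"id": 1}], {"source": "worker-1"}), where the host and port are placeholders for wherever the HTTP source is bound. Batching several records per request keeps the number of HTTP round trips down when many workers are pushing at once.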