Depending on your level of comfort, you can do one of the following:
1. Use Python to fetch your data and then send the events via HTTP to the
Flume HTTP Source
2. Use Java to create a custom source in Flume that handles the data
fetching and then puts it in a channel so that it can be funneled into
the sinks
Option 1 would be easier for you since you can get the data in Python and
just stream it down via HTTP to Flume.
Option 2 will be more involved since you need to write code that
communicates with external endpoints.
*Author and Instructor for the Upcoming Book and Lecture Series*
*Massive Log Data Aggregation, Processing, Searching and Visualization with
Open Source Software*
On 18 July 2013 13:38, Sunita Arvind <[EMAIL PROTECTED]> wrote:
> Hello friends,
> I am new to Flume and have written a Python script to fetch some data from
> social media. My response is JSON. I am seeking help on the following issues:
> 1. I am finding it hard to make Python and Flume talk. Is it just my
> ignorance, or is it indeed a long route? AFAIK, I need to understand the
> Thrift API, Avro, etc. to achieve this. I also read about pipes. Would this
> be a simple implementation?
> 2. I am equally comfortable (uncomfortable) in Java. Hence I am wondering
> if it's better to re-write my application in Java so that I can easily
> integrate it with Flume. Are there any advantages to having a Java
> application, given that all of Hadoop is Java?
> 3. I need to schedule the agent to run on a daily basis. Which of the
> above approaches would help me achieve this easily?
> 4. Going by this -
> http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%[EMAIL PROTECTED]%3E
> it looks like we need to manually clean up disk space even with Flume. I am
> not clear on the advantages I would have with Flume over using a simple
> cron job to do the task. I can manually write statements like "hadoop fs
> -put <location of output file on local> <location on hdfs>" in the cron job.
> Appreciate your help and guidance.