Flume >> mail # user >> sleep() in script doesn't work when called by exec Source


sleep() in script doesn't work when called by exec Source
Hi,

I am testing with apache-flume-1.4.0-bin.
I wrote a naive Python script for the exec source that throttles its output by calling sleep().
However, sleep() does not seem to take effect when the script is run by the exec source.
Does anyone know why, or is there a simple way to throttle without writing a custom source?

Flume config:
agent.sources = src1
agent.sources.src1.type = exec
agent.sources.src1.command = read-file-throttle.py

read-file-throttle.py:
#!/usr/bin/python

import time

count=0
pre_time=time.time()
with open("apache.log") as infile:
    for line in infile:
        line = line.strip()
        print line
        count += 1
        if count % 50000 == 0:
            now_time = time.time()
            diff = now_time - pre_time
            if diff < 10:
                #print "sleeping %s seconds ..." % (diff)
                time.sleep(diff)
                pre_time = now_time
Thank you very much.

Best Regards,
Yongkun Wang
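
Two things in the script above may explain the behaviour, independent of Flume. First, time.sleep(diff) pauses for the time already spent rather than the time remaining in the 10-second window, and pre_time is only reset when a sleep happens. Second, when stdout is a pipe rather than a terminal, Python buffers output in blocks, so lines can reach the reader in bursts that hide the pauses. The following is a minimal sketch of one possible fix, not a confirmed diagnosis: it assumes Python 3, and the function name throttled_copy, its parameters, and the command-line argument are hypothetical.

```python
#!/usr/bin/env python3
import sys
import time


def throttled_copy(lines, out, batch_size=50000, min_interval=10.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Copy lines to out, emitting at most batch_size lines per min_interval seconds.

    clock and sleep are injectable (an assumption of this sketch) so the
    pacing can be checked without actually waiting.
    """
    window_start = clock()
    count = 0
    for line in lines:
        out.write(line.strip() + "\n")
        count += 1
        if count % batch_size == 0:
            # Flush so the process reading the pipe sees the batch before the pause.
            out.flush()
            elapsed = clock() - window_start
            if elapsed < min_interval:
                # Sleep for the time remaining in the window, not the time used.
                sleep(min_interval - elapsed)
            # Start the next window whether or not a sleep was needed.
            window_start = clock()
    out.flush()
    return count


# Hypothetical invocation: read-file-throttle.py apache.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as infile:
        throttled_copy(infile, sys.stdout)
```

With this version the exec source command would need the log path as an argument, for example read-file-throttle.py apache.log. Running the interpreter with python -u is another way to avoid block buffering, at the cost of a write per line.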