Can you elaborate on your use case a bit?
At what point would your business logic decide that the file is complete
(by time or other decision to cut a file as completed)? And then when do
you batch process what the stream has piled up for you?
Writing to HDFS (http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample) is
pretty straightforward, and doing so in a consumer is not a lot of fuss.
Whether you need more layers and overhead all comes back to what you are
trying to accomplish :) You might need to use ZooKeeper or something
similar to coordinate when to run the batch process (depending on how you
kick this off), so the other system knows that what the Consumers were
doing has completed.
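
To make the "when is the file complete" question concrete, here is a rough sketch of a size-or-age roll decision a consumer could consult after each write. Everything in it (the RollPolicy class, its method names, the thresholds) is made up for illustration and is not part of Kafka or Hadoop; the actual HDFS write would happen wherever recordWrite is called.

```java
import java.time.Duration;

/** Hypothetical helper: decides when a consumer should cut its current output file. */
final class RollPolicy {
    private final long maxBytes;
    private final long maxAgeMillis;
    private long openedAtMillis;
    private long bytesWritten;

    RollPolicy(long maxBytes, Duration maxAge, long nowMillis) {
        this.maxBytes = maxBytes;
        this.maxAgeMillis = maxAge.toMillis();
        this.openedAtMillis = nowMillis;
    }

    /** Record the size of a message that was just appended to the current file. */
    void recordWrite(int messageBytes) {
        bytesWritten += messageBytes;
    }

    /** True once the file is large enough or old enough to be treated as complete. */
    boolean shouldRoll(long nowMillis) {
        return bytesWritten >= maxBytes
                || nowMillis - openedAtMillis >= maxAgeMillis;
    }

    /** Call after closing the old file and opening a new one. */
    void reset(long nowMillis) {
        openedAtMillis = nowMillis;
        bytesWritten = 0;
    }
}
```

In the consumer loop you would call recordWrite after appending each message, and when shouldRoll returns true, close the file, signal the batch side (via ZooKeeper or whatever you pick), and call reset. The clock is passed in as a parameter only so the logic is easy to test.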
On Fri, Aug 30, 2013 at 2:18 PM, Mark <[EMAIL PROTECTED]> wrote: