Flume, mail # user - HDFS Sink log rotation on the basis of time of writing

Re: HDFS Sink log rotation on the basis of time of writing
Brock Noland 2012-11-05, 15:30

If you simply do not bucket the data at all, the files will be organized
by the time the events arrived at the sink.
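
For the "look for newly available data" part, here is a minimal Python sketch of how a job driver might pick up finished files from a single unbucketed directory. It assumes the sink's default naming, where in-progress files carry a ".tmp" suffix and closed files end in a fixed-width numeric counter, so sorting by name matches write order. The function name and the sample listing are hypothetical, and the listing itself would come from whatever HDFS client you use.

```python
from typing import Iterable, List, Optional

# Assumption: Flume's default in-use suffix; change it if hdfs.inUseSuffix is set.
IN_USE_SUFFIX = ".tmp"


def new_completed_files(names: Iterable[str],
                        last_processed: Optional[str] = None) -> List[str]:
    """Return closed files in name order that sort after last_processed.

    Hypothetical helper: relies on file names sorting chronologically,
    which holds only while the numeric counter keeps the same width.
    """
    # Skip files the sink is still writing to.
    closed = sorted(n for n in names if not n.endswith(IN_USE_SUFFIX))
    if last_processed is None:
        return closed
    # Plain string comparison is enough under the fixed-width assumption.
    return [n for n in closed if n > last_processed]


if __name__ == "__main__":
    # Made-up directory listing for illustration only.
    listing = [
        "FlumeData.1352129400002.tmp",
        "FlumeData.1352129400000",
        "FlumeData.1352129400001",
    ]
    print(new_completed_files(listing, "FlumeData.1352129400000"))
```

The driver would remember the last name it handed to a MapReduce job and pass it back in as last_processed on the next run.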


On Fri, Nov 2, 2012 at 6:08 PM, Pankaj Gupta <[EMAIL PROTECTED]> wrote:
> Hi,
> Is it possible to organize files written to HDFS into buckets based on the
> time of writing rather than the timestamp in the header? Alternatively, is
> it possible to insert the timestamp injector just before the HDFS Sink?
> My use case is to organize files so that they sort chronologically as
> well as alphabetically by name, and so that there is only one
> file being written to at a time. This will make it easier to look for newly
> available data so that MapReduce jobs can process it.
> Thanks in Advance,
> Pankaj

Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/