Re: Spooling and duplicated files
When were these two files added?

This recommendation is made so that Flume does not attempt to process files
that are currently being written to.

It is possible that the original file had already been processed and
renamed (with the suffix appended to the original name) before the new file
with the same name was added to the spooling directory. In that case the two
names never clash, so both files are processed.
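
If you want to avoid relying on that timing, here is a rough sketch of one way to drop files into the spooling directory so that each one gets a unique name and is never visible to Flume while it is still being written. It is only an illustration, not something from the Flume code base: the function name drop_into_spool and the staging_dir parameter are made up for this example, and it assumes the staging directory sits on the same filesystem as the spooling directory so that the rename is atomic.

```python
import os
import shutil
import time
import uuid


def drop_into_spool(src_path, spool_dir, staging_dir):
    """Copy src_path into spool_dir under a unique name.

    Hypothetical helper for illustration only. Assumes staging_dir is
    outside spool_dir but on the same filesystem.
    """
    base = os.path.basename(src_path)

    # A millisecond timestamp plus a short random token keeps the name
    # unique even if the same source file name is delivered twice.
    unique_name = "%s.%d.%s" % (
        base,
        int(time.time() * 1000),
        uuid.uuid4().hex[:8],
    )

    # Write the complete copy in the staging directory first, so the
    # spooling directory never contains a partially written file.
    staged = os.path.join(staging_dir, unique_name)
    shutil.copyfile(src_path, staged)

    # Move the finished file into the spooling directory in one step.
    # On POSIX this rename is atomic within a single filesystem.
    final = os.path.join(spool_dir, unique_name)
    os.rename(staged, final)
    return final
```

Calling drop_into_spool("/var/log/app.log", spool_dir, staging_dir) twice would leave two differently named files in the spooling directory (the paths here are placeholders), so a second delivery can never collide with a file Flume is still reading.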

If Flume is currently working on a file and you write to this file while
things are in progress, you will encounter an exception that will stop the


>  From User Guide I read:
>  "This channel expects that only immutable, uniquely named files are
> dropped in the spooling directory. If duplicate names are used, or files
> are modified while being read, the source will fail with an error message.
> For some use cases this may require adding unique identifiers (such as a
> timestamp) to log file names when they are copied into the spooling
> directory."
>  But I added two files with the same name and both were processed
> correctly. Is that ok? What am I misunderstanding?
>  Thanks.