

Re: Spooling and duplicated files
One after the other. But you are right: I am using deletePolicy=immediate, and I thought Flume would keep track of the processed files in memory. Thanks.

From: Israel Ekpo <[EMAIL PROTECTED]>
Reply-To: Flume User List <[EMAIL PROTECTED]>
Date: Wednesday, 22 May 2013 08:33
To: Flume User List <[EMAIL PROTECTED]>
Subject: Re: Spooling and duplicated files

When were these two files added?

This recommendation is made so that Flume does not attempt to process files that are currently being written to.

It is possible that the original file had already been processed and renamed (with the suffix appended to the original name) before the new file with the same name was added to the spooling directory.

If Flume is working on a file and you write to that file while it is being processed, you will encounter an exception that will stop the agent.
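For reference, the behaviour discussed in this thread depends on how the spooling directory source is configured. The fragment below is only a sketch: the agent name `a1`, source name `src1`, channel name `ch1`, and the directory path are placeholders, not values taken from the thread. `fileSuffix` and `deletePolicy` are the source's documented properties.

```properties
# Hypothetical names: a1 (agent), src1 (source), ch1 (channel); the path is a placeholder.
a1.sources = src1
a1.sources.src1.type = spooldir
a1.sources.src1.channels = ch1
a1.sources.src1.spoolDir = /var/spool/flume/incoming

# Suffix appended to a file once it has been fully ingested (default: .COMPLETED).
a1.sources.src1.fileSuffix = .COMPLETED

# never     = keep the ingested file, renamed with fileSuffix
# immediate = delete the file as soon as it has been ingested
a1.sources.src1.deletePolicy = never
```

With `deletePolicy = immediate` nothing is left behind under the original name, which is probably why a second file with the same name, added after the first one was finished, did not trigger an error here. With `never`, a reused name can collide with the already renamed file, so unique names remain the safer choice.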

On 21 May 2013 09:11, ZORAIDA HIDALGO SANCHEZ <[EMAIL PROTECTED]> wrote:
From the User Guide I read:

"This channel expects that only immutable, uniquely named files are dropped in the spooling directory. If duplicate names are used, or files are modified while being read, the source will fail with an error message. For some use cases this may require adding unique identifiers (such as a timestamp) to log file names when they are copied into the spooling directory."

But I added two files with the same name and both were processed correctly. Is that OK? What am I misunderstanding?

Thanks.
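The User Guide's suggestion quoted above (adding a unique identifier such as a timestamp to each file name) can be sketched as follows. This is a minimal illustration, not part of Flume: the function name `drop_into_spool` and the separate staging directory are assumptions made for the example. The file is copied to the staging directory first and then renamed into the spooling directory, so the source never sees a file that is still being written.

```python
import os
import shutil
import time
import uuid
from pathlib import Path


def drop_into_spool(src, spool_dir, staging_dir):
    """Copy src into spool_dir under a unique name (hypothetical helper)."""
    src = Path(src)
    spool_dir = Path(spool_dir)
    staging_dir = Path(staging_dir)

    # A timestamp plus a short random token keeps names unique even when
    # two files with the same base name arrive within the same second.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    token = uuid.uuid4().hex[:8]
    unique_name = f"{src.stem}.{stamp}.{token}{src.suffix}"

    # Write the copy outside the spooling directory, so a partially
    # written file is never visible to the spooling source.
    staged = staging_dir / unique_name
    shutil.copyfile(src, staged)

    # Renaming is atomic on POSIX when both directories are on the same
    # filesystem; the file appears in the spooling directory complete.
    final = spool_dir / unique_name
    os.rename(staged, final)
    return final
```

Calling `drop_into_spool("app.log", "/var/spool/flume/incoming", "/var/spool/flume/staging")` twice would leave two differently named, complete files in the spooling directory (the paths here are placeholders). The staging directory should sit on the same filesystem as the spooling directory, otherwise the rename is not atomic.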

________________________________

This message is intended exclusively for its addressee. We only send and receive email on the basis of the terms set out at:
http://www.tid.es/ES/PAGINAS/disclaimer.aspx