Flume >> mail # user >> Delete first line of a file


ZORAIDA HIDALGO SANCHEZ 2013-09-12, 16:47
Paul Chavez 2013-09-12, 16:58
Re: Delete first line of a file
There is no out-of-the-box command to remove the first line from an event body, but you could write one yourself and plug it in.
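
As a rough illustration of the logic such a self-written command would need, here is a minimal sketch in plain Java. The class and method names (`FirstLineStripper`, `stripFirstLine`) are made up for this example and are not part of Flume or the morphline library; wiring the logic into a real morphline command is not shown. It assumes the event body is in an ASCII-compatible encoding such as UTF-8 and that lines end in `\n` or `\r\n`.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not a Flume or morphline API. */
final class FirstLineStripper {

    private FirstLineStripper() {}

    /**
     * Returns a copy of the body without its first line.
     * Scans for the first '\n' byte, so both "\n" and "\r\n" endings work.
     * If the body contains no '\n' at all, the whole body is treated as
     * the first line and an empty array is returned.
     */
    static byte[] stripFirstLine(byte[] body) {
        for (int i = 0; i < body.length; i++) {
            if (body[i] == '\n') {
                // Keep everything after the first line terminator.
                return Arrays.copyOfRange(body, i + 1, body.length);
            }
        }
        return new byte[0];
    }
}
```

Working on raw bytes avoids decoding the whole body, but it only makes sense when the event body holds the entire file (or at least its beginning), not a single line.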

If you just want to read CSV records from an event that contains a file, and do so while ignoring the first line, you can use ignoreFirstLine : true on the readCSV or readLine command.

See http://cloudera.github.io/cdk/docs/current/cdk-morphlines/morphlinesReferenceGuide.html#readCSV

If you split the file into separate lines upstream of the MorphlineInterceptor (e.g. in SpoolingDirectorySource without BlobDeserializer), a morphline won't help, of course, because it then sees each line as a separate event and cannot tell which one was the header. That would change only if someone implemented a feature on, say, SpoolingDirectorySource or a file tailer that runs a morphline directly on the input file to help split the input into events.
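
To make the point concrete: the header can only be dropped at the place where the file is split, because that is the only place that knows which line came first. The following is a small sketch in plain Java of that idea. `HeaderSkippingSplitter` and `splitIntoLineEvents` are hypothetical names, and this is not how SpoolingDirectorySource or any existing Flume deserializer is implemented.

```java
import java.io.BufferedReader;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration; not an existing Flume source or deserializer. */
final class HeaderSkippingSplitter {

    private HeaderSkippingSplitter() {}

    /**
     * Splits a file into one "event" per line. When ignoreFirstLine is true,
     * the first line (the CSV header) is dropped before any event is created.
     */
    static List<String> splitIntoLineEvents(BufferedReader file, boolean ignoreFirstLine) {
        return file.lines()
                .skip(ignoreFirstLine ? 1 : 0)
                .collect(Collectors.toList());
    }
}
```

Once the lines have left this step as individual events, a downstream interceptor has no reliable way to recognize the header other than matching on its content.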

Wolfgang.

On Sep 12, 2013, at 9:47 AM, ZORAIDA HIDALGO SANCHEZ wrote:

> Dear all,
>
> Our CSV files come with a header line at the beginning of the file. What is
> the best approach for removing this line? I have been trying to use
> org.apache.flume.sink.solr.morphline.MorphlineInterceptor but I am not able
> to remove it. It looks like it does not match the line with the expected
> content.
>
> Thanks in advance.
>
> Zoraida.-