Chukwa >> mail # dev >> Creating a new adaptor: FileTailingAdaptor that would not cut lines


Luangsay Sourygna 2013-04-18, 18:33
Re: Creating a new adaptor: FileTailingAdaptor that would not cut lines
I think the best solution is to use the Log4j socket appender and the Chukwa
log4j socket adaptor to get each full log entry without worrying about line
feeds.  However, this solution only works with programs written in Java, and
it does not keep a copy of the log file on disk.

I think your proposal is a good way to tail text files so that only complete,
line-delimited entries are sent.  How do we handle partial lines, and what
happens when the log file has been rotated?

regards,
Eric

On Thu, Apr 18, 2013 at 11:33 AM, Luangsay Sourygna <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> FileTailingAdaptor is great for tailing log files and sending them to Hadoop.
> However, the last line of each chunk is usually cut, which leads to errors.
>
> I know that we can use CharFileTailingAdaptorUTF8 to solve this problem.
> Nonetheless, that adaptor calls the MapProcessor.process() method for every
> line in each chunk, which slows the Demux phase considerably.
>
> I suggest creating a new adaptor that would mix the benefits of the two
> adaptors: the (Demux) speed of FileTailingAdaptor and
> the preservation of lines from CharFileTailingAdaptorUTF8.
>
> The implementation of extractRecords() would be:
> - a "for" loop over the buffer, starting from the end of the buffer and
> going backward
> - if we find a separator, save the offset and exit the loop
> - the rest of the method would be similar to CharFileTailingAdaptorUTF8.
>
> Could you please tell me what you think about it?
> How do you currently handle cut lines in Chukwa?
>
> Regards,
>
> Sourygna
>
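
The backward scan proposed in the quoted message can be sketched roughly as follows. This is a standalone illustration, not Chukwa's actual extractRecords() code: the class name LastSeparatorScan and the helpers lastSeparatorIndex() and completeLines() are hypothetical, and the separator is assumed to be a single '\n' byte. It also assumes one possible answer to the partial-line question: bytes after the last separator are held back and re-read on the next pass.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LastSeparatorScan {

    /**
     * Scans buf[0..length) from the end toward the start and returns the
     * index of the last separator byte, or -1 if the buffer has none.
     */
    static int lastSeparatorIndex(byte[] buf, int length, byte separator) {
        for (int i = length - 1; i >= 0; i--) {
            if (buf[i] == separator) {
                return i; // save the offset and exit the loop
            }
        }
        return -1;
    }

    /**
     * Returns the bytes up to and including the last newline, so the result
     * never ends in a cut line. Returns an empty array when the buffer holds
     * no complete line yet (assumption: the caller retries after more data).
     */
    static byte[] completeLines(byte[] buf, int length) {
        int last = lastSeparatorIndex(buf, length, (byte) '\n');
        return Arrays.copyOfRange(buf, 0, last + 1);
    }

    public static void main(String[] args) {
        byte[] buf = "line one\nline two\npartial li".getBytes(StandardCharsets.UTF_8);
        byte[] chunk = completeLines(buf, buf.length);

        // The chunk to send contains only whole lines.
        System.out.print(new String(chunk, StandardCharsets.UTF_8));

        // Number of trailing bytes held back for the next read.
        System.out.println("held back: " + (buf.length - chunk.length));
    }
}
```

Because the loop stops at the first separator found from the end, the cost per buffer is proportional to the length of the trailing partial line rather than to the number of lines, and the whole multi-line chunk can still be handed to Demux as one record. Log rotation is not covered by this sketch.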