I am using the spool directory source, but am facing a problem because of the
My use case is that I am storing some log files in the spool directory,
but the system producing the log files really doesn't follow
logging standards: some log records are megabytes in size. My requirement is to
store these files in Hadoop, so I thought of using Flume. In short,
whenever the spool directory source reads a record longer than
bufferMaxLineLength, it throws an exception and the process gets stuck, because
the same message is retried over and over again. I don't want to increase the
bufferMaxLineLength to a ridiculously large value just for a few very big
messages, as that would require increasing the heap size (keeping the channel
capacity and transaction capacity constant).
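
For illustration, a configuration of the kind described above might look
roughly like the sketch below. The agent name (a1), the component names
(src1, ch1), the directory path, and every numeric value are placeholders
chosen for the example, not the actual settings in use.

```properties
# Illustrative sketch only: names, path, and values are placeholders.
a1.sources = src1
a1.channels = ch1

# Spool directory source reading the log files
a1.sources.src1.type = spooldir
a1.sources.src1.spoolDir = /path/to/spool
a1.sources.src1.channels = ch1
# Records longer than this limit are the ones that trigger the exception
a1.sources.src1.bufferMaxLineLength = 5000

# Memory channel; these two values are the ones held constant
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 10000
a1.channels.ch1.transactionCapacity = 1000
```

With the channel capacities fixed like this, raising bufferMaxLineLength
increases the worst-case memory the channel may have to hold, which is why
the heap size would have to grow with it.
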
Is there a way, other than setting bufferMaxLineLength, to read these
unusually large messages? Or are there any suggestions for gracefully dropping
such messages while using the spool directory source?