Apache flume 1.4.0 - Spooling directory issue (Flume user mailing list)


Re: Apache flume 1.4.0 - Spooling directory issue
Jagadish,

I don't think this has anything to do with dependencies or libraries.

It seems you need to configure the 'inputCharset' property for your source
to match the actual character set of the input files on those systems.

The default used by Flume is UTF-8. If an input file is not actually in
that character set, the decoder will eventually hit either a sequence of
input bytes that is not well-formed, or one that denotes a character that
cannot be represented in the output charset, and at that point you get
this exception.
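
For example, if the files turn out to be ISO-8859-1 (Latin-1; just an
assumption until you actually check them), setting the property on your
source should stop the decoder from choking:

agent.sources.spooler.inputCharset = ISO-8859-1

I believe 1.4.0 also added a decodeErrorPolicy property to this source, in
case silently losing the offending characters is acceptable:

# REPLACE substitutes U+FFFD for undecodable input; IGNORE skips it;
# FAIL (the default) throws the exception you are seeing
agent.sources.spooler.decodeErrorPolicy = REPLACE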

I think you should find out what the character set on those systems is
and set 'inputCharset' to match, to avoid the problem.

If you include the operating systems these agents are running on, it might
be easier to guess which character set it is.

I am guessing you are running a Linux distro based on your file paths, but
it could be Mac OS X.
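
If you can run a quick Java snippet on each machine, printing the platform
default charset is usually a good hint, assuming whatever writes those
files uses the platform default:

import java.nio.charset.Charset;

public class DefaultCharset {
    public static void main(String[] args) {
        // The charset the JVM (and most locale-respecting programs) on this
        // machine will use by default. Note the spooldir source itself
        // always defaults to UTF-8, independent of this value.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());
    }
}
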
References:
http://goo.gl/A8cjo
http://docs.oracle.com/javase/6/docs/api/java/nio/charset/CoderResult.html
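
For what it's worth, "Input length = 1" is the classic signature of a
single non-UTF-8 byte hitting a UTF-8 decoder. A minimal sketch that
reproduces it (0xE9 is just an example byte):

import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

public class MalformedDemo {
    public static void main(String[] args) throws CharacterCodingException {
        // 0xE9 is 'é' in ISO-8859-1, but an incomplete multi-byte sequence
        // in UTF-8.
        ByteBuffer latin1Byte = ByteBuffer.wrap(new byte[] { (byte) 0xE9 });
        // The convenience decode() reports bad input by throwing, via the
        // same CoderResult.throwException() you see in the stack trace.
        Charset.forName("UTF-8").newDecoder().decode(latin1Byte);
        // -> java.nio.charset.MalformedInputException: Input length = 1
    }
}
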
Author and Instructor for the Upcoming Book and Lecture Series
Massive Log Data Aggregation, Processing, Searching and Visualization with
Open Source Software
http://massivelogdata.com
On 18 July 2013 05:48, Jagadish Bihani <[EMAIL PROTECTED]> wrote:

>  Hi
>
> I am using the spooling directory source with Apache Flume 1.4.0 and am
> hitting a problem where the same configuration works on some machines and
> doesn't work on others.
>
> The configuration used to work with Flume 1.3.1 (only the
> deserializer-related properties have changed).
>
> Configuration is:
> agent.sources.spooler.type = spooldir
> agent.sources.spooler.spoolDir = /TRACKING_TSV_FILES/tracking_backup_flume
> agent.sources.spooler.batchSize = 10000
> agent.sources.spooler.channels = fileChannel
> agent.sources.spooler.spooldir.deserializer= LINE
> agent.sources.spooler.spooldir.deserializer.maxLineLength =  20000
>
>
> Exception stack trace is:
>
> java.nio.charset.MalformedInputException: Input length = 1
>     at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
>     at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:169)
>     at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
>     at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
>     at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
>     at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
>     at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>
> Is there any software/library dependency required for it to work?
>
> Regards,
> Jagadish
>
>