Flume >> mail # user >> Problem reading file with spooling directory


Re: Problem reading file with spooling directory

I had the same problem too (in Flume 1.4).
We checked that the input data really is UTF-8.
When we set the input charset to 'unicode', it worked.
By "worked" I mean that it no longer threw this exception.
At the destination, however, the data was garbage.

Is this a known issue, or are we missing something?
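
One possible explanation for the garbage, offered as a sketch rather than a confirmed diagnosis: if the 'unicode' charset name resolves to UTF-16 on the JVM (an assumption, not verified in this thread), then almost any pair of bytes decodes to some character, so no MalformedInputException is raised, but UTF-8 text comes out as unrelated characters. The class and method names below are hypothetical and use only the JDK, not Flume.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Flume.
final class WrongCharsetDemo {

    // Decodes bytes that were written as UTF-8 as if they were UTF-16.
    // new String(byte[], Charset) never throws; bad input is replaced.
    static String decodeAsUtf16(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        byte[] utf8 = "line one".getBytes(StandardCharsets.UTF_8);
        String garbled = decodeAsUtf16(utf8);

        // Every two input bytes were merged into one unrelated character.
        System.out.println(garbled.equals("line one")); // false
        System.out.println(garbled.length());           // 4
    }
}
```

If this is what happened, the absence of the exception does not mean the data was read correctly; the decoder simply had nothing to reject.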
On 08/04/2013 12:26 PM, Anat Rozenzon wrote:
> Hi,
>
> I'm trying to read a directory with the spooler and at some point I'm
> starting to get these errors:
>
> 01 Aug 2013 10:10:17,892 ERROR [pool-6-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:173) - Uncaught exception in Runnable
> java.nio.charset.MalformedInputException: Input length = 1
>         at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
>         at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:169)
>         at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
>         at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
>         at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
>         at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
>
>
> I can see that this is a character set issue; however, the files are
> supposed to be UTF-8 files.
> Still, some characters are invalid. Is there any way to ignore these lines?
>
> Also, is there a way to know which file/line is causing the exception?
>
> Thanks
> Anat
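
Both questions above (skipping undecodable lines, and finding which file and line triggers the exception) can be explored outside Flume with a small standalone check. The following is a minimal sketch using only the JDK; the class name `Utf8LineChecker` and its method are hypothetical and are not part of Flume. It splits the raw bytes on the newline byte (safe for UTF-8, where 0x0A never occurs inside a multi-byte sequence), strictly decodes each line, keeps the valid ones, and records the 1-based numbers of the lines that fail. It reads each file fully into memory and does not strip a trailing carriage return, so treat it as a diagnostic aid rather than a fix.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Flume.
final class Utf8LineChecker {

    // Returns the lines that are valid UTF-8 and appends the 1-based
    // numbers of the lines that are not to badLineNumbers.
    static List<String> decodeSkippingBadLines(byte[] data, List<Integer> badLineNumbers) {
        // REPORT makes the decoder throw instead of silently replacing bytes.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        List<String> good = new ArrayList<>();
        int lineNo = 1;
        int start = 0;
        for (int i = 0; i <= data.length; i++) {
            boolean atEnd = (i == data.length);
            if (!atEnd && data[i] != '\n') {
                continue;               // still inside the current line
            }
            if (atEnd && start == i) {
                break;                  // nothing left after the last newline
            }
            try {
                // decode(ByteBuffer) resets the decoder before each call.
                good.add(decoder.decode(ByteBuffer.wrap(data, start, i - start)).toString());
            } catch (CharacterCodingException e) {
                badLineNumbers.add(lineNo);
            }
            lineNo++;
            start = i + 1;
        }
        return good;
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            List<Integer> bad = new ArrayList<>();
            decodeSkippingBadLines(Files.readAllBytes(Paths.get(name)), bad);
            for (int n : bad) {
                System.out.println(name + ": line " + n + " is not valid UTF-8");
            }
        }
    }
}
```

Running it over the files in the spooling directory, for example with a Java 11+ single-file launch such as `java Utf8LineChecker.java file1 file2`, should print the file name and line number of each line that a strict UTF-8 decoder rejects.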
