Flume, mail # user - Problem reading file with spooling directory


Re: Problem reading file with spooling directory
Jagadish Bihani 2013-08-08, 06:29

I had the same problem too (in Flume 1.4).
We checked that the input data is actually UTF-8.
When we set the input charset to 'unicode' it worked.
By "worked" I mean it didn't throw this exception,
but the data that arrived at the destination was garbage.

Is this a known issue, or are we missing something?
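
A possible explanation for the garbage, sketched with the plain JDK rather than Flume itself: in the JDKs I am aware of, the charset name "unicode" is an alias for UTF-16, so UTF-8 bytes get paired up two at a time and decoded as unrelated characters. That rarely raises an error, which would explain why the exception goes away. The class and method names below are made up for illustration, and whether Flume resolves the configured name through Charset.forName is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UnicodeAliasDemo {
    // Hypothetical helper: encode text as UTF-8, then decode those bytes
    // with the charset that the name "unicode" resolves to.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName("unicode"));
    }

    public static void main(String[] args) {
        // Canonical name behind the "unicode" alias.
        System.out.println(Charset.forName("unicode").name());

        String original = "flume event";
        String garbled = misdecode(original);
        // No exception is thrown, but the text no longer matches.
        System.out.println(garbled.equals(original));
        System.out.println(garbled);
    }
}
```

If that is what happened here, 'unicode' hides the error instead of fixing it, and the input charset should stay UTF-8.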
On 08/04/2013 12:26 PM, Anat Rozenzon wrote:
> Hi,
>
> I'm trying to read a directory with the spooler and at some point I'm
> starting to get these errors:
>
> 01 Aug 2013 10:10:17,892 ERROR [pool-6-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:173) - Uncaught exception in Runnable
> java.nio.charset.MalformedInputException: Input length = 1
>         at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
>         at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:169)
>         at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
>         at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
>         at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
>         at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
>
>
> I can see that this is a character set issue; however, the files are
> supposed to be UTF-8 files.
> Still, some characters are invalid. Is there any way to ignore these lines?
>
> Also, is there a way to know which file/line is causing the exception?
>
> Thanks
> Anat
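
On the question of which file and line trigger the exception: the stack trace does not say, but a spooled file can be checked offline before it is handed to Flume. The sketch below is a standalone JDK program, not part of Flume; the class name Utf8Check and its methods are hypothetical. It runs a strict UTF-8 decoder over a file's bytes and reports the line number and byte offset of the first invalid sequence. "Input length = 1" in the trace means the invalid sequence is one byte long, which is typical of a stray Latin-1 byte in an otherwise UTF-8 file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Utf8Check {
    /**
     * Returns {lineNumber, byteOffset} of the first invalid UTF-8 sequence,
     * or null if the data decodes cleanly. Lines are counted from 1.
     */
    static long[] firstMalformed(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never yields more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(data.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        if (!result.isError()) {
            return null;
        }
        // On error the input position is at the start of the bad sequence.
        int offset = in.position();
        long line = 1;
        for (int i = 0; i < offset; i++) {
            if (data[i] == '\n') {
                line++;
            }
        }
        return new long[] {line, offset};
    }

    /** Lenient decode: invalid sequences become U+FFFD instead of failing. */
    static String decodeLenient(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            long[] hit = firstMalformed(Files.readAllBytes(Paths.get(name)));
            if (hit == null) {
                System.out.println(name + ": valid UTF-8");
            } else {
                System.out.println(name + ": invalid byte at line " + hit[0]
                        + ", offset " + hit[1]);
            }
        }
    }
}
```

Running it as `java Utf8Check /path/to/spool/*` over the spool directory (the path is a placeholder) would list the offending files. It reads each file fully into memory, so it suits a one-off check rather than very large files. The lenient variant shows one way to keep a line with bad bytes rather than drop it, at the cost of replacement characters in the output.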