Flume not moving data to HDFS or local (Flume user mailing list)


RE: Flume not moving data to HDFS or local
It should commit when one of the various file roll configuration values is hit. There's a list of them and their defaults in the Flume user guide.
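For reference, a minimal sketch of those roll settings on an HDFS sink is below; the agent and sink names (a1, k1) and the HDFS path are placeholders, and the values shown are the documented defaults, so adjust them to taste.

  # Hypothetical agent "a1" with an HDFS sink "k1"; the path is a placeholder.
  a1.sinks.k1.type = hdfs
  a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events
  # The open .tmp file is rolled (closed and renamed) when any of these is hit.
  # Values shown are the defaults; setting one to 0 disables that trigger.
  a1.sinks.k1.hdfs.rollInterval = 30
  a1.sinks.k1.hdfs.rollSize = 1024
  a1.sinks.k1.hdfs.rollCount = 10
  # Suffix carried while the file is open; dropped once the file is closed.
  a1.sinks.k1.hdfs.inUseSuffix = .tmp

In other words, the .tmp suffix disappears as soon as one of the roll triggers closes the file.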

For managing new files on your app servers, the best option right now seems to be a spooling directory source along with some kind of cron jobs that run locally on the app servers to drop files in the spool directory when ready. In my case I run a job that executes a custom script to checkpoint a file that is appended to all day long, creating incremental files every minute to drop in the spool directory.
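As an illustration, a spooling directory source along those lines might look like the following; the agent, source, and channel names and the spool path are made up for the sketch.

  # Hypothetical agent "a1" with a spooling directory source "r1" on channel "c1".
  a1.sources.r1.type = spooldir
  # Directory the local cron job drops finished files into.
  a1.sources.r1.spoolDir = /var/spool/flume
  # Fully ingested files are renamed with this suffix (the default).
  a1.sources.r1.fileSuffix = .COMPLETED
  a1.sources.r1.channels = c1

Note that files must be complete and immutable by the time they land in the spool directory, which is exactly why the checkpoint-and-drop cron job is needed.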
From: Siddharth Tiwari [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 31, 2013 12:47 PM
To: [EMAIL PROTECTED]
Subject: RE: Flume not moving data to HDFS or local
It got resolved; it was due to the wrong version of the Guava jar file in the Flume lib directory. But I can still see a .tmp extension on the file in HDFS. When does it actually get committed? :) ... Another question, though: what should I change in my configuration file to capture new files being generated in a directory on a remote machine?
Say, for example, one new file is generated every hour in my webserver's hostlog directory. What do I change in my configuration so that I get the new file directly into HDFS, compressed?
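On the compression part of the question, the HDFS sink can compress what it writes; a minimal sketch, again with placeholder agent/sink names and assuming the chosen codec is available to the agent:

  # Hypothetical agent "a1" / sink "k1"; write a compressed stream instead of plain text.
  a1.sinks.k1.hdfs.fileType = CompressedStream
  # Codec to use; gzip is only an example, and its libraries must be on the agent's classpath.
  a1.sinks.k1.hdfs.codeC = gzip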

*------------------------*
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."
"Maybe other people will try to limit me but I don't limit myself"

________________________________
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: RE: Flume not moving data to HDFS or local
Date: Thu, 31 Oct 2013 19:29:36 +0000
Hi Paul

I see the following error:

13/10/31 12:27:01 ERROR hdfs.HDFSEventSink: process failed
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
          at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:45)
          at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:490)
          at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:445)
          at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
          at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2429)
          at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
          at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2463)
          at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2445)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:363)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:165)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:347)
          at org.apache.hadoop.fs.Path.getFileSystem(Path.java:275)
          at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:186)
          at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
          at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
          at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
          at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
          at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
          at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
          at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
          at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
          at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:724)
Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
          at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:45)
          at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:490)
          at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:445)
          at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
          at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2429)
          at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
          at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2463)
          at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2445)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:363)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:165)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:347)
          at org.apache.hadoop.fs.Path.getFileSystem(Path.java:275)
          at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:186)
          at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
          at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
          at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
          at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
          at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
          at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
          at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
          at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
          at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurren