Flume user mailing list: Flume not moving data to HDFS or local


RE: Flume not moving data to HDFS or local
Can you describe the process to set up a spooling directory source? I'm sorry, I don't know how to do that. If you can give me a step-by-step description of how to configure it, and the changes I need to make in my conf file to get it done, I will be really thankful. Appreciate your help :)
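For reference, a minimal spooling-directory-to-HDFS agent configuration might look like the sketch below; the agent name (agent1), channel name, and all paths are illustrative placeholders, not taken from this thread:

    agent1.sources = spool1
    agent1.channels = ch1
    agent1.sinks = hdfs1

    # Spooling directory source: Flume ingests files dropped into spoolDir.
    # Files must be closed and immutable before being placed here; each
    # fully-ingested file is renamed with a .COMPLETED suffix by default.
    agent1.sources.spool1.type = spooldir
    agent1.sources.spool1.spoolDir = /var/flume/spool
    agent1.sources.spool1.channels = ch1

    # In-memory channel connecting the source to the sink.
    agent1.channels.ch1.type = memory
    agent1.channels.ch1.capacity = 10000
    agent1.channels.ch1.transactionCapacity = 1000

    # HDFS sink; the path is a placeholder for your cluster.
    agent1.sinks.hdfs1.type = hdfs
    agent1.sinks.hdfs1.channel = ch1
    agent1.sinks.hdfs1.hdfs.path = hdfs://namenode:8020/flume/events
    agent1.sinks.hdfs1.hdfs.fileType = DataStream

The agent would then be started with something like: flume-ng agent --conf conf --conf-file flume.conf --name agent1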

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!

“Every duty is holy, and devotion to duty is the highest form of worship of God.”

"Maybe other people will try to limit me but I don't limit myself"
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: Thu, 31 Oct 2013 14:38:54 -0700
Subject: RE: Flume not moving data to HDFS or local

It should commit when one of the various file-roll configuration values is hit. There's a list of them and their defaults in the Flume user guide. For managing new files on your app servers, the best option right now seems to be a spooling directory source, along with some kind of cron job that runs locally on the app servers to drop files into the spool directory when they are ready. In my case I run a job that executes a custom script to checkpoint a file that is appended to all day long, creating incremental files every minute to drop into the spool directory.
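For concreteness, these are the roll-related HDFS sink properties being referred to, shown with the defaults listed in the Flume user guide; the agent and sink names are the same illustrative placeholders as above:

    # The sink rolls the current file (closing it and removing the in-use
    # .tmp suffix) when the first of these limits is reached; setting a
    # value to 0 disables that particular trigger.
    agent1.sinks.hdfs1.hdfs.rollInterval = 30
    agent1.sinks.hdfs1.hdfs.rollSize = 1024
    agent1.sinks.hdfs1.hdfs.rollCount = 10

rollInterval is in seconds, rollSize in bytes, and rollCount in events; the .tmp suffix on an open file comes from hdfs.inUseSuffix, whose default is .tmp.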
From: Siddharth Tiwari [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 31, 2013 12:47 PM
To: [EMAIL PROTECTED]
Subject: RE: Flume not moving data to HDFS or local
It got resolved; it was due to the wrong version of the guava jar file in the Flume lib directory. But I can still see a .tmp extension on the file in HDFS; when does it actually get committed? :) ... One other question, though: what should I change in my configuration file to capture new files being generated in a directory on a remote machine? Say, for example, one new file is generated every hour in my web server's hostlog directory. What do I change in my configuration so that I get the new file directly into HDFS, compressed?
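For the compression part of the question, the HDFS sink can compress as it writes; a hedged sketch follows, with gzip as an illustrative codec choice and the same placeholder agent and sink names as above:

    # hdfs.codeC only takes effect when fileType is CompressedStream.
    agent1.sinks.hdfs1.hdfs.fileType = CompressedStream
    agent1.sinks.hdfs1.hdfs.codeC = gzip
    # Optional: make the rolled file names reflect the codec.
    agent1.sinks.hdfs1.hdfs.filePrefix = weblog
    agent1.sinks.hdfs1.hdfs.fileSuffix = .gz

For the remote-machine part, the usual pattern is to run a Flume agent on the web server itself with a spooling directory source, either writing straight to HDFS or forwarding over an Avro sink/source pair to a collector agent.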

*------------------------*
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
“Every duty is holy, and devotion to duty is the highest form of worship of God.”
"Maybe other people will try to limit me but I don't limit myself"

From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: RE: Flume not moving data to HDFS or local
Date: Thu, 31 Oct 2013 19:29:36 +0000

Hi Paul,

I see the following error:

13/10/31 12:27:01 ERROR hdfs.HDFSEventSink: process failed
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
        at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:45)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:490)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:445)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2429)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2463)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2445)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:363)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:165)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:347)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:275)
        at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:186)
        at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
        at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
        at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
        at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:45)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:490)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:445)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2429)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2463)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2445)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:363)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:165)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:347)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:275)
        at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:186)
        at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(