Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Flume, mail # dev - Review Request: FLUME-1425: Create Standalone Spooling Client


+
Patrick Wendell 2012-08-04, 06:47
+
Patrick Wendell 2012-08-04, 06:59
+
Jarek Cecho 2012-08-07, 07:09
+
Patrick Wendell 2012-08-07, 17:46
+
Hari Shreedharan 2012-08-07, 18:40
+
Patrick Wendell 2012-08-11, 05:32
+
Patrick Wendell 2012-08-11, 05:34
+
Patrick Wendell 2012-08-14, 21:54
+
Patrick Wendell 2012-08-14, 22:02
+
Mike Percy 2012-08-23, 10:11
+
Mike Percy 2012-10-07, 04:30
+
Patrick Wendell 2012-10-11, 18:10
+
Mike Percy 2012-10-11, 19:30
+
Patrick Wendell 2012-10-11, 18:02
+
Mike Percy 2012-10-11, 19:31
+
Patrick Wendell 2012-10-11, 19:47
+
Patrick Wendell 2012-10-12, 17:15
+
Cameron Gandevia 2012-10-15, 19:18
+
Patrick Wendell 2012-10-15, 21:11
+
Mike Percy 2012-10-15, 22:02
+
Cameron Gandevia 2012-10-16, 01:19
+
Mike Percy 2012-10-17, 00:26
+
Mike Percy 2012-10-16, 21:49
+
Patrick Wendell 2012-10-19, 18:49
+
Mike Percy 2012-10-19, 22:43
+
Mike Percy 2012-10-17, 02:21
+
Patrick Wendell 2012-10-19, 18:48
+
Patrick Wendell 2012-10-19, 17:09
+
Mike Percy 2012-10-19, 22:31
+
Brock Noland 2012-10-22, 14:26
+
Patrick Wendell 2012-10-22, 19:31
+
Patrick Wendell 2012-10-22, 20:36
Copy link to this message
-
Re: Review Request: FLUME-1425: Create a SpoolDirectory Source and Client
Mike Percy 2012-11-06, 01:13

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6377/#review13120
-----------------------------------------------------------
Hi Patrick, overall this looks good to me. Only thing is the test failure. I am tempted to just commit this, but I'm trying to understand what's going on with the test.

It looks to me that the test is trying to assert that the files not be processed if they are empty. I added some debug messages to the assert and this is what it is printing:

testBehaviorWithEmptyFile(org.apache.flume.client.avro.TestSpoolingFileLineReader)  Time elapsed: 1.012 sec  <<< FAILURE!
java.lang.AssertionError: Does not contain file: /var/folders/fg/ym7gqsvs5h30gg13yjpt1hm00000gn/T/1352164166517-0/file1, only contains: [/var/folders/fg/ym7gqsvs5h30gg13yjpt1hm00000gn/T/1352164166517-0/file1.COMPLETE, /var/folders/fg/ym7gqsvs5h30gg13yjpt1hm00000gn/T/1352164166517-0/file2.COMPLETE]
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.assertTrue(Assert.java:43)
  at org.apache.flume.client.avro.TestSpoolingFileLineReader.testBehaviorWithEmptyFile(TestSpoolingFileLineReader.java:399)

So the test is asserting that it should not have rolled test1 to test1.COMPLETE, but it has (At the end of this test, it has actually run both).

I don't see any code in the patch indicating that files that have no data should not be read. So I don't understand how this test could be passing for you.

Please try re-running the test using this command from the Flume top-level: mvn clean install -Dtest=TestSpoolingFileLineReader -DfailIfNoTests=false ... I will be very surprised if it passes.

I am still reviewing this patch and am currently leaning toward adding an @Ignore to this test for commit, since I'd like to get this feature into 1.3.0. In the mean time, any comments on the desired behavior are appreciated.

Regards,
Mike
- Mike Percy
On Oct. 22, 2012, 8:36 p.m., Patrick Wendell wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6377/
> -----------------------------------------------------------
>
> (Updated Oct. 22, 2012, 8:36 p.m.)
>
>
> Review request for Flume.
>
>
> Description
> -------
>
> This patch adds a spooling directory based source. The  idea is that a user can have a spool directory where files are deposited for ingestion into flume. Once ingested, the files are clearly renamed and the implementation guarantees at-least-once delivery semantics similar to those achieved within flume itself, even across failures and restarts of the JVM running the code.
>
> This helps fill the gap for people who want a way to get reliable delivery of events into flume, but don't want to directly write their application against the flume API. They can simply drop log files off in a spooldir and let flume ingest asynchronously (using some shell scripts or other automated process).
>
> Unlike the prior iteration, this patch implements a first-class source. It also extends the avro client to support spooling in a similar manner.
>
>
> This addresses bug FlUME-1425.
>     https://issues.apache.org/jira/browse/FlUME-1425
>
>
> Diffs
> -----
>
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/source/SourceConfiguration.java da804d7
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/source/SourceType.java abbbf1c
>   flume-ng-core/src/main/java/org/apache/flume/client/avro/AvroCLIClient.java 4a5ecae
>   flume-ng-core/src/main/java/org/apache/flume/client/avro/BufferedLineReader.java PRE-CREATION
>   flume-ng-core/src/main/java/org/apache/flume/client/avro/LineReader.java PRE-CREATION
>   flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java PRE-CREATION
>   flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySource.java PRE-CREATION
>   flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySourceConfigurationConstants.java PRE-CREATION
+
Mike Percy 2012-11-06, 02:12
+
Patrick Wendell 2012-11-06, 06:34
+
Alexander Alten-Lorenz 2012-11-06, 08:27
+
Alexander Alten-Lorenz 2012-11-06, 09:15
+
Mike Percy 2012-10-30, 06:22
+
Patrick Wendell 2012-11-01, 04:47