|
Mike Percy
2012-10-29, 23:49
Mike Percy
2012-10-29, 23:51
Juhani Connolly
2012-10-31, 03:56
Juhani Connolly
2012-10-31, 06:11
Juhani Connolly
2012-10-31, 06:12
Mike Percy
2012-10-31, 07:27
Mike Percy
2012-10-31, 07:44
Juhani Connolly
2012-10-31, 07:46
Juhani Connolly
2012-10-31, 07:48
Juhani Connolly
2012-10-31, 08:19
Juhani Connolly
2012-10-31, 10:49
Juhani Connolly
2012-10-31, 10:53
Mike Percy
2012-11-02, 09:34
Alexander Alten-Lorenz
2012-11-06, 08:36
Juhani Connolly
2012-11-07, 01:34
Alexander Alten-Lorenz
2012-11-07, 09:15
Mike Percy
2012-11-09, 20:01
Alexander Alten-Lorenz
2012-11-11, 10:56
Juhani Connolly
2012-11-12, 03:07
Juhani Connolly
2012-11-14, 08:01
Juhani Connolly
2012-11-16, 02:00
Mike Percy
2012-11-16, 07:25
Mike Percy
2012-11-16, 07:27
Mike Percy
2012-11-16, 07:33
Juhani Connolly
2012-11-16, 08:10
Juhani Connolly
2012-11-16, 08:12
Juhani Connolly
2012-11-16, 10:06
Mike Percy
2012-11-19, 08:17
Mike Percy
2012-11-19, 08:53
|
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-10-29, 23:49
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 ----------------------------------------------------------- How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: if (idleCloseFuture != null) idleCloseFuture.cancel(false); idleCloseFuture = timedRollerPool.schedule(new Runnable() { public void run() { try { close(); } catch(Throwable t) { LOG.error("Unexpected error", t); if (t instanceof Error) { throw (Error) t; } } }, idleTimeout, TimeUnit.SECONDS); This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. - Mike Percy On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 19, 2012, 6:01 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst 29ead84 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java bce8e11 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java a6d624b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-10-29, 23:51
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture - Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 ----------------------------------------------------------- On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 19, 2012, 6:01 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst 29ead84 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java bce8e11 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java a6d624b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 03:56
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. - Juhani ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 ----------------------------------------------------------- On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 19, 2012, 6:01 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst 29ead84 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java bce8e11 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java a6d624b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know)
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 06:11
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture > > Juhani Connolly wrote: > Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. So, I took a heap dump and checked the retained size for BucketWriter objects... Around 4000 bytes all told. After a week, one of our aggregator/hdfs sink nodes has got 1500 bucket writers alive in memory, for about 6mb of memory on what are essentially dead objects. This is because we generate a new path(based on time) every hour, for each host/data type. We're still running in a test phase, with only a handful of our servers feeding data, so with more servers and more time, this moderate amount of memory to do nothing. So in the end of the day, at some point, HDFSEventSink does need to get involved to clean this stuff up. - Juhani ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 ----------------------------------------------------------- On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 19, 2012, 6:01 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 06:12
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture > > Juhani Connolly wrote: > Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. > > Juhani Connolly wrote: > So, I took a heap dump and checked the retained size for BucketWriter objects... Around 4000 bytes all told. > > After a week, one of our aggregator/hdfs sink nodes has got 1500 bucket writers alive in memory, for about 6mb of memory on what are essentially dead objects. This is because we generate a new path(based on time) every hour, for each host/data type. We're still running in a test phase, with only a handful of our servers feeding data, so with more servers and more time, this moderate amount of memory to do nothing. > > So in the end of the day, at some point, HDFSEventSink does need to get involved to clean this stuff up. Uh, that is, 4000 bytes each. Most of that is in the writer. - Juhani ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 ----------------------------------------------------------- On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 19, 2012, 6:01 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed.
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-10-31, 07:27
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture > > Juhani Connolly wrote: > Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. > > Juhani Connolly wrote: > So, I took a heap dump and checked the retained size for BucketWriter objects... Around 4000 bytes all told. > > After a week, one of our aggregator/hdfs sink nodes has got 1500 bucket writers alive in memory, for about 6mb of memory on what are essentially dead objects. This is because we generate a new path(based on time) every hour, for each host/data type. We're still running in a test phase, with only a handful of our servers feeding data, so with more servers and more time, this moderate amount of memory to do nothing. > > So in the end of the day, at some point, HDFSEventSink does need to get involved to clean this stuff up. > > Juhani Connolly wrote: > Uh, that is, 4000 bytes each. Most of that is in the writer. Yeah, you're right about the sfWriters map. The original implementation contained that thing and I never tried to address that issue... it won't cause correctness problems (since close() is effectively idempotent) but yes it will consume some memory. That problem exists with all of the existing rolling code, so it would not just exist with the new code for the close-on-idle feature. One easy band-aid would be to redefine the default maxOpenFiles to, say, 500. That would reduce the severity of the memory reclamation delay, at the limit. If each object takes 4K then a cache of 500 would only take up 2MB which isn't terrible. Another simple approach, which would be a bit ugly (need to be careful with the circular reference) would be to provide a sfWriters reference to each BucketWriter instance, and when the BucketWriter's close() method is called then it can remove itself from the cache if it's still there. Speaking of which, I would prefer to use the Guava CacheBuilder over what we are using now if we can. Anyway, aside from simple workarounds such as the above, I think the whole HDFSEventSink/BucketWriter interaction would need to be significantly refactored to solve this design flaw, which makes me nervous since the HDFS sink is rather stable today after fishing out many very subtle bugs over several months of testing and use. - Mike This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote:
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-10-31, 07:44
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture > > Juhani Connolly wrote: > Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. > > Juhani Connolly wrote: > So, I took a heap dump and checked the retained size for BucketWriter objects... Around 4000 bytes all told. > > After a week, one of our aggregator/hdfs sink nodes has got 1500 bucket writers alive in memory, for about 6mb of memory on what are essentially dead objects. This is because we generate a new path(based on time) every hour, for each host/data type. We're still running in a test phase, with only a handful of our servers feeding data, so with more servers and more time, this moderate amount of memory to do nothing. > > So in the end of the day, at some point, HDFSEventSink does need to get involved to clean this stuff up. > > Juhani Connolly wrote: > Uh, that is, 4000 bytes each. Most of that is in the writer. > > Mike Percy wrote: > Yeah, you're right about the sfWriters map. The original implementation contained that thing and I never tried to address that issue... it won't cause correctness problems (since close() is effectively idempotent) but yes it will consume some memory. That problem exists with all of the existing rolling code, so it would not just exist with the new code for the close-on-idle feature. > > One easy band-aid would be to redefine the default maxOpenFiles to, say, 500. That would reduce the severity of the memory reclamation delay, at the limit. If each object takes 4K then a cache of 500 would only take up 2MB which isn't terrible. Another simple approach, which would be a bit ugly (need to be careful with the circular reference) would be to provide a sfWriters reference to each BucketWriter instance, and when the BucketWriter's close() method is called then it can remove itself from the cache if it's still there. Speaking of which, I would prefer to use the Guava CacheBuilder over what we are using now if we can. Actually, a less-gross solution than the explicit circular reference would be to pass in some kind of onCloseListener to the BucketWriter which gets called back when close() is called on the BucketWriter. That seems like a reasonable way to solve this. - Mike This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote:
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 07:46
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture > > Juhani Connolly wrote: > Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. > > Juhani Connolly wrote: > So, I took a heap dump and checked the retained size for BucketWriter objects... Around 4000 bytes all told. > > After a week, one of our aggregator/hdfs sink nodes has got 1500 bucket writers alive in memory, for about 6mb of memory on what are essentially dead objects. This is because we generate a new path(based on time) every hour, for each host/data type. We're still running in a test phase, with only a handful of our servers feeding data, so with more servers and more time, this moderate amount of memory to do nothing. > > So in the end of the day, at some point, HDFSEventSink does need to get involved to clean this stuff up. > > Juhani Connolly wrote: > Uh, that is, 4000 bytes each. Most of that is in the writer. > > Mike Percy wrote: > Yeah, you're right about the sfWriters map. The original implementation contained that thing and I never tried to address that issue... it won't cause correctness problems (since close() is effectively idempotent) but yes it will consume some memory. That problem exists with all of the existing rolling code, so it would not just exist with the new code for the close-on-idle feature. > > One easy band-aid would be to redefine the default maxOpenFiles to, say, 500. That would reduce the severity of the memory reclamation delay, at the limit. If each object takes 4K then a cache of 500 would only take up 2MB which isn't terrible. Another simple approach, which would be a bit ugly (need to be careful with the circular reference) would be to provide a sfWriters reference to each BucketWriter instance, and when the BucketWriter's close() method is called then it can remove itself from the cache if it's still there. Speaking of which, I would prefer to use the Guava CacheBuilder over what we are using now if we can. How about just running with my original implementation which actually removes inactive writers from the map? It adds another thread but the synchronization block should not be a big deal, as only live writers are iterated over. Considering the polling frequency I don't see this as becoming a problem ever. Regardless of what other rolling may be going on, this should clean up all writers when activated. As I said, we already have 6mb of memory gone after a week feeding in data from 4 data collection servers to the servers dumping the data to hdfs. We may have up to 100 servers per "cluster"(round robin avro sinked destinations), so that's 150mb of memory leakage per week. It's not really a minor detail when deployed at scale. I do not think that a default maxOpenFiles is a good idea, simply for the fact that it closes files in order of age, regardless of if they are active or not. - Juhani This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote:
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 07:48
> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote: > > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like: > > > > if (idleCloseFuture != null) idleCloseFuture.cancel(false); > > idleCloseFuture = timedRollerPool.schedule(new Runnable() { > > public void run() { > > try { > > close(); > > } catch(Throwable t) { > > LOG.error("Unexpected error", t); > > if (t instanceof Error) { > > throw (Error) t; > > } > > } > > }, idleTimeout, TimeUnit.SECONDS); > > > > This is basically exactly how the rollInterval timer works (see implementation in BucketWriter.doOpen()). Note you would also want to cancel this future in doClose(), as we do for the rollInterval timer. > > > > This approach is certainly slower than just doing a System.getCurrentTimeMillis(), but it's not too bad... Executing future.cancel(false) and future.schedule() seem to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases. > > > > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools. > > Mike Percy wrote: > Edit: above should say, at the end of each doFlush() then cancel/reschedule the idleCloseFuture > > Juhani Connolly wrote: > Hmm... I can see that as a viable approach but am curious about what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly(I believe this is common in the use case where files are dumped in a file named by date). While not major, this does seem like it would lead to a buildup of inactive writers? We've had OOM errors when running flume with an HDFS sink using the default memory settings. I have no idea if this is related, perhaps it could be? Looks to me that nowhere other than the stop method is the sfWriters map ever cleaned up. > > Juhani Connolly wrote: > So, I took a heap dump and checked the retained size for BucketWriter objects... Around 4000 bytes all told. > > After a week, one of our aggregator/hdfs sink nodes has got 1500 bucket writers alive in memory, for about 6mb of memory on what are essentially dead objects. This is because we generate a new path(based on time) every hour, for each host/data type. We're still running in a test phase, with only a handful of our servers feeding data, so with more servers and more time, this moderate amount of memory to do nothing. > > So in the end of the day, at some point, HDFSEventSink does need to get involved to clean this stuff up. > > Juhani Connolly wrote: > Uh, that is, 4000 bytes each. Most of that is in the writer. > > Mike Percy wrote: > Yeah, you're right about the sfWriters map. The original implementation contained that thing and I never tried to address that issue... it won't cause correctness problems (since close() is effectively idempotent) but yes it will consume some memory. That problem exists with all of the existing rolling code, so it would not just exist with the new code for the close-on-idle feature. > > One easy band-aid would be to redefine the default maxOpenFiles to, say, 500. That would reduce the severity of the memory reclamation delay, at the limit. If each object takes 4K then a cache of 500 would only take up 2MB which isn't terrible. Another simple approach, which would be a bit ugly (need to be careful with the circular reference) would be to provide a sfWriters reference to each BucketWriter instance, and when the BucketWriter's close() method is called then it can remove itself from the cache if it's still there. Speaking of which, I would prefer to use the Guava CacheBuilder over what we are using now if we can. onCloseListener is an interesting idea... However in the current implementation, a close() happens every time the rollInterval happens. Further writes will then call open() again, so we can't just remove the writer from the map every time close is called(unless we readd it with every open). - Juhani This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12894 On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote:
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 08:19
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review12949 ----------------------------------------------------------- Just for posterity, I caught up to mike on IRC, and we hashed out an approach of using Futures and adding an onIdleTimeout hook so that the eventsink can clear out the idle sfWriters which are just eating up memory - Juhani Connolly On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 19, 2012, 6:01 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst 29ead84 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java bce8e11 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java a6d624b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 10:49
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/ ----------------------------------------------------------- (Updated Oct. 31, 2012, 10:49 a.m.) Review request for Flume and Mike Percy. Changes ------- Implemented the described suggestions: - Added a callback interface to HDFSEventSink - uses a scheduled future to close stuff once the bucket writer has been idle too long. - onIdleTimeout callback will now remove the bucketWriter from sfWriters, cleaning up that leak(though this is only the case of idleTimeout is non-0) - Added hdfs.idleTimeout configuration setting to hdfs sink, default is 0. Perhaps we should consider enabling it by default with a reasonably long default timeout? Tested on local hdfs, confirmed that the config option is working and that files are indeed getting closed. I also verified in the debugger that the sfWriter entries were actually getting deleted. Description ------- Added lastWrite to BucketWriter to verify when it was last updated Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. This addresses bug FLUME-1660. https://issues.apache.org/jira/browse/FLUME-1660 Diffs (updated) ----- flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e Diff: https://reviews.apache.org/r/7659/diff/ Testing ------- Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> All other tests pass. Thanks, Juhani Connolly
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-10-31, 10:53
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/ ----------------------------------------------------------- (Updated Oct. 31, 2012, 10:53 a.m.) Review request for Flume and Mike Percy. Changes ------- whitespace... Description ------- Added lastWrite to BucketWriter to verify when it was last updated Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. This addresses bug FLUME-1660. https://issues.apache.org/jira/browse/FLUME-1660 Diffs (updated) ----- flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e Diff: https://reviews.apache.org/r/7659/diff/ Testing ------- Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> All other tests pass. Thanks, Juhani Connolly
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-02, 09:34
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13033 ----------------------------------------------------------- Juhani, apologies for not getting back to this yet, I'll dig into this as soon as I can. - Mike Percy On Oct. 31, 2012, 10:53 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 31, 2012, 10:53 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesAlexander Alten-Lorenz 2012-11-06, 08:36
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13131 ----------------------------------------------------------- flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java <https://reviews.apache.org/r/7659/#comment28290> Please remove the space flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java <https://reviews.apache.org/r/7659/#comment28289> Please remove the space - Alexander Alten-Lorenz On Oct. 31, 2012, 10:53 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Oct. 31, 2012, 10:53 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-07, 01:34
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/ ----------------------------------------------------------- (Updated Nov. 7, 2012, 1:34 a.m.) Review request for Flume and Mike Percy. Changes ------- whitespace Description ------- Added lastWrite to BucketWriter to verify when it was last updated Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. This addresses bug FLUME-1660. https://issues.apache.org/jira/browse/FLUME-1660 Diffs (updated) ----- flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e Diff: https://reviews.apache.org/r/7659/diff/ Testing ------- Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> All other tests pass. Thanks, Juhani Connolly
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesAlexander Alten-Lorenz 2012-11-07, 09:15
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13210 ----------------------------------------------------------- Ship it! Looks good, thank you for the work. - Alexander Alten-Lorenz On Nov. 7, 2012, 1:34 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 7, 2012, 1:34 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-09, 20:01
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13301 ----------------------------------------------------------- This looks good, but I don't see any unit tests that exercise the functionality. - Mike Percy On Nov. 7, 2012, 1:34 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 7, 2012, 1:34 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesAlexander Alten-Lorenz 2012-11-11, 10:56
> On Nov. 9, 2012, 8:01 p.m., Mike Percy wrote: > > This looks good, but I don't see any unit tests that exercise the functionality. Thats true, missed this. Was patching my local branch with and let run the build and a simple test. Sry. - Ship it, please add a valuable unit test - Alexander ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13301 ----------------------------------------------------------- On Nov. 7, 2012, 1:34 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 7, 2012, 1:34 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-12, 03:07
> On Nov. 9, 2012, 8:01 p.m., Mike Percy wrote: > > This looks good, but I don't see any unit tests that exercise the functionality. > > Alexander Alten-Lorenz wrote: > Thats true, missed this. Was patching my local branch with and let run the build and a simple test. Sry. > > - Ship it, please add a valuable unit test The main reason I didn't add unit tests was due to the fact that any test would require a handful of sleeps to make sure everything is working as expected, further extending the already long test cycle. I'll have a look at the other testing code and see if I can put some kind of test together without adding too many delays - Juhani ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13301 ----------------------------------------------------------- On Nov. 7, 2012, 1:34 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 7, 2012, 1:34 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-14, 08:01
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/ ----------------------------------------------------------- (Updated Nov. 14, 2012, 8 a.m.) Review request for Flume and Mike Percy. Changes ------- Added a unit test. There's some commentary on a corner case in there... If you process an event just as the close is happening, what can happen is the following: - process gets handle to the bucketwriter - (in the timer thread)that bucketwriter times out, get closed and removed from map - (in the sink runner/test thread)the bucketwriter has append called on it, opening up a new file with a single append to it - a new bucketwriter is created for following events, which will be writing to a different location. it results in an extra file with a single append, so while not disastrous it's not exactly a great situation(though likely very rare) The only way I can see to fix it is to introduce a synchronization block around the map removal in the callback and the map.get/map.insert in the process() call. The synchronization block was one of the things that we had been wanting to avoid for this alternate method for the implementation... Description ------- Added lastWrite to BucketWriter to verify when it was last updated Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. This addresses bug FLUME-1660. https://issues.apache.org/jira/browse/FLUME-1660 Diffs (updated) ----- flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b Diff: https://reviews.apache.org/r/7659/diff/ Testing ------- Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> All other tests pass. Thanks, Juhani Connolly
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-16, 02:00
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/ ----------------------------------------------------------- (Updated Nov. 16, 2012, 2 a.m.) Review request for Flume and Mike Percy. Changes ------- Since no-one had other suggestions I addressed the issue mentioned in the last patch by using a flag to check if a bucket writer was closed for idling. I don't really like it, but to have consistency otherwise we'd need to have a lock shared over bucketwriter(for the bucketwriter map get/put section and the bucketWriter idleClose timer event). Or we could go back to the original implementation which deals with most of this in HDFSEventSink. It breaks up encapsulation of the rolling functionality into multiple classes, but it feels less forced to me than this solution. Feedback would be appreciated. Either way, this works. With the borderline case of sleeping barely more than the idle timeout, often the unit test will attempt get an IOException because the process tries to use a bucketWriter that is in the process of getting decommissioned. This is not a problem as the next process() will work fine. Description ------- Added lastWrite to BucketWriter to verify when it was last updated Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. This addresses bug FLUME-1660. https://issues.apache.org/jira/browse/FLUME-1660 Diffs (updated) ----- flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b Diff: https://reviews.apache.org/r/7659/diff/ Testing ------- Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> All other tests pass. Thanks, Juhani Connolly
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-16, 07:25
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13511 ----------------------------------------------------------- Ship it! Hey Juhani, nice work! Good catch, I agree that the race will be very rare and it's handled fine. Please attach the patch to the JIRA. - Mike Percy On Nov. 16, 2012, 2 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 2 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-16, 07:27
> On Nov. 16, 2012, 7:25 a.m., Mike Percy wrote: > > Hey Juhani, nice work! Good catch, I agree that the race will be very rare and it's handled fine. > > > > Please attach the patch to the JIRA. Oops, one thing! I think the idleClosed flag needs to be declared as volatile because the Callable and the process() method will be executed in different threads. - Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13511 ----------------------------------------------------------- On Nov. 16, 2012, 2 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 2 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-16, 07:33
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13513 ----------------------------------------------------------- flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java <https://reviews.apache.org/r/7659/#comment28973> One more thing: I think we should also set hdfs.rollInterval = 0 hdfs.rollSize = 0 hdfs.rollCount = 0 Since default rollCount = 10 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java <https://reviews.apache.org/r/7659/#comment28974> Let's also assert that neither file has a .tmp extension. - Mike Percy On Nov. 16, 2012, 2 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 2 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-16, 08:10
> On Nov. 16, 2012, 7:25 a.m., Mike Percy wrote: > > Hey Juhani, nice work! Good catch, I agree that the race will be very rare and it's handled fine. > > > > Please attach the patch to the JIRA. > > Mike Percy wrote: > Oops, one thing! I think the idleClosed flag needs to be declared as volatile because the Callable and the process() method will be executed in different threads. I'll do that - Juhani ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13511 ----------------------------------------------------------- On Nov. 16, 2012, 2 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 2 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-16, 08:12
> On Nov. 16, 2012, 7:33 a.m., Mike Percy wrote: > > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java, line 1032 > > <https://reviews.apache.org/r/7659/diff/7/?file=190617#file190617line1032> > > > > Let's also assert that neither file has a .tmp extension. this only really verifies that the sink shutdown works. I'll add in a check before the sink close that one is tmp and one isn't > On Nov. 16, 2012, 7:33 a.m., Mike Percy wrote: > > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java, line 996 > > <https://reviews.apache.org/r/7659/diff/7/?file=190617#file190617line996> > > > > One more thing: I think we should also set > > > > hdfs.rollInterval = 0 > > hdfs.rollSize = 0 > > hdfs.rollCount = 0 > > > > Since default rollCount = 10 will do - Juhani ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13513 ----------------------------------------------------------- On Nov. 16, 2012, 2 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 2 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesJuhani Connolly 2012-11-16, 10:06
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/ ----------------------------------------------------------- (Updated Nov. 16, 2012, 10:06 a.m.) Review request for Flume and Mike Percy. Changes ------- Improved the testing and made flag volatile. Before closing the sink, verifies there is one tmp and one non-tmp files. Also checks after closing that neither is tmp Description ------- Added lastWrite to BucketWriter to verify when it was last updated Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. This addresses bug FLUME-1660. https://issues.apache.org/jira/browse/FLUME-1660 Diffs (updated) ----- flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b Diff: https://reviews.apache.org/r/7659/diff/ Testing ------- Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> All other tests pass. Thanks, Juhani Connolly
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-19, 08:17
----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13558 ----------------------------------------------------------- Ship it! Looks good! I am getting a couple test errors from the integration tests that seem unrelated so I'm committing this anyway. - Mike Percy On Nov. 16, 2012, 10:06 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 10:06 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > >
-
Re: Review Request: FLUME-1660 Close "idle" hdfs handlesMike Percy 2012-11-19, 08:53
> On Nov. 19, 2012, 8:17 a.m., Mike Percy wrote: > > Looks good! I am getting a couple test errors from the integration tests that seem unrelated so I'm committing this anyway. FYI I ran a full build again and the tests are no longer failing, not sure what's flaky about them. - Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7659/#review13558 ----------------------------------------------------------- On Nov. 16, 2012, 10:06 a.m., Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7659/ > ----------------------------------------------------------- > > (Updated Nov. 16, 2012, 10:06 a.m.) > > > Review request for Flume and Mike Percy. > > > Description > ------- > > Added lastWrite to BucketWriter to verify when it was last updated > > Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed. > > > This addresses bug FLUME-1660. > https://issues.apache.org/jira/browse/FLUME-1660 > > > Diffs > ----- > > flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763 > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604 > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b > > Diff: https://reviews.apache.org/r/7659/diff/ > > > Testing > ------- > > Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting including disabling the feature(by using the default value for hdfs.closeIdleTimeout of 0) > > > There is one unrelated test failure which I'm not sure of(if anyone knows what's causing this, please let me know) > > Failed tests: testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0> > > All other tests pass. > > > Thanks, > > Juhani Connolly > > |