Flume dev list: Review Request: FLUME-1660 Close "idle" hdfs handles

Juhani Connolly 2012-10-19, 04:31
Juhani Connolly 2012-10-19, 06:01
Mike Percy 2012-10-29, 23:49
Mike Percy 2012-10-29, 23:51

Re: Review Request: FLUME-1660 Close "idle" hdfs handles


> On Oct. 29, 2012, 11:49 p.m., Mike Percy wrote:
> > How about we just use the existing timedRollerPool inside the BucketWriter to do this? Just pass closeIdleTimeout to the BucketWriter constructor. At the end of each append, we can just do something like:
> >
> > if (idleCloseFuture != null) idleCloseFuture.cancel(false);
> > idleCloseFuture = timedRollerPool.schedule(new Runnable() {
> >   public void run() {
> >     try {
> >       close();
> >     } catch (Throwable t) {
> >       LOG.error("Unexpected error", t);
> >       if (t instanceof Error) {
> >         throw (Error) t;
> >       }
> >     }
> >   }
> > }, idleTimeout, TimeUnit.SECONDS);
> >
> > This is essentially how the rollInterval timer works (see the implementation in BucketWriter.doOpen()). Note that you would also want to cancel this future in doClose(), as we do for the rollInterval timer.
> >
> > This approach is certainly slower than just doing a System.currentTimeMillis(), but it's not too bad... Executing future.cancel(false) and timedRollerPool.schedule() seems to take a combined 1.5 microseconds on my laptop. We could put this logic in the doFlush() method and effectively only reset the idle timer at the end of a transaction, which would amortize the cost to almost nil in most cases.
> >
> > The benefit is that if files are rolling too fast, we have a configurable thread pool there to avoid jobs stacking up, whereas a single thread can fall behind. Also, it avoids a synchronization block and iterating through the sfWriters map, and keeps the rolling logic mostly contained in the BucketWriter. It also avoids creating new threads / thread pools.
>
> Mike Percy wrote:
>     Edit: the above should say: at the end of each doFlush(), cancel/reschedule the idleCloseFuture.
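
For illustration, here is a self-contained sketch of the cancel-and-reschedule pattern suggested above. The names timedRollerPool, idleCloseFuture, doFlush(), and doClose() come from the review comment; the rest is a simplified stand-in under assumed semantics, not the actual Flume BucketWriter.

import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for BucketWriter, showing only the idle-close wiring.
public class IdleClosingWriter {

    private final ScheduledExecutorService timedRollerPool =
            Executors.newScheduledThreadPool(1);
    private final long idleTimeout; // seconds; 0 means the feature is off
    private ScheduledFuture<?> idleCloseFuture;

    public IdleClosingWriter(long idleTimeout) {
        this.idleTimeout = idleTimeout;
    }

    // Called at the end of each transaction, per the suggestion above:
    // cancel any pending idle-close and restart the countdown.
    public synchronized void doFlush() throws IOException {
        // ... flush the underlying stream ...
        if (idleTimeout > 0) {
            if (idleCloseFuture != null) {
                idleCloseFuture.cancel(false);
            }
            idleCloseFuture = timedRollerPool.schedule(new Runnable() {
                public void run() {
                    try {
                        doClose();
                    } catch (Throwable t) {
                        System.err.println("Unexpected error on idle close: " + t);
                    }
                }
            }, idleTimeout, TimeUnit.SECONDS);
        }
    }

    // As with the rollInterval timer, the pending idle-close must be
    // cancelled when the writer is closed for any other reason.
    public synchronized void doClose() throws IOException {
        if (idleCloseFuture != null) {
            idleCloseFuture.cancel(false);
            idleCloseFuture = null;
        }
        // ... close the underlying file ...
    }
}

Cancelling with cancel(false) lets an in-flight close finish rather than interrupting it, matching how the rollInterval timer is described as working.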

Hmm... I can see that as a viable approach, but I'm curious what happens with the sfWriters map in HDFSEventSink... It seems like old writers are just abandoned there forever? I would like to clean them up properly (I believe this is common in the use case where events are dumped to files named by date). While not major, it does seem like it would lead to a buildup of inactive writers. We've had OOM errors when running Flume with an HDFS sink using the default memory settings; I have no idea whether that's related, but perhaps it could be. As far as I can tell, nowhere other than the stop method is the sfWriters map ever cleaned up.
- Juhani
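
To make the concern above concrete, here is a hypothetical sketch of the kind of eviction being asked for. The Writer interface and removeIdleWriter() are invented for illustration and are not part of the patch or of HDFSEventSink; only the sfWriters name comes from the code under discussion.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the sink evicts a writer from its cache when the
// writer is closed for being idle. Names here are illustrative only.
public class SinkCacheSketch {

    public interface Writer {
        void close();
    }

    // Mirrors HDFSEventSink.sfWriters: bucket path -> open writer.
    private final Map<String, Writer> sfWriters = new HashMap<String, Writer>();

    // If a writer closes itself on idle but its map entry stays behind,
    // date-bucketed paths leave one dead entry per bucket, growing without
    // bound. Evicting on idle-close keeps the map bounded by the number
    // of active buckets.
    public synchronized void removeIdleWriter(String bucketPath) {
        Writer w = sfWriters.remove(bucketPath);
        if (w != null) {
            w.close();
        }
    }
}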
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7659/#review12894
-----------------------------------------------------------
On Oct. 19, 2012, 6:01 a.m., Juhani Connolly wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7659/
> -----------------------------------------------------------
>
> (Updated Oct. 19, 2012, 6:01 a.m.)
>
>
> Review request for Flume and Mike Percy.
>
>
> Description
> -------
>
> Added a lastWrite field to BucketWriter to record when it was last updated.
>
> Added a thread to HDFSEventSink which checks the last update time of each active BucketWriter and closes any that have been idle longer than the configurable timeout hdfs.closeIdleTimeout.
>
>
> This addresses bug FLUME-1660.
>     https://issues.apache.org/jira/browse/FLUME-1660
>
>
> Diffs
> -----
>
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 29ead84
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java bce8e11
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java a6d624b
>
> Diff: https://reviews.apache.org/r/7659/diff/
>
>
> Testing
> -------
>
> Local machine testing was performed, confirming that files are closed correctly and that the configuration setting behaves as expected, including disabling the feature (by using the default hdfs.closeIdleTimeout value of 0).
>
>
> There is one unrelated test failure whose cause I'm not sure of (if anyone knows what's causing it, please let me know).
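
For context, a rough sketch of the design the description outlines: writers record a lastWrite timestamp, and a periodic task closes and evicts any writer idle for longer than hdfs.closeIdleTimeout, with 0 (the default) disabling the sweep. Apart from lastWrite and closeIdleTimeout, every name here is an invented stand-in rather than the actual patch.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Rough sketch of the described design: a periodic sweep that closes and
// evicts writers that have been idle longer than closeIdleTimeout.
public class IdleSweepSketch {

    static class TimedWriter {
        volatile long lastWrite = System.currentTimeMillis();

        void append(byte[] event) {
            lastWrite = System.currentTimeMillis();
            // ... write the event ...
        }

        void close() {
            // ... close the underlying file ...
        }
    }

    // Mirrors HDFSEventSink.sfWriters: bucket path -> open writer.
    private final Map<String, TimedWriter> sfWriters =
            new ConcurrentHashMap<String, TimedWriter>();

    private final ScheduledExecutorService sweeper =
            Executors.newSingleThreadScheduledExecutor();

    // closeIdleTimeout is in seconds; 0 (the default) disables the feature.
    void startIdleSweep(final long closeIdleTimeout) {
        if (closeIdleTimeout <= 0) {
            return;
        }
        sweeper.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                long cutoff = System.currentTimeMillis() - closeIdleTimeout * 1000L;
                Iterator<Map.Entry<String, TimedWriter>> it =
                        sfWriters.entrySet().iterator();
                while (it.hasNext()) {
                    Map.Entry<String, TimedWriter> e = it.next();
                    if (e.getValue().lastWrite < cutoff) {
                        e.getValue().close();
                        it.remove(); // evict too, addressing the buildup concern
                    }
                }
            }
        }, closeIdleTimeout, closeIdleTimeout, TimeUnit.SECONDS);
    }
}

Evicting in the same sweep keeps the map bounded by the number of recently active buckets, which speaks to the OOM concern raised earlier in the thread.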

Juhani Connolly 2012-10-31, 06:11
Juhani Connolly 2012-10-31, 06:12
Mike Percy 2012-10-31, 07:27
Mike Percy 2012-10-31, 07:44
Juhani Connolly 2012-10-31, 07:46
Juhani Connolly 2012-10-31, 07:48
Juhani Connolly 2012-10-31, 08:19
Juhani Connolly 2012-10-31, 10:49
Juhani Connolly 2012-10-31, 10:53
Mike Percy 2012-11-02, 09:34
Alexander Alten-Lorenz 2012-11-06, 08:36
Juhani Connolly 2012-11-07, 01:34
Alexander Alten-Lorenz 2012-11-07, 09:15
Mike Percy 2012-11-09, 20:01
Alexander Alten-Lorenz 2012-11-11, 10:56
Juhani Connolly 2012-11-12, 03:07
Juhani Connolly 2012-11-14, 08:01
Juhani Connolly 2012-11-16, 02:00
Mike Percy 2012-11-16, 07:25
Mike Percy 2012-11-16, 07:27
Juhani Connolly 2012-11-16, 08:10
Mike Percy 2012-11-16, 07:33
Juhani Connolly 2012-11-16, 08:12
Juhani Connolly 2012-11-16, 10:06
Mike Percy 2012-11-19, 08:17
Mike Percy 2012-11-19, 08:53