Re: .tmp in hdfs sink
Thanks a lot!! Now with this change, what should the expected behaviour be?
After a file is closed, is a new file created for writes that come in after
the close?

Thanks again for committing this change. Do you know when 1.3.0 will be out?
I am currently using the 1.3.0 snapshot.

On Tue, Nov 20, 2012 at 11:16 AM, Mike Percy <[EMAIL PROTECTED]> wrote:

> Mohit,
> FLUME-1660 is now committed and it will be in 1.3.0. In the case where you
> are using 1.2.0, I suggest running with hdfs.rollInterval set so the files
> will roll normally.
>
> Regards,
> Mike
>
>
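As a rough illustration of the 1.2.0 workaround Mike describes, a minimal HDFS sink configuration sketch; the agent, channel, and path names here are hypothetical, and hdfs.rollInterval is the only essential piece:

    agent1.sinks.hdfsSink.type = hdfs
    agent1.sinks.hdfsSink.channel = ch1
    agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events/%Y/%m/%d/%H
    # Close and rename the current .tmp file every 600 seconds,
    # regardless of how much data has been written to it.
    agent1.sinks.hdfsSink.hdfs.rollInterval = 600
    # Optional: disable size- and count-based rolling so only the interval applies.
    agent1.sinks.hdfsSink.hdfs.rollSize = 0
    agent1.sinks.hdfsSink.hdfs.rollCount = 0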
> On Thu, Nov 15, 2012 at 11:23 PM, Juhani Connolly <
> [EMAIL PROTECTED]> wrote:
>
>>  I am actually working on a patch for exactly this; refer to FLUME-1660.
>>
>> The patch is on Review Board right now. I fixed a corner-case issue that
>> came up in unit testing, but the implementation is not really to my
>> satisfaction. If you are interested, please have a look and add your opinion.
>>
>> https://issues.apache.org/jira/browse/FLUME-1660
>> https://reviews.apache.org/r/7659/
>>
>>
>> On 11/16/2012 01:16 PM, Mohit Anchlia wrote:
>>
>> Another question I had was about rollover. What's the best way to roll
>> over files in a reasonable timeframe? For instance, our path is
>> YY/MM/DD/HH, so every hour there is a new file, and the previous hour's
>> file just sits there as .tmp; it sometimes takes as long as an hour before
>> the .tmp is closed and renamed to .snappy. In this situation, is there a
>> way to tell Flume to roll over files sooner based on some idle time limit?
>>
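For the hourly-bucket scenario above, a sketch of how the FLUME-1660 idle-close behaviour could be configured, assuming the change is exposed as the hdfs.idleTimeout setting in 1.3.0 snapshots and later; the agent, channel, and path names are again hypothetical:

    agent1.sinks.hdfsSink.type = hdfs
    agent1.sinks.hdfsSink.channel = ch1
    # Hourly buckets: a new directory and file per hour of event time.
    agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events/%Y/%m/%d/%H
    # Close a bucket's .tmp file after 300 seconds with no writes, so the
    # previous hour's file is renamed shortly after traffic moves to the new
    # hour. A later event for the same bucket simply opens a new file.
    agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
    # A time-based roll as a backstop is still reasonable.
    agent1.sinks.hdfsSink.hdfs.rollInterval = 3600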
>> On Thu, Nov 15, 2012 at 8:14 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>>
>>> Thanks Mike, it makes sense. Is there any way I can help?
>>>
>>>
>>> On Thu, Nov 15, 2012 at 11:54 AM, Mike Percy <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Mohit, this is a complicated issue. I've filed
>>>> https://issues.apache.org/jira/browse/FLUME-1714 to track it.
>>>>
>>>> In short, it would require a non-trivial amount of work to implement
>>>> this, and it would need to be done carefully. I agree that it would be
>>>> better if Flume handled this case more gracefully than it does today.
>>>> Today, Flume assumes that you have some job that would go and clean up the
>>>> .tmp files as needed, and that you understand that they could be partially
>>>> written if a crash occurred.
>>>>
>>>> Regards,
>>>> Mike
>>>>
>>>>
>>>> On Sun, Nov 11, 2012 at 8:32 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> What we are seeing is that if Flume gets killed, either because of a
>>>>> server failure or for other reasons, it leaves the .tmp file around.
>>>>> Sometimes, for whatever reason, the .tmp file is not readable. Is there
>>>>> a way to roll over the .tmp file more gracefully?
>>>>>
>>>>
>>>>
>>>
>>
>>
>