Flume >> mail # user >> HDFS sink leaves .tmp files


Chris Neal 2012-08-29, 14:18
Chris Neal 2012-09-10, 15:02
Eran Kutner 2012-09-10, 15:16
Chris Neal 2012-09-10, 15:21
Kathleen Ting 2012-09-10, 17:37
Chris Neal 2012-09-10, 18:59
Chris Neal 2012-09-10, 19:01
Bhaskar V. Karambelkar 2012-09-10, 20:08
Mike Percy 2012-09-10, 22:04
Kathleen Ting 2012-09-10, 22:09
Eran Kutner 2012-09-10, 23:15
Re: HDFS sink leaves .tmp files
Thanks Kathleen!
I'll download that build tomorrow morning and give it a whirl.

Chris

On Mon, Sep 10, 2012 at 5:09 PM, Kathleen Ting <[EMAIL PROTECTED]> wrote:

> [Moving to [EMAIL PROTECTED] |
> https://groups.google.com/a/cloudera.org/group/cdh-user/topics since
> this is getting to be CDH specific]
> bcc: [EMAIL PROTECTED]
>
> Chris,
>
> When the file has not been closed by the client, the file size may be
> shown as 0. The NameNode will not update the metadata about the file
> until the block is completed or the file handle is closed. Even if it
> updates at a block boundary, the size won't be accurate until the file
> is closed.
>
> The metadata takes some time to populate even though the files may
> contain data. The CDH4.1 version of Flume includes FLUME-1238, which
> auto-rolls files and shortens the period during which these files
> appear to be 0 size.
>
> Since the CDH3u5 version of Flume is compatible with CDH3* Hadoop and
> the CDH4 Flume is compatible with CDH4* Hadoop, you can download the
> nightly build of flume-ng-1.2.0-cdh4.1.0 from
> http://nightly.cloudera.com/cdh4/cdh/4/
>
> Regards, Kathleen
>
> On Mon, Sep 10, 2012 at 1:08 PM, Bhaskar V. Karambelkar
> <[EMAIL PROTECTED]> wrote:
> > Don't know about RPM, but there's a 1.2.x tarball of the 1.2 build @
> > http://archive.cloudera.com/cdh/3/flume-ng-1.2.0-cdh3u5.tar.gz
> >
> >
> > On Mon, Sep 10, 2012 at 3:01 PM, Chris Neal <[EMAIL PROTECTED]> wrote:
> >>
> >> Just checked, and from Cloudera, 1.1.0+121-1.cdh4.0.1.p0.1.el6 is still
> >> the latest from their yum repo.
> >>
> >>
> >> On Mon, Sep 10, 2012 at 1:59 PM, Chris Neal <[EMAIL PROTECTED]> wrote:
> >>>
> >>> I'm using a combination :)
> >>>
> >>> The application tier is 1.3.0-SNAPSHOT
> >>> The HDFS tier is CentOS, and I grabbed the latest (at the time) from the
> >>> CDH repo.  Its version is:  1.1.0+121-1.cdh4.0.1.p0.1.el6
> >>>
> >>> If the issue is on the HDFS sink side, then it could definitely be in my
> >>> version!
> >>> I'll check if Cloudera has a more recent version to update to.
> >>>
> >>> Thanks!
> >>> Chris
> >>>
> >>>
> >>> On Mon, Sep 10, 2012 at 12:37 PM, Kathleen Ting <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>> Chris, Eran, this appears to be FLUME-1238, which was fixed in
> >>>> Flume-1.2.0. Can you let me know if you are using Flume-1.2.0?
> >>>>
> >>>> Thanks, Kathleen
> >>>>
> >>>> On Mon, Sep 10, 2012 at 8:21 AM, Chris Neal <[EMAIL PROTECTED]> wrote:
> >>>> > Glad to know it's not just me :)
> >>>> >
> >>>> >
> >>>> > On Mon, Sep 10, 2012 at 10:16 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> >>>> >>
> >>>> >> I have the same problem. I roll every 1 minute, so I have tons of those
> >>>> >> .tmp files.
> >>>> >>
> >>>> >> -eran
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Mon, Sep 10, 2012 at 6:02 PM, Chris Neal <[EMAIL PROTECTED]> wrote:
> >>>> >>>
> >>>> >>> I'm still seeing this consistently every 24-hour period.  Does this
> >>>> >>> sound like a configuration issue, an issue with the Exec source, or an
> >>>> >>> issue with the HDFS sink?
> >>>> >>>
> >>>> >>> Thanks!
> >>>> >>>
> >>>> >>>
> >>>> >>> On Wed, Aug 29, 2012 at 9:18 AM, Chris Neal <[EMAIL PROTECTED]> wrote:
> >>>> >>>>
> >>>> >>>> Hi all,
> >>>> >>>>
> >>>> >>>> I have an Exec Source running a tail -F on a log4J-generated log file
> >>>> >>>> that gets rolled once a day.  It seems that when log4J rolls the file to
> >>>> >>>> the new date, the hdfs sink ends up with a .tmp file.  I haven't figured
> >>>> >>>> out if there is any data loss yet, but was curious if this is expected
> >>>> >>>> behavior?
> >>>> >>>>
> >>>> >>>> Thanks for your time.
> >>>> >>>> Chris
> >>>> >>>
> >>>> >>>
> >>>> >>
> >>>> >
> >>>
> >>>
> >>
> >
>
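
For anyone who wants to check how many .tmp files are lingering and whether they show as 0 bytes, here is a rough Python sketch that scans the text output of `hdfs dfs -ls -R`. It assumes the usual eight-column listing (permissions, replication, owner, group, size, date, time, path). The function name, the sample listing, and the /flume/events path are made up for illustration and are not from this thread. Keep in mind that, as explained above, a size of 0 can simply mean the file is still open, so this flags candidates and does not prove data loss.

```python
from typing import List, NamedTuple


class HdfsEntry(NamedTuple):
    size: int       # size in bytes as reported by the NameNode
    modified: str   # "YYYY-MM-DD HH:MM" from the listing
    path: str


def find_tmp_files(ls_output: str, suffix: str = ".tmp") -> List[HdfsEntry]:
    """Return file entries from `hdfs dfs -ls -R` text whose path ends with suffix."""
    entries = []
    for line in ls_output.splitlines():
        # Columns: permissions, replication, owner, group, size, date, time, path
        fields = line.split(None, 7)
        if len(fields) != 8 or fields[0].startswith("d"):
            continue  # skip "Found N items" headers, blank lines, and directories
        path = fields[7]
        if path.endswith(suffix):
            entries.append(HdfsEntry(int(fields[4]), f"{fields[5]} {fields[6]}", path))
    return entries


# Hypothetical listing, for illustration only.
SAMPLE = """\
Found 3 items
drwxr-xr-x   - flume supergroup          0 2012-09-10 00:00 /flume/events
-rw-r--r--   3 flume supergroup       4096 2012-09-09 23:59 /flume/events/FlumeData.1347235140000
-rw-r--r--   3 flume supergroup          0 2012-09-10 00:01 /flume/events/FlumeData.1347235260000.tmp
"""

if __name__ == "__main__":
    for entry in find_tmp_files(SAMPLE):
        print(entry.size, entry.modified, entry.path)
```

In practice the listing could be captured with something like `hdfs dfs -ls -R <your sink directory>` and passed to `find_tmp_files` as a string. Comparing the modified time against your roll interval gives a hint about which .tmp files were left behind rather than still being written.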
Chris Neal 2012-09-13, 14:38
Kathleen Ting 2012-09-13, 16:08
Shara Shi 2012-11-06, 02:47
Kathleen Ting 2012-11-06, 20:05