Chris Neal 2012-08-29, 14:18
Chris Neal 2012-09-10, 15:02
Eran Kutner 2012-09-10, 15:16
Chris Neal 2012-09-10, 15:21
Kathleen Ting 2012-09-10, 17:37
Chris Neal 2012-09-10, 18:59
Chris Neal 2012-09-10, 19:01
Bhaskar V. Karambelkar 2012-09-10, 20:08
Mike Percy 2012-09-10, 22:04
Kathleen Ting 2012-09-10, 22:09
Eran Kutner 2012-09-10, 23:15
Chris Neal 2012-09-11, 01:38
Chris Neal 2012-09-13, 14:38
Kathleen Ting 2012-09-13, 16:08
Shara Shi 2012-11-06, 02:47
Kathleen Ting 2012-11-06, 20:05
-
HDFS sink leaves .tmp files
Chris Neal 2012-08-29, 14:18
Hi all,

I have an Exec Source running a tail -F on a log4j-generated log file that gets rolled once a day. It seems that when log4j rolls the file to the new date, the HDFS sink ends up with a .tmp file. I haven't figured out if there is any data loss yet, but was curious if this is expected behavior?

Thanks for your time.
Chris
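
For readers unfamiliar with the setup being described, the following is a minimal sketch of an agent that tails a log file into HDFS. It is not the poster's actual configuration: the agent, source, channel and sink names (`a1`, `tailSrc`, `memCh`, `hdfsSink`) and both paths are assumptions made up for illustration, and the two tiers mentioned later in the thread are collapsed into a single agent for brevity.

```properties
# Hypothetical names: agent "a1", source "tailSrc", channel "memCh", sink "hdfsSink".
a1.sources = tailSrc
a1.channels = memCh
a1.sinks = hdfsSink

# Exec source following a log file that log4j rolls daily (path is assumed).
a1.sources.tailSrc.type = exec
a1.sources.tailSrc.command = tail -F /var/log/app/app.log
a1.sources.tailSrc.channels = memCh

a1.channels.memCh.type = memory

# HDFS sink; a file that is still open for writing carries a .tmp suffix
# and is renamed when the sink closes it.
a1.sinks.hdfsSink.type = hdfs
a1.sinks.hdfsSink.channel = memCh
a1.sinks.hdfsSink.hdfs.path = /flume/events
```

A .tmp file that never loses its suffix therefore suggests the sink did not get to close that file cleanly.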
-
Re: HDFS sink leaves .tmp files
Chris Neal 2012-09-10, 15:02
I'm still seeing this consistently every 24-hour period. Does this sound like a configuration issue, an issue with the Exec source, or an issue with the HDFS sink?

Thanks!
-
Re: HDFS sink leaves .tmp files
Eran Kutner 2012-09-10, 15:16
I have the same problem. I roll every 1 minute so I have tons of those .tmp files.

-eran
-
Re: HDFS sink leaves .tmp files
Chris Neal 2012-09-10, 15:21
Glad to know it's not just me :)
-
Re: HDFS sink leaves .tmp files
Kathleen Ting 2012-09-10, 17:37
Chris, Eran, this appears to be FLUME-1238, which was fixed in Flume-1.2.0. Can you let me know if you are using Flume-1.2.0?

Thanks, Kathleen
-
Re: HDFS sink leaves .tmp files
Chris Neal 2012-09-10, 18:59
I'm using a combination :)

The application tier is 1.3.0-SNAPSHOT.
The HDFS tier is CentOS, and I grabbed the latest (at the time) from the CDH repo. Its version is: 1.1.0+121-1.cdh4.0.1.p0.1.el6

If the issue is on the HDFS sink side, then it could definitely be in my version!
I'll check if Cloudera has a more recent version to update to.

Thanks!
Chris
-
Re: HDFS sink leaves .tmp files
Chris Neal 2012-09-10, 19:01
Just checked, and from Cloudera, 1.1.0+121-1.cdh4.0.1.p0.1.el6 is still the latest from their yum repo.
-
Re: HDFS sink leaves .tmp files
Bhaskar V. Karambelkar 2012-09-10, 20:08
Don't know about RPM, but there's a 1.2.x tarball of the 1.2 build @ http://archive.cloudera.com/cdh/3/flume-ng-1.2.0-cdh3u5.tar.gz
-
Re: HDFS sink leaves .tmp files
Mike Percy 2012-09-10, 22:04
Yeah - that's built against CDH3 though. But I think it would work anyway.
-
Re: HDFS sink leaves .tmp files
Kathleen Ting 2012-09-10, 22:09
[Moving to [EMAIL PROTECTED] | https://groups.google.com/a/cloudera.org/group/cdh-user/topics since this is getting to be CDH specific]
bcc: [EMAIL PROTECTED]

Chris,

When the file has not been closed by the client, the file size may be shown as 0. The NameNode will not update the metadata about the file until the block is completed or the file handle is closed. Even if it updates at a block boundary, the size won't be accurate until the file is closed.

The metadata takes some time to populate even though the files may contain data. The CDH4.1 version of Flume includes FLUME-1238, which will do auto-rolling of files and helps lower the period where these files appear to be 0 size.

Since the CDH3u5 version of Flume is compatible with CDH3* Hadoop and the CDH4 Flume is compatible with CDH4* Hadoop, you can download the nightly build of flume-ng-1.2.0-cdh4.1.0 from http://nightly.cloudera.com/cdh4/cdh/4/

Regards, Kathleen
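
Since an open file only reports an accurate size once it is closed, how often the sink closes (rolls) its files matters in practice. The fragment below is a rough sketch of the roll-related properties on a Flume HDFS sink. The agent and sink names (`a1`, `hdfsSink`) and all values are assumptions chosen for illustration, and the thread does not say whether these particular settings are what FLUME-1238 changed.

```properties
# Hypothetical agent "a1" and sink "hdfsSink"; values are illustrative only.
a1.sinks.hdfsSink.type = hdfs
# Close the current file and start a new one after this many seconds (0 disables).
a1.sinks.hdfsSink.hdfs.rollInterval = 300
# Roll after this many bytes have been written (0 disables size-based rolling).
a1.sinks.hdfsSink.hdfs.rollSize = 0
# Roll after this many events have been written (0 disables count-based rolling).
a1.sinks.hdfsSink.hdfs.rollCount = 0
```

With only the time-based trigger enabled, as sketched here, each file would stay open (and possibly show as 0 bytes) for at most roughly the configured interval while events keep arriving.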
-
Re: HDFS sink leaves .tmp files
Eran Kutner 2012-09-10, 23:15
I'm using Flume 1.2 on CDH4:

Flume 1.2.0
Subversion https://svn.apache.org/repos/asf/flume/tags/flume-1.2.0-rc1 -r 1360090
Compiled by mpercy on Wed Jul 11 02:54:24 PDT 2012
From source with checksum 8eb3085d75407e62436c83fab38c8645

But I think I figured out why it's happening. I have multiple Flume agents writing to the same directory. I assumed the annoying number that gets appended to the file names was there to make them unique, but it doesn't. Although it looks like a timestamp, it is incremented sequentially, which means multiple servers are trying to write the same files at the same time. I think that was causing the problem.
I've now added a hostname interceptor and added the hostname to the file names to distinguish them, and at least for now the problem seems to be gone.

-eran
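
The workaround described above might look something like the following. This is a sketch under assumptions, not the poster's actual configuration: the agent, source, interceptor and sink names (`a1`, `src1`, `hostInt`, `hdfsSink`), the header name `hostname` and the `events-` prefix are all made up for illustration. The `host` interceptor type and the `%{header}` substitution in `hdfs.filePrefix` are taken from the Flume user guide.

```properties
# Hypothetical names: agent "a1", source "src1", sink "hdfsSink".
a1.sources.src1.interceptors = hostInt
a1.sources.src1.interceptors.hostInt.type = host
# Store the hostname (rather than the IP address) in an event header named "hostname".
a1.sources.src1.interceptors.hostInt.useIP = false
a1.sources.src1.interceptors.hostInt.hostHeader = hostname

# Put the header value into the file name so each agent writes distinct files.
a1.sinks.hdfsSink.hdfs.filePrefix = events-%{hostname}
```

The idea is simply that two agents writing into the same directory no longer generate the same file name, even if their numeric suffixes coincide.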
-
Re: HDFS sink leaves .tmp files
Chris Neal 2012-09-11, 01:38
Thanks Kathleen!
I'll download that build tomorrow morning and give it a whirl.

Chris
-
Re: HDFS sink leaves .tmp files
Chris Neal 2012-09-13, 14:38
Just to follow up, the .tmp file problem did go away using 1.3.0-SNAPSHOT on the HDFS sink agent.

Thanks again Kathleen :)
-
Re: HDFS sink leaves .tmp files
Kathleen Ting 2012-09-13, 16:08
Chris, glad to hear and glad to be of help. Thanks for letting us know that it worked.

Regards, Kathleen
-
Re: HDFS sink leaves .tmp files Shara Shi 2012-11-06, 02:47
I ran into a similar problem: a .tmp file appears on HDFS when HDFS is rebooted.
I think the file is not being closed properly, but I don't know how to handle this problem.

Shara
Shara Shi 2012-11-06, 02:47
-
Re: Re: HDFS sink leaves .tmp files Kathleen Ting 2012-11-06, 20:05
Hi Shara,
The .tmp file contains the actual data before it is renamed to the final file. If the .tmp file is still open, then Flume is still holding it open in order to write to it. If Flume somehow dies and is unable to clean up the .tmp files, then once the client lease expires, the NameNode will consider it closed (but the NameNode will not rename it from .tmp - we recommend writing a script to do that).

Regards, Kathleen
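For anyone looking for a starting point, here is a minimal sketch of the kind of rename script suggested above. It is not part of Flume or Hadoop: the function names, the example paths, and the two-hour age threshold are all hypothetical choices made for illustration. The only external command it relies on is `hadoop fs -mv`. It assumes you have already collected (path, modification time) pairs for the files in the sink directory by some means of your own, and it only renames .tmp files that have not been modified for a while, so that files Flume is still writing to are left alone.

```python
import posixpath
import subprocess
import time

TMP_SUFFIX = ".tmp"
# Hypothetical threshold: pick something comfortably longer than the HDFS
# lease expiry and your sink's roll interval. Two hours is only an example.
MIN_AGE_SECONDS = 2 * 60 * 60


def final_name(path):
    """Return the path with the trailing .tmp removed, or None if the
    file is not a .tmp file."""
    name = posixpath.basename(path)
    if name.endswith(TMP_SUFFIX) and name != TMP_SUFFIX:
        return path[:-len(TMP_SUFFIX)]
    return None


def plan_renames(entries, now, min_age_seconds=MIN_AGE_SECONDS):
    """entries is an iterable of (path, mtime_in_epoch_seconds).
    Returns a list of (src, dst) pairs for .tmp files old enough to be
    treated as abandoned."""
    plan = []
    for path, mtime in entries:
        dst = final_name(path)
        if dst is not None and now - mtime >= min_age_seconds:
            plan.append((path, dst))
    return plan


def apply_renames(plan, dry_run=True):
    """Rename each file with `hadoop fs -mv`. With dry_run=True the
    commands are only printed, not executed."""
    for src, dst in plan:
        cmd = ["hadoop", "fs", "-mv", src, dst]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    # Illustrative entries only; in practice these would come from a
    # listing of the HDFS sink directory.
    now = time.time()
    example_entries = [
        ("/flume/events/FlumeData.1347289200000.tmp", now - 3 * 60 * 60),
        ("/flume/events/FlumeData.1347292800000.tmp", now - 60),
    ]
    apply_renames(plan_renames(example_entries, now), dry_run=True)
```

Run it with the dry run enabled first and check the printed commands before letting it rename anything, since a file that looks abandoned may still belong to a running agent.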
Kathleen Ting 2012-11-06, 20:05
|