Chukwa >> mail # user >> collector not closing files larger than 64MB


Re: collector not closing files larger than 64MB
I think I was the original source of that document, and of that misconception.
I thought you mentioned that to me over IM when you were first explaining
things to me, Ari, but I must have misunderstood.
On Tue, Jul 12, 2011 at 4:57 PM, Himanshu Gahlot <
[EMAIL PROTECTED]> wrote:

> Hi Ari,
>
> The Chukwa dataflow documentation
> (http://incubator.apache.org/chukwa/docs/r0.4.0/dataflow.html)
> states: "Collectors write chunks to logs/*.chukwa files until a
> 64MB chunk size is reached or a given time interval has passed." I
> think this should be corrected.
>
> Thanks,
> Himanshu
>
>
>
> On Tue, Jul 12, 2011 at 4:18 PM, Ariel Rabkin <[EMAIL PROTECTED]> wrote:
> > Howdy.
> >
> > There isn't such a check; if there's documentation somewhere that
> > suggests there is, let us know where and we can fix it. In general,
> > the goal is to have .done files as large as possible while compatible
> > with SLAs; there wasn't any intent to have them only be one block
> > long.
> >
> > --Ari
> >
> > On Tue, Jul 12, 2011 at 3:57 PM, Himanshu Gahlot
> > <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >>
> >> We have a system where the .chukwa files produced are larger than
> >> 64MB, but I do not see the collector closing them and converting them
> >> to .done files. They are closed only at the scheduled time and hence
> >> are larger than the block size (64MB). I do not see a check in the
> >> SeqFileWriter class that closes files larger than a block size. Where
> >> is this check made in the code?
> >>
> >> Thanks,
> >> Himanshu
> >>
> >
> >
> >
> > --
> > Ari Rabkin [EMAIL PROTECTED]
> > UC Berkeley Computer Science Department
> >
>
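
To make the behavior described in this thread concrete, here is a minimal sketch of a writer that rotates on elapsed time only, with no size check. It is an illustrative model under stated assumptions, not Chukwa's SeqFileWriter: the class name, method names, the five-minute interval, and the 1 MB chunk size are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, NOT Chukwa's SeqFileWriter: rotation depends on
// elapsed time only, so an open file can grow past one 64MB block.
class TimeOnlyRotationSketch {
    // Block size mentioned in the thread.
    static final long BLOCK_SIZE_BYTES = 64L * 1024 * 1024;

    private final long rotateIntervalMs;   // assumed rotation interval
    private long openedAtMs;               // when the current file was opened
    private long bytesInOpenFile = 0;      // size of the current ".chukwa" file
    private final List<Long> doneFileSizes = new ArrayList<>();

    TimeOnlyRotationSketch(long rotateIntervalMs, long nowMs) {
        this.rotateIntervalMs = rotateIntervalMs;
        this.openedAtMs = nowMs;
    }

    // Append a chunk. Note there is deliberately no size check here.
    void append(long chunkBytes, long nowMs) {
        rotateIfDue(nowMs);
        bytesInOpenFile += chunkBytes;
    }

    // Close the open file (record it as ".done") only when the interval has passed.
    void rotateIfDue(long nowMs) {
        if (nowMs - openedAtMs >= rotateIntervalMs) {
            if (bytesInOpenFile > 0) {
                doneFileSizes.add(bytesInOpenFile);
            }
            bytesInOpenFile = 0;
            openedAtMs = nowMs;
        }
    }

    long bytesInOpenFile() { return bytesInOpenFile; }

    List<Long> doneFileSizes() { return doneFileSizes; }

    public static void main(String[] args) {
        long oneMb = 1024L * 1024;
        TimeOnlyRotationSketch writer = new TimeOnlyRotationSketch(300_000L, 0L);

        // Write 100 chunks of 1 MB within the first minute.
        for (int i = 0; i < 100; i++) {
            writer.append(oneMb, i * 600L);
        }
        System.out.println("open file exceeds one block: "
                + (writer.bytesInOpenFile() > BLOCK_SIZE_BYTES));

        // Only the scheduled rotation closes the file.
        writer.rotateIfDue(300_000L);
        System.out.println("done file sizes (bytes): " + writer.doneFileSizes());
    }
}
```

In this model the only way to get smaller .done files is a shorter rotation interval, which matches the stated goal of keeping .done files as large as the SLAs allow.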