-
network compression between agent and collector
Sourygna Luangsay 2012-07-24, 15:50
Hi guys,
I have searched both the documentation and the mailing lists but have found nothing.
Do you know if there is a way to configure Chukwa so that chunk transfers between agent and collector are compressed (gzip, lzo or other)?
Since most of our logs are text files, this would be a big help in reducing our network bandwidth usage.
Regards,
Sourygna Luangsay
-
Re: network compression between agent and collector
Ariel Rabkin 2012-07-24, 16:31
There is not currently a way to configure that, I don't think.
Shouldn't be that hard a feature to add, if it's something you need. If you decide to go ahead and try, I'm sure the community would be happy to help.
--Ari
On Tue, Jul 24, 2012 at 11:50 AM, Sourygna Luangsay <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> I have been looking for it both in documentation and mailing lists but have
> found nothing.
>
> Do you know if there is a way to configure Chukwa so that chunk transfers
> between agent and collector are compressed (gzip, lzo or other) ?
>
> Since most of our logs are text files, this would prove a high benefit in
> order to decrease network bandwidth.
>
> Regards,
>
> Sourygna Luangsay

--
Ari Rabkin [EMAIL PROTECTED]
Princeton Computer Science Department
-
Re: network compression between agent and collector
Sourygna Luangsay 2012-07-29, 00:19
Hi Ari,
Yes, we do need such a feature for one of our projects, so we plan to develop it. When I come back from holidays I'll create a JIRA.
Meanwhile, don't hesitate to tell me if you have ideas for interesting features related to such compression, or any advice on implementing it. For instance, I am not really sure right now at which level I should apply the compression in the Chukwa Agent:
- at the whole DataOutputBuffer level?
- at the "data" field of every Chunk?
- or at the adaptor level?
(At first sight, the DataOutputBuffer level seems the easiest to implement.)
Thanks,
Sourygna
-
Re: network compression between agent and collector
Ariel Rabkin 2012-07-29, 03:29
I would do the DataOutputBuffer level -- in general, compressing bigger blocks is more efficient since the compression algorithm has more room to find duplicates. But trying to stripe across buffers would leave you with awkwardness in the presence of missing data.
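To make the block-size point concrete, here is a small sketch that gzips a set of similar log lines once per record and once as a single buffer. It uses only java.util.zip from the JDK; the class name, the helper methods and the sample log lines are all made up for illustration and are not part of Chukwa.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Illustrative only: none of these names come from Chukwa.
class BlockSizeComparison {

    /** Gzips the input and returns the compressed size in bytes. */
    static int gzipSize(byte[] input) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return out.size();
    }

    /** Total bytes when every record is compressed on its own. */
    static int perRecordSize(String[] records) throws IOException {
        int total = 0;
        for (String r : records) {
            total += gzipSize(r.getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    /** Bytes when all records are concatenated and compressed once. */
    static int wholeBufferSize(String[] records) throws IOException {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        for (String r : records) {
            all.write(r.getBytes(StandardCharsets.UTF_8));
        }
        return gzipSize(all.toByteArray());
    }

    /** Made-up log lines that share most of their text, as logs often do. */
    static String[] sampleLogLines(int n) {
        String[] lines = new String[n];
        for (int i = 0; i < n; i++) {
            lines[i] = "2012-07-24 15:50:" + (i % 60)
                    + " INFO example.Service: handled request " + i + "\n";
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String[] lines = sampleLogLines(200);
        System.out.println("compressed per record:    " + perRecordSize(lines) + " bytes");
        System.out.println("compressed as one buffer: " + wholeBufferSize(lines) + " bytes");
    }
}
```

Each separately compressed record pays the fixed gzip header and trailer and cannot reuse text seen in earlier records, so the single-buffer total should come out noticeably smaller on repetitive input like this.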
I would start with the DataOutputBuffer strategy, since it's easy to do and not obviously the wrong thing -- if it seems to work satisfactorily, declare victory and contribute the patch.
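A rough sketch of what the buffer-level strategy could look like, under loud assumptions: a plain java.io.DataOutputStream stands in for the agent's DataOutputBuffer, the chunks are reduced to strings, and the serialization layout (a count followed by the records) is invented here rather than taken from Chukwa's wire format. Only the gzip wrapping around the already-serialized bytes is the point.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in; not Chukwa's agent or collector code.
class BufferGzipSketch {

    /** Agent side: write all chunks into one buffer (invented layout). */
    static byte[] serialize(String[] chunks) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(chunks.length);
            for (String c : chunks) {
                out.writeUTF(c);
            }
        }
        return bytes.toByteArray();
    }

    /** Agent side: gzip the whole serialized buffer before sending it. */
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        }
        return bytes.toByteArray();
    }

    /** Collector side: gunzip the received body back to the raw buffer. */
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                bytes.write(buf, 0, n);
            }
        }
        return bytes.toByteArray();
    }

    /** Collector side: read the chunks back out of the raw buffer. */
    static String[] deserialize(byte[] raw) throws IOException {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(raw))) {
            String[] chunks = new String[in.readInt()];
            for (int i = 0; i < chunks.length; i++) {
                chunks[i] = in.readUTF();
            }
            return chunks;
        }
    }

    public static void main(String[] args) throws IOException {
        String[] chunks = {"line one of a log\n", "line two of a log\n"};
        byte[] wire = compress(serialize(chunks));
        String[] received = deserialize(decompress(wire));
        System.out.println("chunks received: " + received.length);
    }
}
```

Because each buffer is compressed independently, a lost or rejected buffer does not affect the collector's ability to decompress the next one, which is the property the per-buffer choice is meant to preserve.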
On Sat, Jul 28, 2012 at 8:19 PM, Sourygna Luangsay <[EMAIL PROTECTED]> wrote:
> Hi Ari,
>
> Yes, we do need such feature for a project of us. So plan to develop it.
> When I come back from holidays I'll create a JIRA.
>
> Meanwhile, don't hesitate to tell me more if you have any idea of some
> interesting
> features linked with such compression, or any advice to implement it. For
> instance, I am not
> really sure right now at which level I should set the compression in the
> Chukwa Agent:
> - at the whole DataOutputBuffer level?
> - at the "data" field of every Chunk?
> - or at the adaptor level?
> (at first sight, the DataOutputBuffer level seems the easier to impement).
>
> Thanks,
>
> Sourygna

--
Ari Rabkin [EMAIL PROTECTED]
Princeton Computer Science Department