-
Using Chukwa as monitoring tool.
Akshay Kumar 2010-12-16, 12:34
Hi, I have a Hadoop installation, and I want to collect some basic OS-level metrics such as CPU, memory, and disk usage, along with Hadoop metrics.
I have looked into Ganglia, but it requires installing agents on client machines, which is what I want to avoid.
My queries:
a) Is this a fair use case for Chukwa, e.g. polling client machines for CPU stats a few times per minute?
b) Is it possible to export the data collected by Chukwa collectors in a form readable by rrdtool-like graphing tools on the server side?
Thanks, Akshay
-
Re: Using Chukwa as monitoring tool.
Eric Fiala 2010-12-16, 14:54
Akshay, it is possible to take the data collected by Chukwa collectors and convert it to any format (I don't know whether code for RRD exists, but writing the conversion would be fairly trivial). However, your requirement for an agent-less setup cannot be met by Chukwa.
My understanding is that Honu provides similar log harvesting functionality, without the requirement for agents.
That said, Ganglia is a fairly proven workhorse. Out of the box, I have yet to encounter a simpler technology that provides as many cluster-centric metrics. Unfortunately, the agent is required.
hth, EF
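As a rough illustration of the conversion mentioned in this reply, here is a minimal sketch in Python. It assumes the collected metrics have already been extracted into (Unix timestamp, value) pairs; how that extraction happens on the Chukwa side is not covered here. The function name `to_rrd_updates` and the file name `cpu.rrd` are hypothetical. The only rrdtool behaviour relied on is the `rrdtool update FILE TIMESTAMP:VALUE` form and the rule that each update must be newer than the previous one.

```python
from typing import Iterable, List, Tuple


def to_rrd_updates(rrd_file: str,
                   samples: Iterable[Tuple[int, float]]) -> List[List[str]]:
    """Turn (unix_timestamp, value) samples into `rrdtool update` argument lists.

    Hypothetical helper: the sample format is an assumption, not Chukwa's
    native output format.
    """
    commands = []
    last_ts = None
    for ts, value in sorted(samples):
        # rrdtool rejects an update whose timestamp is not newer than the
        # previous one, so repeated timestamps are skipped.
        if last_ts is not None and ts <= last_ts:
            continue
        commands.append(["rrdtool", "update", rrd_file, f"{ts}:{value}"])
        last_ts = ts
    return commands
```

Each returned list could be passed to `subprocess.run` on a machine where rrdtool is installed and the RRD file has already been created with a matching data source.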
-
Re: Using Chukwa as monitoring tool.
Eric Yang 2010-12-16, 16:57
Hi Akshay,
A) Yes. You can use "add sigar.SystemMetrics SystemMetrics [interval] 0" to stream CPU state at a specified interval. For example, "add sigar.SystemMetrics SystemMetrics 5 0" (without the quotes) will stream CPU state every 5 seconds.
B) Chukwa has a built-in graphing tool called HICC. HICC requires HBase to be deployed.
However, an agent is still required on the client machines.
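For the adaptor command in answer A, the following Python sketch builds the command string for a given interval and, optionally, sends it to a running agent. Two things here are assumptions and not taken from this thread: that the trailing 0 is an initial offset, and that the agent accepts control commands as plain text on TCP port 9093 of the agent host. Check both against the agent documentation for your Chukwa version. The function names are hypothetical.

```python
import socket


def sigar_adaptor_command(interval_seconds: int, offset: int = 0) -> str:
    """Build the 'add sigar.SystemMetrics ...' command quoted in this thread."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return f"add sigar.SystemMetrics SystemMetrics {interval_seconds} {offset}"


def send_to_agent(command: str, host: str = "localhost",
                  port: int = 9093, timeout: float = 5.0) -> str:
    """Send one command line to the agent and return its first reply.

    Assumption: the agent listens for newline-terminated text commands on
    this port. Adjust host and port to match your agent configuration.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((command + "\n").encode("ascii"))
        return conn.recv(4096).decode("ascii", errors="replace")
```

Calling `sigar_adaptor_command(5)` yields the 5-second example from the answer; `send_to_agent` only works against a live agent, so it is worth trying by hand first.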
Regards, Eric
-
Re: Using Chukwa as monitoring tool.
ZHOU Qi 2010-12-17, 02:21
Hi Eric,
I read the Chukwa wiki, but there is little information about HICC. Where can I get a screenshot or demo of it?
Thanks,
-
Re: Using Chukwa as monitoring tool.
Eric Yang 2010-12-17, 02:46
Sure, here you go.
Regards, Eric
-
Re: Using Chukwa as monitoring tool.
ZHOU Qi 2010-12-17, 04:53
Got it. Thanks.
-
Re: Using Chukwa as monitoring tool.
Akshay Kumar 2010-12-26, 18:03
Hi,
Thanks for the responses. A bit late to check this one. I have one more query.
The Chukwa administration guide (http://people.apache.org/~eyang/docs/r0.1.2/admin.html) says: "Chukwa can also be installed on a single node, in which case the machine must have at least 16 GB of memory."
Q) For my use case (monitoring system metrics), is it safe to assume that the memory requirement will not be that large?
Thanks, Akshay
-
Re: Using Chukwa as monitoring tool.
Ariel Rabkin 2010-12-26, 18:08
Yes. That 16 GB number is for the HICC server, not for the collection side. And even then, it's if you have a lot of data (a whole cluster's worth) living in a MySQL database with a web application serving the data.
The monitoring agent and the collector are both fairly small-footprint.
--Ari
-- Ari Rabkin [EMAIL PROTECTED] UC Berkeley Computer Science Department
-
Re: Using Chukwa as monitoring tool.
Akshay Kumar 2010-12-26, 18:29
Thanks.
In my setup, I cannot afford (as of now) a machine with 16 GB of memory. Does that mean I cannot deploy Chukwa as a monitoring solution? I do not intend to do any log analysis or collection for now, just simple OS and Hadoop metrics.
I do not understand why 16 GB would be a hard limit even for minimal functioning. I imagine it should apply to a high-performance system, not to a bare-bones setup. What am I missing here?
-Akshay
-
Re: Using Chukwa as monitoring tool.
Ariel Rabkin 2010-12-26, 18:35
16 GB isn't a hard limit, just a suggestion. And that's based on the assumption that you have a big cluster and are collecting a lot of data and using the older MySQL-based infrastructure.
How much memory you need depends on what volume of data you're collecting and what you're doing with it. How do you intend to store the data and how will you be visualizing it?
--Ari
-
Re: Using Chukwa as monitoring tool.
Akshay Kumar 2010-12-28, 21:18
Hi,
I have GWT as the front end, where I want to embed this information in one of the following ways:
a) Simply embed images generated by an RRDtool-like tool. That means I will have to run rrdtool (I am looking at rrd4j) on the server side and convert the data to RRD format on the agent/server side.
b) Use a graphing library such as http://dygraphs.com/.
I am not expecting much volume. To start with, simple CPU, memory, and Hadoop metrics from 20 or so machines, collected at a rate of no more than 10 samples per minute per metric.
Thanks, Akshay
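To put the volume described in this message into numbers, here is a small back-of-the-envelope calculation in Python. The machine count and sampling rate come from the message; the number of metrics per machine (3) and the size of one stored sample (100 bytes) are placeholder assumptions chosen only for illustration.

```python
def samples_per_day(machines: int, metrics_per_machine: int,
                    samples_per_minute: int) -> int:
    """Total samples collected across the cluster in 24 hours."""
    minutes_per_day = 60 * 24
    return machines * metrics_per_machine * samples_per_minute * minutes_per_day


def bytes_per_day(samples: int, bytes_per_sample: int) -> int:
    """Raw storage needed for one day of samples, ignoring any overhead."""
    return samples * bytes_per_sample


# 20 machines and 10 samples per minute are from the message above;
# 3 metrics per machine and 100 bytes per sample are assumptions.
daily_samples = samples_per_day(machines=20, metrics_per_machine=3,
                                samples_per_minute=10)
daily_bytes = bytes_per_day(daily_samples, bytes_per_sample=100)
print(daily_samples, daily_bytes)
```

Under these assumptions the result is 864,000 samples and roughly 86 MB per day, which suggests this is a small workload as far as memory and storage sizing are concerned.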
-
Re: Using Chukwa as monitoring tool.
Eric Yang 2010-12-29, 03:33
Hi Akshay,
In both options, data downsampling is required. RRDtool downsamples data when it is written to the RRD files. Chukwa 0.4 uses MySQL for data downsampling. The graph is then rendered using the flot (http://code.google.com/p/flot/) graphing library. There was also a prototype to render graphs on the server side with JFreeChart. However, there was no clear interface to expose graphable data.
In Chukwa 0.5, we are decoupling the data from the graph library. There is a REST API to get metrics data (see https://issues.apache.org/jira/browse/CHUKWA-520). However, Chukwa 0.5 is still under development, and data downsampling has shifted from SQL statements to MapReduce/Pig Latin scripts. I have not determined what will be in the final framework. It will most likely use Oozie as the workflow scheduling engine to run MapReduce/Pig Latin jobs that provide the downsampling and aggregation framework.
You are welcome to try out code from trunk (0.5). The current limitations are that you should avoid using a large time range and that there is no aggregation. Hope this helps.
regards, Eric
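To make the downsampling step concrete, here is a minimal Python sketch that averages raw samples into fixed-width time buckets. It illustrates the general idea only; it is not how RRDtool, Chukwa 0.4, or the planned 0.5 jobs actually implement it, and the function name `downsample_average` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def downsample_average(samples: Iterable[Tuple[int, float]],
                       bucket_seconds: int) -> List[Tuple[int, float]]:
    """Average (unix_timestamp, value) samples into fixed-width buckets.

    Returns one (bucket_start, mean_value) pair per non-empty bucket,
    ordered by time.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of the bucket it falls into.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, sum(values) / len(values))
            for start, values in sorted(buckets.items())]
```

For example, samples taken every 5 seconds could be reduced to one point per minute with `downsample_average(samples, 60)` before being handed to a graphing library, which keeps wide time ranges cheap to plot.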
-
Re: Using Chukwa as monitoring tool.
Akshay Kumar 2010-12-30, 10:25
Thanks so much Eric.
I will take some time to grasp all this and try out stuff. Will definitely get back as and when I have some feedback to give.

Regards,
Akshay

On 29 December 2010 09:03, Eric Yang <[EMAIL PROTECTED]> wrote:
> Hi Akshay,
>
> In both options, data down sampling is required. RRDtool is doing
> data down sampling when the data is written to the RRD files. Chukwa
> 0.4 uses mysql for data down sampling. The graph is then rendered
> using flot ( http://code.google.com/p/flot/) graphing library to serve
> the data. There was also a prototype to render graph on the server
> side with jfreechart. However, there was no clear interface to expose
> graph-able data.
>
> In Chukwa 0.5, we are decoupling the data from the graph library.
> There is a REST API interface to get metrics data. (See
> https://issues.apache.org/jira/browse/CHUKWA-520) However, Chukwa 0.5
> is still under development, the data down sampling has shifted from
> sql statements into mapreduce/pig-latin script. I have not determined
> what will be in the final framework. It is most likely to use Oozie
> as workflow scheduling engine to run mapreduce/pig-latin jobs to
> provide down sampling and aggregation framework.
>
> You are welcome to try out code from trunk (0.5). The current
> limitation is to avoid using a large time range and there is no
> aggregation. Hope this helps.
>
> regards,
> Eric
>
> On Tue, Dec 28, 2010 at 1:18 PM, Akshay Kumar <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I have GWT as the front-end, where I want to embed this information in one
> > of the following ways:
> > a) Simply embed RRDtool kind of generated images. That means, I will have to
> > run rrdtool ( I am looking at rrd4j) on server side and convert the data to
> > RRD format on agent/server side.
> > b) Use some graphing library - like http://dygraphs.com/.
> > I am not expecting too much of volume. To start with simple CPU, Memory and
> > hadoop metrics collected from 20 or so machines collected at a rate not more
> > than 10 per minute per metric.
> > Thanks,
> > Akshay
> > On 27 December 2010 00:05, Ariel Rabkin <[EMAIL PROTECTED]> wrote:
> >>
> >> 16 GB isn't a hard limit, just a suggestion. And that's based on the
> >> assumption that you have a big cluster and are collecting a lot of
> >> data and using the older MySQL based infrastructure.
> >>
> >> How much memory you need depends on what volume of data you're
> >> collecting and what you're doing with it. How do you intend to store
> >> the data and how will you be visualizing it?
> >>
> >> --Ari
> >>
> >> On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar <[EMAIL PROTECTED]>
> >> wrote:
> >> > Thanks,
> >> > In my setup, I can not afford ( as of now) to have a machine with 16GB
> >> > memory.
> >> > So that means, I can not deploy Chukwa as a monitoring solution ? I do not
> >> > intend to do any log analysis / collection for now - just simple OS and
> >> > hadoop metrics.
> >> >
> >> > I mean, I do not understand why would one have 16GB as hard limit for
> >> > minimal functioning too.
> >> > I imagine it should be for a high performance system and not bare-bones
> >> > structure. What am I missing here?
> >> >
> >> > -Akshay
> >> >
> >> > On 26 December 2010 23:38, Ariel Rabkin <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >> Yes. That 16 GB number is for the HICC server, not for the collection
> >> >> side. And even then, it's if you have a lot of data (a whole cluster's
> >> >> worth) living in a MySQL database with a web application serving the
> >> >> data.
> >> >>
> >> >> The monitoring agent and the collector are both fairly small-footprint.
> >> >>
> >> >> --Ari
> >> >>
> >> >> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar <[EMAIL PROTECTED]>
> >> >> wrote:
> >> >> > Hi,
> >> >> > Thanks for the responses. A bit late to check this one.
> >> >> > I have one more query -
> >> >> > In the Chukwa administration guide:
> >> >> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
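
The down-sampling mentioned in the quoted reply (done by RRDtool when it writes RRD files, and by MySQL in Chukwa 0.4) can be illustrated with a minimal Python sketch. This is not Chukwa or RRDtool code; the function name `downsample`, the sample data, and the 60-second bucket width are assumptions made up for illustration. It averages raw (timestamp, value) samples into fixed-width time buckets, which is roughly what an averaging consolidation step does.

```python
from collections import defaultdict
from statistics import mean


def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Hypothetical helper for illustration only; timestamps are integer
    seconds, and each bucket is labelled by its start time.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Round the timestamp down to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up CPU readings taken every few seconds.
raw = [(0, 10.0), (5, 20.0), (10, 30.0), (60, 40.0), (65, 60.0)]

# Collapse them into one averaged point per minute.
print(downsample(raw, 60))
```

Coarser buckets trade detail for storage, which is why a graphing front-end over a large time range usually reads the down-sampled series rather than the raw one.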
-
Re: Using Chukwa as monitoring tool.
Eric Yang 2010-12-26, 20:46
For development, I am running on MacOSX 10.6 with only 2GB of RAM. Chukwa can run with a small memory footprint, but not with optimal performance. The recommended memory size is for a production system where you need to monitor thousands of nodes in a cluster. Chukwa is designed with parallelism in mind. Hence, there is a lot of initial overhead for setting up parallelism, which is not necessary if the data size is small. Try to figure out if any of these things apply to you:

- You generate more than 1TB of data per day (it takes a hadoop cluster of 2000+ nodes to produce this type of volume)
- The number of data sources saturates TCP connections, so the monitoring system needs to do software load balancing
- You need raw data, because digested data can't support your analysis use case

Use Chukwa if any of these items applies to you; otherwise Ganglia is also a great way to monitor hadoop at smaller scale.

regards,
Eric

On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar <[EMAIL PROTECTED]> wrote:
> Thanks,
> In my setup, I can not afford ( as of now) to have a machine with 16GB
> memory.
> So that means, I can not deploy Chukwa as a monitoring solution ? I do not
> intend to do any log analysis / collection for now - just simple OS and
> hadoop metrics.
>
> I mean, I do not understand why would one have 16GB as hard limit for
> minimal functioning too.
> I imagine it should be for a high performance system and not bare-bones
> structure. What am I missing here?
>
> -Akshay
>
> On 26 December 2010 23:38, Ariel Rabkin <[EMAIL PROTECTED]> wrote:
>>
>> Yes. That 16 GB number is for the HICC server, not for the collection
>> side. And even then, it's if you have a lot of data (a whole cluster's
>> worth) living in a MySQL database with a web application serving the
>> data.
>>
>> The monitoring agent and the collector are both fairly small-footprint.
>>
>> --Ari
>>
>> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar <[EMAIL PROTECTED]>
>> wrote:
>> > Hi,
>> > Thanks for the responses. A bit late to check this one.
>> > I have one more query -
>> > In the Chukwa administration guide:
>> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
>> > It says
>> > Chukwa can also be installed on a single node, in which case the machine
>> > must have at least 16 GB of memory.
>> >
>> > Q) For my usecase ( for monitoring system metrics) - is it safe to
>> > assume it
>> > is not going to be that big a requirement for memory?
>> >
>> > Thanks,
>> > Akshay
>> >
>> >
>> > On 17 December 2010 10:23, ZHOU Qi <[EMAIL PROTECTED]> wrote:
>> >>
>> >> Got it. Thanks.
>> >>
>> >> 2010/12/17 Eric Yang <[EMAIL PROTECTED]>:
>> >> > Sure, here you go.
>> >> >
>> >> > Regards,
>> >> > Eric
>> >> >
>> >> > On 12/16/10 6:21 PM, "ZHOU Qi" <[EMAIL PROTECTED]> wrote:
>> >> >
>> >> > Hi Eric,
>> >> >
>> >> > I read the wiki of Chukwa, but there is less information about HICC.
>> >> > From where I can get its screen-shot or demo?
>> >> >
>> >> > Thanks,
>> >> > 2010/12/17 Eric Yang <[EMAIL PROTECTED]>:
>> >> >> Hi Akshay,
>> >> >>
>> >> >> A) Yes. You can use “add sigar.SystemMetrics SystemMetrics [interval] 0” to
>> >> >> stream CPU state at specified interval. For example:
>> >> >>
>> >> >> “add sigar.SystemMetrics SystemMetrics 5 0” without quotes will stream CPU
>> >> >> state every 5 seconds.
>> >> >>
>> >> >> B) Chukwa has a graphing tool built in which is called HICC. It requires
>> >> >> Hbase deployed in order to use HICC.
>> >> >>
>> >> >> However, agent is still required on the client machines.
>> >> >>
>> >> >> Regards,
>> >> >> Eric
>> >> >>
>> >> >> On 12/16/10 4:34 AM, "Akshay Kumar" <[EMAIL PROTECTED]> wrote:
>> >> >>
>> >> >> Hi,
>> >> >> I have a Hadoop installation, and I want to collect some basic OS level
>> >> >> metrics like - cpu, memory, disk usage, and Hadoop metrics.
>> >> >>
>> >> >> I have looked into Ganglia, but it requires installing agents on client
>> >> >> machines, which is what I want to avoid.
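
To put the 1TB-per-day threshold in perspective for a small deployment, here is a rough back-of-the-envelope sketch in Python. The machine count and sampling rate follow the figures Akshay gives elsewhere in this thread (about 20 machines, at most 10 samples per minute per metric). The number of metrics per machine and the bytes per sample are assumptions chosen purely for illustration, not Chukwa measurements, and `bytes_per_day` is a hypothetical helper.

```python
MINUTES_PER_DAY = 24 * 60

# Figures taken from the thread.
MACHINES = 20
SAMPLES_PER_MINUTE = 10

# Assumptions for illustration only; adjust to your own setup.
METRICS_PER_MACHINE = 50
BYTES_PER_SAMPLE = 200


def bytes_per_day(machines, metrics, samples_per_minute, bytes_per_sample):
    """Estimate the raw monitoring data volume produced in one day."""
    samples = machines * metrics * samples_per_minute * MINUTES_PER_DAY
    return samples * bytes_per_sample


daily = bytes_per_day(
    MACHINES, METRICS_PER_MACHINE, SAMPLES_PER_MINUTE, BYTES_PER_SAMPLE
)
threshold = 10**12  # 1 TB per day, the first criterion in the list above

print(f"{daily / 10**9:.2f} GB/day, {daily / threshold:.2%} of 1 TB/day")
```

Under these assumptions the estimate stays a few orders of magnitude below the threshold, which matches the advice that the 16 GB recommendation targets much larger clusters.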