Flume >> mail # user


In flume-ng, are there any advantages to a 2-tier topology in a cluster of 30-40 nodes?
Hi

In our scenario there are around 30 machines from which we want to put
data into HDFS.

Now the approach we thought of initially was:

1. First tier: agents that collect data from the source and pass it to
an Avro sink.
2. Second tier: let's call these agents 'collectors'. They collect data
from the first-tier agents and then write it to HDFS.
(Second-tier agents are fewer in number, say 4:1.)
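
For illustration, the two tiers described above could be sketched in Flume
configuration roughly as follows. This is only a sketch under assumptions:
the agent names (agent1, collector1), the exec source and its log path, the
collector hostname, the port, and the HDFS path are all hypothetical
placeholders and are not taken from the actual setup.

```properties
# --- First tier: one agent per source machine (all names are placeholders) ---
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = avroSink

# Assumed source: tail a local log file (the real source type is not stated)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app/app.log
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# Avro sink forwards events to a second-tier collector
agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.hostname = collector1.example.com
agent1.sinks.avroSink.port = 4545
agent1.sinks.avroSink.channel = ch1

# --- Second tier: a collector that receives Avro events and writes to HDFS ---
collector1.sources = avroSrc
collector1.channels = ch1
collector1.sinks = hdfsSink

collector1.sources.avroSrc.type = avro
collector1.sources.avroSrc.bind = 0.0.0.0
collector1.sources.avroSrc.port = 4545
collector1.sources.avroSrc.channels = ch1

collector1.channels.ch1.type = memory
collector1.channels.ch1.capacity = 10000

# Hypothetical HDFS location
collector1.sinks.hdfsSink.type = hdfs
collector1.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com:8020/flume/events
collector1.sinks.hdfsSink.hdfs.fileType = DataStream
collector1.sinks.hdfsSink.channel = ch1
```

With a 4:1 ratio, roughly four first-tier agents would point their Avro sink
at the same collector host and port.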

Instead of the above topology, I could simply use an HDFS sink in the
first-tier agents. That would serve the purpose.
Also, the number of nodes is small (say 30), so it won't hurt the HDFS
namenode much compared to a case where the number of nodes is, say, 1000.
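
For comparison, the single-tier alternative could look roughly like the
sketch below, where each of the ~30 agents writes to HDFS directly. As
before, the agent name, the exec source, the log path, and the HDFS path
are hypothetical placeholders, not the actual settings.

```properties
# Single tier: every source machine runs an agent with its own HDFS sink
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = hdfsSink

# Assumed source: tail a local log file (the real source type is not stated)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app/app.log
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# Hypothetical HDFS location; each agent opens its own files here
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com:8020/flume/events
agent1.sinks.hdfsSink.hdfs.fileType = DataStream
agent1.sinks.hdfsSink.channel = ch1
```

In this layout the number of concurrent HDFS writers equals the number of
source machines, which is the namenode load the comparison above refers to.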

But apart from that, I don't see any advantage in adding the second tier.
Is there any advantage I am missing in terms of failover, HDFS
performance, or any other parameter?

Regards,
Jagadish

Replies in this thread:
Alexander Alten-Lorenz 2013-01-30, 06:26
Jagadish Bihani 2013-01-30, 14:43
Jagadish Bihani 2013-02-01, 11:08
Alexander Alten-Lorenz 2013-02-01, 13:15
Jagadish Bihani 2013-02-01, 13:27