Hadoop >> mail # user >> access patterns investigation to dynamically toggle the replication factor in a hadoop cluster


access patterns investigation to dynamically toggle the replication factor in a hadoop cluster

Hi all,

As part of the research for an ongoing project, we are interested in
investigating whether data access patterns on a Hadoop cluster can be
predicted. The purpose is to study file access patterns as a time
series, so that data can be manipulated proactively. For example, this
may involve increasing or decreasing the replication factor in an
Apache Hadoop cluster (and accordingly in HDFS) to deal with a
predicted increase or decrease in data accesses.
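
To make the idea concrete, here is a rough sketch of what we have in mind. Everything in it is an assumption for illustration only: the moving-average forecast, the `accesses_per_replica` threshold, the bounds on the replication factor, and the function names are all hypothetical and not something we have implemented or measured. The only existing piece is the HDFS shell command `hdfs dfs -setrep`, which the sketch builds but does not run.

```python
import math


def predict_next(accesses, window=3):
    """Forecast the next interval's access count as a moving average.

    `accesses` is a list of per-interval access counts for one file
    (hypothetical input; a real study would likely use a better model).
    """
    if not accesses:
        return 0.0
    recent = accesses[-window:]
    return sum(recent) / len(recent)


def target_replication(predicted, accesses_per_replica=100, min_rep=3, max_rep=10):
    """Map a predicted access count to a replication factor.

    Assumes one replica can serve `accesses_per_replica` accesses per
    interval; the result is clamped to [min_rep, max_rep].
    """
    wanted = math.ceil(predicted / accesses_per_replica)
    return max(min_rep, min(max_rep, wanted))


def setrep_command(path, replication):
    """Build (but do not run) the HDFS shell command that sets replication."""
    return ["hdfs", "dfs", "-setrep", str(replication), path]


if __name__ == "__main__":
    history = [10, 20, 300, 400, 500]  # made-up access counts
    factor = target_replication(predict_next(history))
    print(" ".join(setrep_command("/data/example", factor)))
```

The command could then be handed to `subprocess.run` ahead of the predicted load change; whether the extra replicas actually help job performance is exactly what we are asking about below.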

So we would like your advice on some issues:
1) is this the correct mailing list? :)
2) would a changed replication factor translate to better performance
of an MR job? (Either from your own experience, or if you have in mind
a report/paper etc. that has studied this.)
3) do you find this interesting in general and something we should pursue?
4) are you aware of any related work on the topic we could use as a
starting point?

Thanks for your help,
George
George Kousiouris 2012-09-05, 16:13