Hadoop >> mail # user >> Copying a file to specified nodes


Rasit OZDAS 2009-02-10, 13:05
Re: Copying a file to specified nodes
Hey Rasit,

I'm not sure I fully understand your description of the problem, but
you might want to check out the JIRA ticket for making the replica
placement algorithms in HDFS pluggable
(https://issues.apache.org/jira/browse/HADOOP-3799) and add your use
case there.

Regards,
Jeff

On Tue, Feb 10, 2009 at 5:05 AM, Rasit OZDAS <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> We have thousands of files, each dedicated to a user.  (Each user has
> access to other users' files, but they don't use it very often.)
> Each user runs map-reduce jobs on the cluster.
> So we should spread his/her files evenly across the cluster,
> so that every machine can take part in the process (assuming he/she is
> the only user running jobs).
> For this we should initially copy files to specified nodes:
> User A :   first file : Node 1, second file: Node 2, .. etc.
> User B :   first file : Node 1, second file: Node 2, .. etc.
>
> I know Hadoop also creates replicas, but in our solution at least one
> copy of each file will be in the right place
> (or we're willing to control the other replicas too).
>
> Rebalancing is also not a problem, assuming it uses the information
> about how much a computer is in use.
> It even helps for a better organization of files.
>
> How can we copy files to specified nodes?
> Or do you have a better solution for us?
>
> I couldn't find a solution to this; probably such an option doesn't exist.
> But I wanted to get an expert's opinion on this.
>
> Thanks in advance.
> Rasit
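
The per-user layout described in the question (first file on Node 1, second file on Node 2, and so on) amounts to a round-robin assignment. Below is a minimal sketch of that mapping only, assuming hypothetical names (`assign_round_robin`, `files_per_node`, the node labels and file names). It does not call any HDFS API and does not by itself make HDFS place blocks on those nodes; as the reply notes, controlling placement is the subject of HADOOP-3799.

```python
from collections import Counter


def assign_round_robin(files_by_user, nodes):
    """Return {user: {file: node}}, cycling through nodes for each user.

    Hypothetical helper: it only computes the desired layout from the
    question. It does not talk to HDFS or move any data.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    placement = {}
    for user, files in files_by_user.items():
        # Every user starts again at the first node, as in the question.
        placement[user] = {
            name: nodes[i % len(nodes)] for i, name in enumerate(files)
        }
    return placement


def files_per_node(user_placement):
    """Count how many of one user's files land on each node."""
    return dict(Counter(user_placement.values()))


if __name__ == "__main__":
    # Illustrative data only; these names are not from the thread.
    layout = assign_round_robin(
        {"A": ["a1", "a2", "a3"], "B": ["b1"]},
        ["node1", "node2"],
    )
    for user, mapping in layout.items():
        print(user, mapping, files_per_node(mapping))
```

Because every user's sequence starts at the first node, users with few files all load that node first. Starting each user at a different offset would be one way to even that out, though the question does not ask for it.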
Rasit OZDAS 2009-02-16, 08:07
Rasit OZDAS 2009-02-16, 15:17