

KayVajj 2013-04-10, 22:20
Re: Copy Vs DistCP
DistCP is a full-blown MapReduce job (map-only, where the mappers do a
"fully" parallel copy to the destination).

CP appears (correct me if I'm wrong) to simply invoke the FileSystem API and
issue a copy command for every source file.
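
As a rough sketch of the two invocations being compared: the helper name
copy_cmd and the /data/... paths are made up for illustration, and the script
only prints the commands instead of running them, so it needs no cluster.

```shell
#!/bin/sh
# Sketch only: prints the copy command for each approach, does not execute it.
# copy_cmd and the example paths are hypothetical, not part of Hadoop.

copy_cmd() {
    mode="$1"; src="$2"; dst="$3"
    case "$mode" in
        # Shell copy: runs in the client process via the FileSystem API.
        cp)     echo "hdfs dfs -cp $src $dst" ;;
        # DistCP: submits a map-only MapReduce job that copies in parallel.
        distcp) echo "hadoop distcp $src $dst" ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

copy_cmd cp /data/in/file /data/out/
copy_cmd distcp /data/in /data/out
```

Which one wins for a given copy probably depends on how much data there is
versus the overhead of launching a job, so treat this as a starting point only.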

I have an additional question: how is CP, when run within a single cluster,
optimized (if at all)?
On Wed, Apr 10, 2013 at 6:20 PM, KayVajj <[EMAIL PROTECTED]> wrote:

> I have a few questions regarding the usage of DistCP for copying files in
> the same cluster.
>
>
> 1) Which one is better within the same cluster, and what factors (like file
> size etc.) would influence the usage of one over the other?
>
> 2) When we run a cp command like the one below from a client node of the
> cluster (not a data node), how does the cp command work?
>      i) Like an MR job, or
>     ii) by copying the files locally and then copying them back to the new
>         location?
>
> Example of the copy command
>
> hdfs dfs -cp /<some_location>/file /<new_location>/
>
> Thanks, your responses are appreciated.
>
> -- Kay
>

--
Jay Vyas
http://jayunit100.blogspot.com