MapReduce >> mail # user >> Re: One petabyte of data loading into HDFS with in 10 min.


Re: One petabyte of data loading into HDFS with in 10 min.
Hello Shailesh,

      Give distcp a shot. It runs a MapReduce job to copy data from source to
destination, so the data is copied in parallel.
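
For illustration, here is a minimal sketch of how a distcp run might be launched from Python. The cluster addresses, paths, and map count are hypothetical placeholders, and the helper names are invented for this example. It assumes the `hadoop` client is on the PATH and that the source is on a filesystem the cluster can read; `-m` caps the number of simultaneous map tasks doing the copy.

```python
import shlex
import subprocess


def build_distcp_command(src, dst, num_maps=None):
    """Build the argument list for a `hadoop distcp` invocation."""
    cmd = ["hadoop", "distcp"]
    if num_maps is not None:
        # -m caps the number of simultaneous map tasks doing the copy.
        cmd += ["-m", str(num_maps)]
    cmd += [src, dst]
    return cmd


def run_distcp(src, dst, num_maps=None):
    """Run distcp and raise CalledProcessError if the job fails."""
    cmd = build_distcp_command(src, dst, num_maps)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical cluster addresses and paths, for illustration only.
    print(shlex.join(build_distcp_command(
        "hdfs://nn1:8020/data/in", "hdfs://nn2:8020/data/in", num_maps=20)))
```

Only `run_distcp` actually contacts the cluster; `build_distcp_command` can be used to inspect the command line first.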

Regards,
    Mohammad Tariq

On Wed, Sep 5, 2012 at 7:44 PM, Shailesh Dargude <
[EMAIL PROTECTED]> wrote:

> Sorry, Prabhu, for hijacking this discussion a bit. I wonder what the
> best practice is for loading data into HDFS in general. Considering the
> size of the data (often gigabytes or terabytes), how are storage and
> time constraints handled?
>
> If anybody can share experiences or best practices, it would be great!
>
> -Shailesh.
>
> From: Chen He [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, September 05, 2012 7:34 PM
> To: [EMAIL PROTECTED]
> Subject: Re: One petabyte of data loading into HDFS with in 10 min.
>
> If it is not a single file, you can upload the files to HDFS using
> multiple threads.
>
> On Wed, Sep 5, 2012 at 7:21 AM, prabhu K <[EMAIL PROTECTED]> wrote:
>
> Hi Users,
>
> Please clarify the questions below.
>
> 1. To load one petabyte of data into HDFS/Hive within 10 minutes, how
> many slave (DataNode) machines are required?
>
> 2. To load one petabyte of data into HDFS/Hive within 10 minutes, what
> configuration is needed in a cloud computing setup?
>
> Please suggest and help me on this.
>
> Thanks & Regards,
> Prabhu.
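
To put the first question in numbers, here is a rough back-of-the-envelope sketch. The per-node write rate of 100 MB/s is purely an assumption for illustration, as is the function name; replication factor 3 is the HDFS default. The result is only a lower bound from disk write throughput and ignores network, NameNode, and source-side limits.

```python
import math

PETABYTE = 10**15  # decimal petabyte, in bytes
WINDOW_SECONDS = 10 * 60


def required_datanodes(total_bytes, seconds, replication, node_write_bytes_per_sec):
    """Lower bound on DataNodes needed to absorb the write load in time."""
    # Every block is written `replication` times across the cluster.
    aggregate = total_bytes * replication / seconds
    return math.ceil(aggregate / node_write_bytes_per_sec)


ingest_rate = PETABYTE / WINDOW_SECONDS
print(f"Ingest rate: {ingest_rate / 1e12:.2f} TB/s")  # about 1.67 TB/s

# Assumed: replication factor 3 and 100 MB/s sustained writes per node.
print(required_datanodes(PETABYTE, WINDOW_SECONDS, 3, 100e6))  # 50000
```

Under those assumed figures the cluster would need on the order of 50,000 DataNodes just to absorb the writes, which gives a sense of the scale involved.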
>
>  ****
>
> ** **
>
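
The multi-threaded upload suggested by Chen He in the quoted message could be sketched roughly as follows. This is one possible approach, not his actual method: it assumes the files sit on a local filesystem, that the `hadoop` client is on the PATH, and that shelling out to `hadoop fs -put` once per file is acceptable. The function names and the `upload` parameter are invented for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def put_file(local_path, hdfs_dir):
    """Copy one local file into HDFS with `hadoop fs -put`."""
    subprocess.run(["hadoop", "fs", "-put", local_path, hdfs_dir], check=True)
    return local_path


def upload_all(local_paths, hdfs_dir, workers=8, upload=put_file):
    """Upload files concurrently; return (succeeded, failed) path lists."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its path so failures can be reported.
        futures = {pool.submit(upload, p, hdfs_dir): p for p in local_paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                succeeded.append(path)
            except Exception:
                failed.append(path)
    return succeeded, failed
```

Threads are sufficient here because each worker mostly waits on an external process; the `upload` parameter lets the copy step be swapped out, for example for a dry run.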