Hive >> mail # user >> parallel inserts ?


John B 2012-02-15, 15:59
Re: parallel inserts ?
Hi John
Yes, inserts are parallel by default in Hive. HiveQL gets transformed into MapReduce jobs, so the work is parallel by construction. The only case where it is not parallel is when you have just 1 reducer. The job simply reads and processes the input files in parallel from the source table's data dir and writes the desired output files to the destination table's dir.

Hive is just an abstraction over MapReduce and can't be compared against a database in terms of features. Almost every data processing operation is just one or more MapReduce jobs.
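
One way to check this on your own cluster is to look at the plan Hive generates. The following is only a sketch, assuming a Hive release from around this time running on classic MapReduce. It reuses the pv_gender and pv_users tables from the question; the reducer count of 8 is an arbitrary example value, not a recommendation. As far as I know, a plain projection like this one is normally planned as a map-only job, in which case the map tasks do the writing and the reducer setting has no effect.

```sql
-- Print the plan instead of running the query. The stage list shows
-- whether the insert has a map phase only or a map and a reduce phase.
EXPLAIN
INSERT OVERWRITE TABLE pv_gender
SELECT pv_users.gender
FROM pv_users;

-- Only relevant when the plan contains a reduce stage: fix the number
-- of reducers for this session. 8 is an arbitrary example value.
SET mapred.reduce.tasks=8;

-- Run the insert itself. Each writing task produces its own output
-- file under the destination table's directory.
INSERT OVERWRITE TABLE pv_gender
SELECT pv_users.gender
FROM pv_users;
```

After the job finishes, listing the destination table's directory should show roughly one file per writing task, unless Hive's small-file merge step has combined them.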
Regards
Bejoy K S

From handheld, Please excuse typos.

-----Original Message-----
From: John B <[EMAIL PROTECTED]>
Date: Wed, 15 Feb 2012 10:59:09
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: parallel inserts ?

Other SQL databases typically can parallelize selects but are unable to
automatically parallelize inserts.

With the most recent stable HiveQL, will the following statement have
the --insert-- automatically parallelized?

 INSERT OVERWRITE TABLE pv_gender
 SELECT pv_users.gender
 FROM pv_users
I understand there is now 'insert into .. select from' syntax. Is the
insert part of that statement automatically parallelized?

What is the highest insert speed anybody has seen? I am not
talking about imports; I mean inserts from one table to another.
