

Re: Difference between -put command and Hive Load command?
Please send Hive-related questions to the Hive user mailing list
([EMAIL PROTECTED]) rather than to me directly. For details, see
http://hive.apache.org/mailing_lists.html#Users. I've added the list
to this reply; please continue the discussion there :-)

Neither is faster than the other, and both use the same underlying FS
implementations.

"-put" copies a file from the local FS to HDFS. Hive's "LOAD DATA LOCAL"
command does almost the same thing. If your data is already on HDFS,
however, "LOAD DATA" (without LOCAL) is akin to an "fs -mv" operation:
the files are moved rather than copied, which is what may have led you
to believe it is "faster" in comparison.
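
To make the copy-versus-move distinction concrete, here is a rough sketch in Python. It is only a local-filesystem analogy, not Hadoop or Hive code: the function names put_like and load_like and the staging/table directories are made up for illustration. A copy has to read and rewrite every byte, while a rename on the same filesystem only updates metadata.

```python
import os
import shutil
import tempfile


def put_like(src, dest_dir):
    """Copy src into dest_dir; the source file stays where it was.

    Loosely analogous to 'fs -put' or LOAD DATA LOCAL (assumption: analogy only).
    """
    return shutil.copy(src, dest_dir)


def load_like(src, dest_dir):
    """Rename src into dest_dir; the source path no longer exists afterwards.

    Loosely analogous to 'fs -mv' or LOAD DATA on data already in HDFS.
    """
    dest = os.path.join(dest_dir, os.path.basename(src))
    os.rename(src, dest)  # same filesystem, so no file contents are rewritten
    return dest


def demo():
    # Hypothetical layout: 'staging' holds incoming files, 'table' stands in
    # for a table's directory.
    root = tempfile.mkdtemp()
    staging = os.path.join(root, "staging")
    table = os.path.join(root, "table")
    os.makedirs(staging)
    os.makedirs(table)

    a = os.path.join(staging, "a.txt")
    b = os.path.join(staging, "b.txt")
    for path in (a, b):
        with open(path, "w") as handle:
            handle.write("1\tfoo\n")

    put_like(a, table)   # copy: a.txt exists in both places
    load_like(b, table)  # move: b.txt exists only under 'table'

    result = {
        "copy_source_remains": os.path.exists(a),
        "move_source_remains": os.path.exists(b),
        "table_files": sorted(os.listdir(table)),
    }
    shutil.rmtree(root)
    return result
```

One practical consequence, visible in the sketch: after the move-style load, the file is gone from its original location, whereas the copy leaves the source in place.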

On Wed, Feb 13, 2013 at 11:02 AM,  <[EMAIL PROTECTED]> wrote:
> Hi Harsha,
>
> What is the difference between the "hadoop fs -put" command and Hive's "LOAD DATA" command? I created an external table and loaded data into it using "hadoop fs -put", and I also tried the "LOAD DATA" command in Hive.
>
> Hive's "LOAD DATA" command seems to be much faster than "-put".
>
> Can you please explain the reason for this?
>
> Thanks
> Shreehari

--
Harsh J