Hive, mail # user - Re: Difference between -put command and Hive Load command?


Re: Difference between -put command and Hive Load command?
Harsh J 2013-02-13, 05:47
Please send Hive-related questions to the Hive user community list
([EMAIL PROTECTED]) instead of to me directly. For more details, read
http://hive.apache.org/mailing_lists.html#Users. I've added the list
to my response here; please carry forward discussions on the list :-)

Neither is faster than the other, and both use the same underlying FS
implementations.

What -put does is copy a file from the local FS to HDFS. Hive's
"LOAD DATA LOCAL" command does almost the same thing: it copies the
local file into the table's location. If your data is already on
HDFS, then "LOAD DATA" (without LOCAL) is akin to an "fs -mv"
operation, which moves the file instead of copying its contents. That
is what may have led you to believe it is "faster" in comparison.

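As a rough local-filesystem analogy (plain Python, not the Hadoop or
Hive APIs; the helper names put_like and mv_like are made up for
illustration), the copy-versus-move difference looks something like
this:

```python
import os
import shutil
import tempfile


def put_like(src, dst_dir):
    """Copy src into dst_dir; the source file stays where it was.

    Stands in for "fs -put" or "LOAD DATA LOCAL" in this analogy.
    """
    return shutil.copy(src, dst_dir)


def mv_like(src, dst_dir):
    """Rename src into dst_dir; the file's contents are not rewritten.

    Stands in for "fs -mv" or "LOAD DATA" on data already in HDFS.
    """
    dst = os.path.join(dst_dir, os.path.basename(src))
    os.rename(src, dst)
    return dst


def demo():
    """Load one file by copy and one by move, then report what is left."""
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "staging")
        table = os.path.join(root, "table")
        os.makedirs(staging)
        os.makedirs(table)

        copied = os.path.join(staging, "a.txt")
        moved = os.path.join(staging, "b.txt")
        for path in (copied, moved):
            with open(path, "w") as f:
                f.write("1\tfoo\n")

        put_like(copied, table)
        mv_like(moved, table)

        return {
            "copied_source_remains": os.path.exists(copied),
            "moved_source_remains": os.path.exists(moved),
            "table_files": sorted(os.listdir(table)),
        }
```

Calling demo() should show both files in the "table" directory, with
only the copied file still present in "staging". The move looks fast
because it only changes where the file is recorded, not because the
load path itself is any quicker.
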
On Wed, Feb 13, 2013 at 11:02 AM,  <[EMAIL PROTECTED]> wrote:
> Hi Harsha,
>
> What is the difference between the "hadoop fs -put" command and Hive's "LOAD DATA" command? I created an external table and loaded data using "hadoop fs -put", and then tried the "LOAD DATA" command in Hive.
>
> It seems Hive's "LOAD DATA" command is much faster than "-put".
>
> Can you please explain the reason for this?
>
> Thanks
> Shreehari

--
Harsh J