Tony Burton 2013-03-26, 17:11
Re: S3/EMR Hive: Load contents of a single file
First of all, you cannot point a table to a file. Each table will have a
corresponding directory. If you want all the data in the table to be
contained in only one file, simply copy that one file into the directory.
The table does not need to know the name of the file. It only matters
whether the structure of the data in the file matches the table structure.

When you query the table, it gets the data from whatever files are present
in the corresponding directory.
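
A minimal sketch of the approach described above, run from the Hive CLI. The
"single/" prefix is a hypothetical location chosen for illustration, and the
tab delimiter is an assumption, since the original row format and partitioning
are elided in the question. Partitioning is left out here for that reason.

```sql
-- Sketch only: the 'single/' prefix and the tab delimiter are assumptions,
-- not details taken from the original thread.

-- 1. Copy the one file into a directory (S3 prefix) of its own.
dfs -cp s3://mybucket/path/to/data/src1.txt s3://mybucket/path/to/single/src1.txt;

-- 2. Point the external table at that directory, not at the file itself.
CREATE EXTERNAL TABLE myData (str1 STRING, str2 STRING, count1 INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 's3://mybucket/path/to/single/';

-- 3. Queries read every file under the location; here that is only src1.txt.
SELECT * FROM myData LIMIT 10;
```

The copy could equally be done outside Hive with any S3 tool. What matters is
that the table location is a directory whose only content is the file of
interest.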

Regards,
Ramki.
On Tue, Mar 26, 2013 at 10:11 AM, Tony Burton <[EMAIL PROTECTED]> wrote:

> Hi list,
>
> I've been using Hive to perform queries on data hosted on AWS S3, and my
> tables point at data by specifying the directory in which the data is
> stored, e.g.:
>
> $ create external table myData (str1 string, str2 string, count1 int)
> partitioned by <snip> row format <snip> stored as textfile location
> 's3://mybucket/path/to/data';
>
> where s3://mybucket/path/to/data is the "directory" that contains the
> files I'm interested in. My use case now is to create a table that
> points to a specific file in a directory:
>
> $ create external table myData (str1 string, str2 string, count1 int)
> partitioned by <snip> row format <snip> stored as textfile location
> 's3://mybucket/path/to/data/src1.txt';
>
> and I get the error: "FAILED: Error in metadata: MetaException(message:Got
> exception: java.io.IOException Can't make directory for path
> 's3://spinmetrics/global/counter_Fixture.txt' since it is a file.)". OK,
> let's try to create the table without specifying the data source:
>
> $ create external table myData (str1 string, str2 string, count1 int)
> partitioned by <snip> row format <snip> stored as textfile;
>
> OK, no problem. Now let's load the data:
>
> $ LOAD DATA INPATH 's3://mybucket/path/to/data/src1.txt' INTO TABLE myData;
>
> (referring to https://cwiki.apache.org/Hive/languagemanual-dml.html -
> "...filepath can refer to a file (in which case hive will move the file
> into the table)")
>
> Error message is: "FAILED: Error in semantic analysis: Line 1:17 Path is
> not legal ''s3://mybucket/path/to/data/src1.txt'': Move from:
> s3://mybucket/path/to/data/src1.txt to:
> hdfs://10.48.97.97:9000/mnt/hive_081/warehouse/gfix is not valid. Please
> check that values for params "default.fs.name" and
> "hive.metastore.warehouse.dir" do not conflict."
>
> So I check my fs.default.name and hive.metastore.warehouse.dir (which
> have never caused problems before):
>
> $ set fs.default.name;
> fs.default.name=hdfs://10.48.97.97:9000
> $ set hive.metastore.warehouse.dir;
> hive.metastore.warehouse.dir=/mnt/hive_081/warehouse
>
> Clearly different, but which is correct? Is there an easier way to load a
> single file into a hive table? Or should I just put each file in a
> directory and proceed as before?
>
> Thanks!
>
> Tony
>
> *Tony Burton
> Senior Software Engineer*
> e: [EMAIL PROTECTED]