Hive, mail # user - HIVE and S3 folders


Re: HIVE and S3 folders
Mark Grover 2012-03-07, 17:56
Hi Balaji,
The Hive/Hadoop installation that comes with EMR is Amazon-specific and includes additional patches that make S3 paths as recognizable as HDFS paths.

However, if you are running Hive yourself on EC2, you most likely have an Apache or Cloudera installation, which doesn't recognize S3 paths.
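
That said, if the EC2 cluster is stock Apache Hadoop, one thing that may be worth trying is the s3n:// scheme (an untested assumption about this setup, not a confirmed fix). In Apache Hadoop of this vintage, s3:// refers to a block-based filesystem layered on top of S3, while s3n:// is the "native" filesystem that reads and writes ordinary S3 objects such as the ones s3cmd and EMR produce. A rough sketch, reusing the table definition from the original message with only the URI scheme changed:

```sql
-- Sketch only: assumes stock Apache Hadoop, where s3n:// is the native S3 filesystem,
-- and that fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey are already set
-- in the Hadoop configuration. 'bucket/path' is the placeholder from the question.
CREATE EXTERNAL TABLE wc (site STRING, cnt INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 's3n://bucket/path';
```

If the table then lists the existing files, the extra "/" folder was most likely an artifact of the block-based s3:// filesystem rather than anything wrong with the data.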

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation

----- Original Message -----
From: "Balaji Rao" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Wednesday, March 7, 2012 12:48:31 PM
Subject: HIVE and S3 folders

I'm having problems with Hive on EC2 reading files on S3.

I have a lot of files and folders on S3 that were created by s3cmd and
are used by Hive on Elastic MapReduce (EMR). The two work
interchangeably: files created by Hive on EMR can be read by s3cmd and
vice versa. However, I'm having problems with Hive/Hadoop running on
EC2. Both Hive 0.7 and 0.8 seem to create an additional folder "/" on S3.

For example, suppose I have a file s3://bucket/path/00000 created by
s3cmd or Hive on EMR, and I try to create an external table from Hive
on EC2:

    create external table wc(site string, cnt int)
    row format delimited fields terminated by '\t'
    stored as textfile
    location 's3://bucket/path';

Hive on EC2 does not recognize the EMR-created S3 folders. Instead, I
see a new folder "/" under the bucket:

<bucket> / "/" / path

Am I missing something here?
Balaji