Hive >> mail # user >> HIVE and S3 via EMR?


Re: HIVE and S3 via EMR?
Ok, I spoke too soon.  Same error.  Crapola.  Still working on it.

On Tue, May 29, 2012 at 2:19 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> I get an error when I create an external table.  btw - I can partition on
> dt or from/to address.  I'm just not clear on how to partition - my efforts
> fail.
>
> hive> create external table from_to(from_address string, to_address string, dt string)
>     >     row format delimited fields terminated by '\t' stored as textfile location 's3n://rjurney_public_web/from_to_date';
> FAILED: Error in metadata: java.lang.IllegalArgumentException: Invalid hostname in URI s3n://rjurney_public_web/from_to_date
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
>
>
> However, I just upgraded to HIVE 0.9, and it works :)  No reason to use
> the old stuff when I can scp the new one up.
>
> Thanks!
>
> On Tue, May 29, 2012 at 1:34 PM, Balaji Rao <[EMAIL PROTECTED]> wrote:
>
>> If you are using hive on EMR, you can create a table directly from the
>> data on S3:
>>
>> From hive, you can create tables that use S3 data like this:
>>
>> create external table from_to(from_address string, to_address string,
>> dt string) row format delimited fields terminated by '\t' stored as
>> textfile location 's3://rjurney_public_web/from_to_date';
>>
>> You could then:
>>  select * from from_to;
>>
>> Balaji
>>
>> On Tue, May 29, 2012 at 4:20 PM, Russell Jurney
>> <[EMAIL PROTECTED]> wrote:
>> > How do I load data from S3 into Hive using Amazon EMR?  I've booted a small
>> > cluster, and I want to load a 3-column TSV file from Pig into a table like
>> > this:
>> >
>> > create table from_to (from_address string, to_address string, dt string);
>> >
>> >
>> > When I run something like this:
>> >
>> > load data inpath 's3n://rjurney_public_web/from_to_date' into table from_to;
>> >
>> >
>> > I get errors:
>> >
>> > FAILED: Error in semantic analysis: Line 1:17 Invalid path
>> > 's3n://rjurney_public_web/from_to_date': only "file" or "hdfs" file systems
>> > accepted. s3n file system is not supported.
>> >
>> >
>> > There is no distcp on the master node of my EMR cluster, so I can't copy it
>> > over.  I've read the documentation... and so far after a day of trying, I
>> > can't load data into HIVE via EMR.
>> >
>> > What am I missing?  Thanks!
>> > --
>> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>>
>
>
>
> --
> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
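
The thread does not identify the cause of the "Invalid hostname in URI" error. One possibility worth checking is the underscores in the bucket name: java.net.URI does not accept underscores in a hostname, so getHost() returns null for such a location, and as far as I know the Hadoop S3 filesystem code rejects a URI with no host with this kind of message. That diagnosis is an assumption, not something confirmed here. A minimal sketch of the URI behavior follows; the class name, the helper methods, and the hyphenated bucket name are hypothetical and only there for illustration.

```java
import java.net.URI;

// Sketch: how java.net.URI parses the S3 location quoted in this thread.
// S3UriHostCheck, hostOf, and authorityOf are hypothetical names.
final class S3UriHostCheck {

    // Host component of the location, or null when the authority
    // cannot be parsed as a server-based host name.
    static String hostOf(String location) {
        return URI.create(location).getHost();
    }

    // Raw authority component, which is still available when getHost() is null.
    static String authorityOf(String location) {
        return URI.create(location).getAuthority();
    }

    public static void main(String[] args) {
        // Location taken from the thread: the bucket name contains underscores.
        String fromThread = "s3n://rjurney_public_web/from_to_date";
        // Hypothetical bucket name without underscores, for comparison only.
        String hyphenated = "s3n://rjurney-public-web/from_to_date";

        System.out.println(hostOf(fromThread));      // prints: null
        System.out.println(authorityOf(fromThread)); // prints: rjurney_public_web
        System.out.println(hostOf(hyphenated));      // prints: rjurney-public-web
    }
}
```

If the host does come back null for the real location, pointing the external table at a bucket whose name has no underscores would be one way to test whether this is what the DDLTask is tripping over.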