Hive >> mail # user >> Creating external table poiting to s3 folder with files not loading data


Fernando Andrés Doglio Tu... 2012-12-11, 14:05
Re: Creating external table poiting to s3 folder with files not loading data
A couple of clarifying questions and suggestions. First, keep in mind that
Hive doesn't complain if there's a typo in your external location ;) Use
DESCRIBE FORMATTED to verify that the path is right. For an external
partitioned table, use DESCRIBE FORMATTED table
PARTITION(col1=val1,col2=val2,...).
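
As a rough sketch of that check (the table name my_table and the partition
column dt are made up for illustration, so substitute your own):

```sql
-- Hypothetical table name; look for the "Location:" row in the output
-- and compare it character by character with the real S3 folder.
DESCRIBE FORMATTED my_table;

-- For a partitioned external table, check a specific partition.
-- The column dt and its value are placeholders.
DESCRIBE FORMATTED my_table PARTITION (dt='2012-12-11');
```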

A dumb mistake I've often made is to use a variable in a script, e.g., "...
LOCATION '${DATA}/foo/bar/baz';" and then forget to define DATA when
invoking the script.
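
For example, a script along these lines only works if DATA is passed in on
the command line. The file name, table name, column, and bucket below are
all hypothetical:

```sql
-- script.hql (hypothetical file name)
-- Invoke it with the variable defined, for example:
--   hive -d DATA=s3n://your-bucket/data -f script.hql
-- Forgetting the "-d DATA=..." part is the mistake described above.

CREATE EXTERNAL TABLE my_table (line STRING)
LOCATION '${DATA}/foo/bar/baz';

-- Confirm what location Hive actually recorded for the table.
DESCRIBE FORMATTED my_table;
```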

When you said "load a file", did you mean using the LOAD DATA ... INPATH
's3n://...' command? I've read that s3n is not supported for these
statements, but I'm not sure that's actually true.

If everything looks correct, you should be able to do hadoop fs -ls
s3n://... successfully. Actually, since your Hive environment could have
different settings for some filesystem properties, it might be a better
check to run dfs -ls ... at the Hive CLI prompt.
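
At the Hive CLI prompt that check would look something like this. The
bucket and folder are placeholders, so use the same location string that
DESCRIBE FORMATTED reports for your table:

```sql
-- Run inside the Hive CLI so that Hive's own filesystem settings apply.
-- The path is a placeholder, not the real bucket.
dfs -ls s3n://your-bucket/path/to/folder/;
```

If the listing is empty or fails, the problem is with the path or the
filesystem settings rather than with the SerDe.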

Otherwise, it's probably the SerDe, as Mark suggested. If possible, I would
try reading the data through a temporary external table that uses a
built-in SerDe, like the default one, just to confirm that the problem is
not the file system and is most likely the SerDe.
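
A minimal sketch of such a test table, assuming the files are plain text
(the table name, the single column, and the path are all made up for
illustration):

```sql
-- Temporary external table using the default SerDe and text format.
-- With no ROW FORMAT clause, Hive falls back to its built-in defaults.
CREATE EXTERNAL TABLE tmp_raw_check (line STRING)
LOCATION 's3n://your-bucket/path/to/folder/';

-- If rows come back here, Hive can read the files and the custom
-- SerDe becomes the main suspect.
SELECT * FROM tmp_raw_check LIMIT 10;

-- Dropping an external table removes only the metadata, not the files.
DROP TABLE tmp_raw_check;
```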

Hope that helps.
dean

On Tue, Dec 11, 2012 at 8:05 AM, Fernando Andrés Doglio Turissini <
[EMAIL PROTECTED]> wrote:

> Long subject, I know.. let me explain a bit more about the problem:
>
> I'm trying to load a file into a hive table (this is on an EMR instance)
> for that I create an external table, and I set the location to the folder
> on an s3 bucket, where the file resides.
> The problem is that even though the table is created correctly, when I do
> a "select * from table" it returns nothing. I'm not seeing errors on the
> logs either, so I don't know what can be happening....
>
> Also, probably important: I'm using a custom SerDe that I did not
> write...but I do have the code for it.
>
> I'm quite new to hive, so I appreciate any kind of pointers you can throw
> at me.
>
> Thanks!
> Fernando Doglio
>

--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330
Fernando Andrés Doglio Tu... 2012-12-17, 11:32
Dean Wampler 2012-12-17, 15:07
Mark Grover 2012-12-14, 09:05