Hive >> mail # user >> Creating external table poiting to s3 folder with files not loading data


Fernando Andrés Doglio Tu... 2012-12-11, 14:05
Dean Wampler 2012-12-14, 15:22
Re: Creating external table poiting to s3 folder with files not loading data
Hello, and thank you both for your answers...
I think I found the problem... keep in mind I'm quite new to all this
Hive/Hadoop stuff :)

I think my problem was due to the fact that the create table statement had
the partition defined but the information was not partitioned on the file
system (it was just 1 file inside a folder).
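The symptom described above can be sketched in HiveQL. This is only an illustration under assumptions: the table name `events`, the column `line`, the partition column `dt`, and the bucket `s3n://example-bucket/events/` are hypothetical, and the custom SerDe from the original table is left out. A partitioned external table reads only the partitions registered in the metastore, so a file sitting directly in the table folder is not visible until a partition is pointed at it.

```sql
-- Hypothetical names throughout; not the table from this thread.
CREATE EXTERNAL TABLE events (line STRING)
PARTITIONED BY (dt STRING)
LOCATION 's3n://example-bucket/events/';

-- No partitions are registered yet, so this returns no rows
-- even though a file exists under the table location.
SELECT * FROM events LIMIT 10;

-- Register one partition and point it at the folder that holds the file.
ALTER TABLE events ADD PARTITION (dt = '2012-12-11')
LOCATION 's3n://example-bucket/events/';

-- The partition should now be listed, and the SELECT above should see its rows.
SHOW PARTITIONS events;
```

Registering a partition by hand like this only makes sense when the whole folder really belongs to a single partition value.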

I'm guessing that what I have to do is load the data into a
non-partitioned table and then copy it into the partitioned table with
Hive, using dynamic partitioning in the same query... is that right?
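A minimal sketch of that approach, assuming a tab-delimited file whose last field holds the partition value. The table names `events_staging` and `events_by_day`, the columns, and the bucket path are all hypothetical, and a built-in delimited format stands in for the custom SerDe.

```sql
-- Staging table: non-partitioned, external, reads the file as it sits in S3.
-- All names and the path are hypothetical.
CREATE EXTERNAL TABLE events_staging (
  user_id STRING,
  payload STRING,
  dt      STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3n://example-bucket/events/';

-- Target table: partitioned by dt, so dt is not repeated as a regular column.
CREATE TABLE events_by_day (
  user_id STRING,
  payload STRING
)
PARTITIONED BY (dt STRING);

-- Allow Hive to create partitions from the data itself.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE events_by_day PARTITION (dt)
SELECT user_id, payload, dt
FROM events_staging;
```

With `nonstrict` mode, every partition column may be dynamic; in the default strict mode at least one partition value has to be given statically.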

Thanks again!

On Fri, Dec 14, 2012 at 1:22 PM, Dean Wampler <
[EMAIL PROTECTED]> wrote:

> A couple of clarifying questions and suggestions. First, keep in mind that
> Hive doesn't care if you have a typo of some kind in your external location
> ;) Use DESCRIBE FORMATTED to verify the path is right. For an external
> partitioned table, DESCRIBE FORMATTED table
> PARTITION(col1=val1,col2=val2,...).
>
> A dumb mistake I've often made is to use a variable in a script, e.g., "...
> LOCATION '${DATA}/foo/bar/baz';" and then forget to define DATA when
> invoking the script.
>
> When you said "load a file", did you mean using the LOAD DATA ... INPATH
> 's3n://...' command? I've read that s3n is not supported for these
> statements, but I'm not sure that's actually true.
>
> If everything looks correct, you should be able to do hadoop fs -ls
> s3n://... successfully. Actually, since your hive environment could have
> different settings for some filesystem properties, it might be a better
> check to use dfs -ls ... at the hive CLI prompt.
>
> Otherwise, it's probably the SerDe, as Mark suggested. If possible, I
> would try reading the data through a temporary external table that uses a
> built-in SerDe, like the default, just to confirm that it's not a file
> system issue and that the SerDe is the likely culprit.
>
> Hope that helps.
> dean
>
> On Tue, Dec 11, 2012 at 8:05 AM, Fernando Andrés Doglio Turissini <
> [EMAIL PROTECTED]> wrote:
>
>> Long subject, I know.. let me explain a bit more about the problem:
>>
>> I'm trying to load a file into a Hive table (this is on an EMR instance).
>> For that I created an external table and set its location to the folder
>> in an S3 bucket where the file resides.
>> The problem is that even though the table is created correctly, when I do
>> a "select * from table" it returns nothing. I'm not seeing errors in the
>> logs either, so I don't know what could be happening...
>>
>> Also, probably important: I'm using a custom SerDe that I did not
>> write...but I do have the code for it.
>>
>> I'm quite new to hive, so I appreciate any kind of pointers you can throw
>> at me.
>>
>> Thanks!
>> Fernando Doglio
>>
>
>
>
> --
> *Dean Wampler, Ph.D.*
> thinkbiganalytics.com
> +1-312-339-1330
>
>
>
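The checks suggested in the quoted message could look roughly like this when run at the Hive CLI prompt. This is a sketch under assumptions: `my_table`, the partition column `dt`, `tmp_raw_check`, and the bucket path are hypothetical placeholders, not names from this thread.

```sql
-- 1. Confirm the location Hive actually stored for the table,
--    and for one partition if the table is partitioned.
DESCRIBE FORMATTED my_table;
DESCRIBE FORMATTED my_table PARTITION (dt = '2012-12-11');

-- 2. List the files using the Hive session's own filesystem settings.
dfs -ls s3n://example-bucket/events/;

-- 3. Read the same folder through a throwaway external table that uses
--    the default SerDe, with each line landing in a single string column.
CREATE EXTERNAL TABLE tmp_raw_check (line STRING)
LOCATION 's3n://example-bucket/events/';

SELECT * FROM tmp_raw_check LIMIT 5;

-- Dropping an external table removes only the metadata, not the S3 files.
DROP TABLE tmp_raw_check;
```

If step 3 returns rows while the real table returns none, the filesystem side is probably fine and the table definition (partitions or SerDe) is the more likely cause.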
Dean Wampler 2012-12-17, 15:07
Mark Grover 2012-12-14, 09:05