Hive, mail # user - Creating external table pointing to s3 folder with files not loading data


Re: Creating external table pointing to s3 folder with files not loading data
Fernando Andrés Doglio Turissini 2012-12-17, 11:32
Hello, and thank you both for your answers...
I think I found the problem... keep in mind I'm quite new to all this
Hive/Hadoop stuff :)

I think my problem was due to the fact that the CREATE TABLE statement had
a partition defined, but the data was not partitioned on the file
system (it was just one file inside a folder).

I'm guessing that what I have to do is load the data into a non-partitioned
table and then copy it into the partitioned table with Hive, dynamically
partitioning the data in the same query... is that right?
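
A minimal sketch of that approach, with hypothetical table and column names
(a non-partitioned staging table feeding a partitioned table through a
dynamic-partition insert):

  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;

  -- the partition column (dt) must come last in the SELECT list;
  -- Hive derives each row's target partition from its value
  INSERT OVERWRITE TABLE events PARTITION (dt)
  SELECT col1, col2, dt
  FROM events_staging;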

Thanks again!

On Fri, Dec 14, 2012 at 1:22 PM, Dean Wampler <
[EMAIL PROTECTED]> wrote:

> A couple of clarifying questions and suggestions. First, keep in mind that
> Hive doesn't care if you have a typo of some kind in your external location
> ;) Use DESCRIBE FORMATTED to verify the path is right. For an external
> partitioned table, DESCRIBE FORMATTED table
> PARTITION(col1=val1,col2=val2,...).
>
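A minimal sketch of that check (table name and partition value are hypothetical):

  DESCRIBE FORMATTED my_table;
  DESCRIBE FORMATTED my_table PARTITION (dt='2012-12-01');

Look for the "Location:" line in the output and confirm it matches the S3 path
character for character.
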
> A dumb mistake I've often made is to use a variable in a script, e.g., "...
> LOCATION '${DATA}/foo/bar/baz';", and then forget to define DATA when invoking
> the script.
>
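A minimal sketch of supplying such a variable at invocation time (script name
and bucket are hypothetical):

  hive --hivevar DATA=s3n://my-bucket/data -f create_tables.hql

so that LOCATION '${DATA}/foo/bar/baz' in the script resolves to a real path.
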
> When you said "load a file", did you mean using the LOAD DATA ... INPATH
> 's3n://...' command? I've read that s3n is not supported for these
> statements, but I'm not sure that's actually true.
>
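If LOAD DATA from an s3n URI does turn out to be unsupported, one common
alternative on EMR is to leave the files in place and simply register their
location with the metastore, e.g. (names and path are hypothetical):

  ALTER TABLE my_table ADD PARTITION (dt='2012-12-01')
  LOCATION 's3n://my-bucket/data/dt=2012-12-01/';
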
> If everything looks correct, you should be able to do hadoop fs -ls
> s3n://... successfully. Actually, since your hive environment could have
> different settings for some filesystem properties, it might be a better
> check to use dfs -ls ... at the hive CLI prompt.
>
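For example (bucket and path are hypothetical):

  hadoop fs -ls s3n://my-bucket/path/to/data/

and, from the Hive CLI:

  hive> dfs -ls s3n://my-bucket/path/to/data/ ;
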
> Otherwise, it's probably the SerDe, as Mark suggested. If possible, I
> would attempt to read the data through a temporary external table using a
> built-in SerDe, like the default, just to confirm that it's a SerDe problem
> rather than a file system issue.
>
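A minimal sketch of such a sanity check, using the default text SerDe and a
single string column so each line of the file comes back as one row (table
name and path are hypothetical):

  CREATE EXTERNAL TABLE serde_check (line STRING)
  LOCATION 's3n://my-bucket/path/to/data/';

  SELECT * FROM serde_check LIMIT 10;

If rows come back here but not from the real table, the custom SerDe (or a
missing partition definition) is the likely culprit.
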
> Hope that helps.
> dean
>
> On Tue, Dec 11, 2012 at 8:05 AM, Fernando Andrés Doglio Turissini <
> [EMAIL PROTECTED]> wrote:
>
>> Long subject, I know.. let me explain a bit more about the problem:
>>
>> I'm trying to load a file into a Hive table (this is on an EMR instance).
>> To do that, I create an external table and set its location to the folder
>> on an S3 bucket where the file resides.
>> The problem is that even though the table is created correctly, when I do
>> a "select * from table" it returns nothing. I'm not seeing errors in the
>> logs either, so I don't know what could be happening....
>>
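A minimal sketch of that kind of setup (table, columns, SerDe class, and path
are hypothetical):

  CREATE EXTERNAL TABLE my_table (col1 STRING, col2 INT)
  PARTITIONED BY (dt STRING)
  ROW FORMAT SERDE 'com.example.MyCustomSerDe'
  LOCATION 's3n://my-bucket/path/to/folder/';

Note that a table declared with PARTITIONED BY returns no rows until its
partitions are registered in the metastore, even if the files already sit
under the location.
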
>> Also, probably important: I'm using a custom SerDe that I did not
>> write...but I do have the code for it.
>>
>> I'm quite new to hive, so I appreciate any kind of pointers you can throw
>> at me.
>>
>> Thanks!
>> Fernando Doglio
>>
>
>
>
> --
> *Dean Wampler, Ph.D.*
> thinkbiganalytics.com
> +1-312-339-1330
>
>
>