Re: sqooping into S3
Sorry for the delay.  I have not really checked with S3 as the default
warehouse directory, so it may be a Sqoop or Hive issue.  That said, if you
want Sqoop to create the Hive table, you need to use the --hive-import
option.
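
A minimal sketch of what I mean - the connect string, credentials, and
table name below are placeholders, not anything from your setup:

  sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username myuser -P \
    --table mytable \
    --hive-import \
    --hive-table mytable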

Or, instead of going through an initial import to S3 and then doing a "LOAD"
into Hive (which actually does a rename), you can try the HCatalog import
option.   That option will create the Hive table using the HCatalog
interfaces, so you may have better success creating the Hive table and
pushing the data.   HCatalog import and export are available as part of
Sqoop 1.4.4.
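
A sketch of the HCatalog route, again with placeholder connection details;
--create-hcatalog-table asks Sqoop to create the table through the HCatalog
interfaces if it does not already exist:

  sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username myuser -P \
    --table mytable \
    --hcatalog-database default \
    --hcatalog-table mytable \
    --create-hcatalog-table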

Thanks

Venkat
On Tue, Feb 4, 2014 at 4:58 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:

> Hey Venkat,
>     Thanks for the tip - when I remove the "--hive-import" and
> "--hive-overwrite" options, the import from MySQL to S3 works without any
> errors, with either the --target-dir or --warehouse-dir option specified.
> I've set the "hive.metastore.warehouse.dir" configuration in my
> hive-site.xml file to "s3n://***:***/@iakbar.emr/warehouse", which is
> where the data is.  However, when I start Hive and run "show tables", I
> don't see the table Sqoop imported.  Must I manually create the table
> schema in Hive?
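>
> For reference, here is that property as it sits in my hive-site.xml (the
> access and secret keys are elided):
>
>   <property>
>     <name>hive.metastore.warehouse.dir</name>
>     <value>s3n://***:***/@iakbar.emr/warehouse</value>
>   </property>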
>
> yours,
> imran
>
>
> On Tue, Feb 4, 2014 at 2:35 PM, Venkat Ranganathan <
> [EMAIL PROTECTED]> wrote:
>
>> If you look at the error, you can see that the FS object being referenced
>> is an HDFS location, which is not valid since you have an S3 filesystem as
>> the source of the data.
>>
>> I don't know what your intention is.   You say you want a Hive import
>> from MySQL to S3.    Do you mean a Sqoop import?  Do you just want the
>> files to land on S3?  Then you don't need the --hive-import and
>> --hive-overwrite options.
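>>
>> If plain files on S3 are all you want, a bare import might look like
>> this - the connection details and bucket path are placeholders:
>>
>>   sqoop import \
>>     --connect jdbc:mysql://dbhost/mydb \
>>     --username myuser -P \
>>     --table mytable \
>>     --target-dir s3n://mybucket/mytable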
>>
>> To do a Hive import from an S3 file, you probably have to make the
>> warehouse dir be on S3.
>>
>> You can also create an external table in Hive after the data lands on S3
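>>
>> For example - the columns and delimiter here are illustrative, so match
>> them to what Sqoop actually writes out:
>>
>>   CREATE EXTERNAL TABLE mytable (id INT, name STRING)
>>   ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>>   LOCATION 's3n://mybucket/mytable/';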
>>
>> Venkat
>>
>>
>> On Tue, Feb 4, 2014 at 1:50 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:
>>
>>> That doesn't seem to be the issue, because I just manually created a
>>> folder called "_logs" in S3 and it worked.
>>> Any ideas why the Sqoop import would work, but then fail when trying to
>>> create a "_logs" folder after it's done?
>>>
>>>
>>> On Tue, Feb 4, 2014 at 1:44 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hey Venkat,
>>>>     Sorry, I meant to say I made that change in core-site.xml, not
>>>> site-core.xml.
>>>>
>>>> I'm trying to do a Hive import from MySQL to S3, but I think the error
>>>> is popping up because Sqoop is trying to create a "_logs" directory, and
>>>> according to S3's naming conventions you can't start the name of a
>>>> bucket with an underscore:
>>>>
>>>> "Bucket names can contain lowercase letters, numbers, and dashes. Each
>>>> label must start and end with a lowercase letter or a number."
>>>> http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html
>>>>
>>>> This is the error I'm getting (the iakbar.emr/dump2/ location on S3
>>>> contains files, so I know Sqoop works up to this point):
>>>> "This file system object (hdfs://10.202.163.18:9000) does not support
>>>> access to the request path 's3n://****:****@iakbar.emr/dump2/_logs'"
>>>>
>>>> thanks,
>>>> imran
>>>>
>>>>
>>>> On Tue, Feb 4, 2014 at 12:45 PM, Venkat Ranganathan <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> I think you are trying to do a Hive import from the S3 location.    I
>>>>> think it may not be supported - as Jarcec said, you may want to change
>>>>> the core-site to point to S3 on your Hadoop cluster.  But I have not
>>>>> tested this, so I'm not sure if that will work.
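>>>>>
>>>>> If you try it, something along these lines in core-site.xml - the
>>>>> bucket name is a placeholder, and again this is untested on my side:
>>>>>
>>>>>   <property>
>>>>>     <name>fs.defaultFS</name>
>>>>>     <value>s3n://mybucket</value>
>>>>>   </property>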
>>>>>
>>>>> Venkat
>>>>>
>>>>>
>>>>> On Tue, Feb 4, 2014 at 12:04 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> I think it may have worked, but I am getting an error.
>>>>>>
>>>>>> I added this line to site-core.xml:
>>>>>> <property><name>fs.defaultFS</name><value>s3n</value></property>
>>>>>