Re: sqooping into S3
If you look at the error, you can see that the FS object being referenced is an
HDFS location, which is not valid since your data is on an S3 filesystem.
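
Note that fs.defaultFS expects a full filesystem URI, not just a scheme name.
An untested sketch of the core-site.xml entries, with a placeholder bucket and
placeholder keys:

<property>
  <name>fs.defaultFS</name>
  <value>s3n://mybucket/</value>
</property>
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>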

I don't know what your intention is. You are saying a Hive import from MySQL
to S3. Do you mean a plain Sqoop import, where you just want the files to land
on S3? Then you don't need the --hive-import and --hive-overwrite options;
something like the sketch below should be enough.
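
A rough, untested sketch; the connect string, credentials, table name and
bucket name here are all placeholders:

sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username dbuser --password dbpass \
  --table mytable \
  --target-dir s3n://ACCESS_KEY:SECRET_KEY@mybucket/dump2/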

To do a Hive import with the data coming from an S3 file, you probably have to
put the Hive warehouse directory on S3; see the hive-site.xml sketch below.
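
I have not tested this, but the setting would presumably look something like
the following in hive-site.xml (the bucket name is a placeholder):

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>s3n://mybucket/hive/warehouse</value>
</property>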

You can also create an external table in Hive after the data lands on S3.
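
Roughly like this, assuming Sqoop's default comma-delimited text output; the
table and column names are made up:

CREATE EXTERNAL TABLE mytable (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3n://mybucket/dump2/';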

Venkat
On Tue, Feb 4, 2014 at 1:50 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:

> That doesn't seem to be the issue, because I just manually created a
> folder called "_logs" in S3 and it worked.
> Any ideas why the sqoop import would work, but would fail when trying to
> create a "_logs" folder after it's done?
>
>
> On Tue, Feb 4, 2014 at 1:44 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:
>
>> Hey Venkat,
>>     Sorry, I meant to say I made that change in core-site.xml, not
>> site-core.xml.
>>
>> I'm trying to do a hive import from MySQL to S3, but I think the error is
>> popping up because sqoop is trying to create a "_logs" directory, but
>> according to S3's naming conventions you can't start the name of a bucket
>> with an underscore:
>>
>> "Bucket names can contain lowercase letters, numbers, and dashes. Each
>> label must start and end with a lowercase letter or a number."
>> http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html
>>
>> This is the error I'm getting (the iakbar.emr/dump2/ location on S3
>> contains files, so I know Sqoop works up to this point):
>> "This file system object (hdfs://10.202.163.18:9000) does not support
>> access to the request path 's3n://****:****@iakbar.emr/dump2/_logs'"
>>
>> thanks,
>> imran
>>
>>
>> On Tue, Feb 4, 2014 at 12:45 PM, Venkat Ranganathan <
>> [EMAIL PROTECTED]> wrote:
>>
>>> I think you are trying to do a Hive import from the S3 location. I think
>>> it may not be supported; as Jarcec said, you may want to change the
>>> core-site to point to S3 on your Hadoop cluster. But I have not tested
>>> this, so I am not sure if that will work.
>>>
>>> Venkat
>>>
>>>
>>> On Tue, Feb 4, 2014 at 12:04 PM, Imran Akbar <[EMAIL PROTECTED]> wrote:
>>>
>>>> I think it may have worked, but I am getting an error.
>>>>
>>>> I added this line to site-core.xml:
>>>> <property><name>fs.defaultFS</name><value>s3n</value></property>
>>>>
>>>> and I see the following contents in my S3 directory after running sqoop:
>>>> _SUCCESS
>>>> part-m-00000
>>>> part-m-00001
>>>> part-m-00002
>>>> part-m-00003
>>>> part-m-00004
>>>> part-m-00005
>>>>
>>>> I'm running sqoop version 1.4.4.
>>>>
>>>> But I still get this error after running sqoop:
>>>> http://pastebin.com/5AYCsd78
>>>>
>>>> any ideas?
>>>> thanks for the help so far
>>>>
>>>> imran
>>>>
>>>>
>>>> On Tue, Feb 4, 2014 at 11:24 AM, Venkat Ranganathan <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> Which version of Sqoop are you using? Sqoop 1.4.4 addressed the use of
>>>>> other filesystems with the fix mentioned in SQOOP-1033.
>>>>>
>>>>> Thanks
>>>>> Venkat
>>>>>
>>>>>
>>>>> On Tue, Feb 4, 2014 at 8:14 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Yes Imran,
>>>>>> I would try to define the fs.defaultFS for the S3 in core-site.xml
>>>>>> and see if it will help Sqoop to accept the S3 path.
>>>>>>
>>>>>> Jarcec
>>>>>>
>>>>>> On Tue, Feb 04, 2014 at 08:08:17AM -0800, Imran Akbar wrote:
>>>>>> > thanks Jarek,
>>>>>> >    How would I do that?  Do I need to set fs.defaultFS in
>>>>>> core-site.xml, or
>>>>>> > is it something else?  Is there a document somewhere which
>>>>>> describes this?
>>>>>> >
>>>>>> > yours,
>>>>>> > imran
>>>>>> >
>>>>>> >
>>>>>> > On Mon, Feb 3, 2014 at 9:31 PM, Jarek Jarcec Cecho <
>>>>>> [EMAIL PROTECTED]> wrote:
>>>>>> >
>>>>>> > > Would you mind trying to set the S3 filesystem as the default one
>>>>>> for
>>>>>> > > Sqoop?
>