Re: cannot use distcp in some s3 buckets
On Thu, Oct 13, 2011 at 2:06 PM, Raimon Bosch <[EMAIL PROTECTED]> wrote:
> By the way,
>
> The URL I'm trying has a '_' in the bucket name. Could this be the problem?

Yes, underscores are not permitted in hostnames.

Cheers,
Tom
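
A minimal sketch of why the underscore matters, assuming Hadoop derives the
bucket name from the host part of the s3:// URI (the "Invalid hostname in URI"
message in the quoted trace suggests this, but the sketch does not call any
Hadoop code). java.net.URI returns a null host when the authority is not a
legal hostname. The bucket names "my_bucket" and "my-bucket" and the class
name BucketHostCheck are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BucketHostCheck {
    // Returns the host that java.net.URI extracts from the given string,
    // or null when the authority cannot be parsed as a server-based hostname.
    public static String hostOf(String uri) throws URISyntaxException {
        return new URI(uri).getHost();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Hypothetical bucket names, used only for illustration.
        String withUnderscore = "s3://my_bucket/logfile-20110815.gz";
        String withHyphen = "s3://my-bucket/logfile-20110815.gz";

        // The underscore makes the authority an invalid hostname,
        // so getHost() gives null even though the URI itself parses.
        System.out.println(hostOf(withUnderscore)); // null

        // A hyphen is allowed inside a hostname label, so the host is returned.
        System.out.println(hostOf(withHyphen)); // my-bucket

        // The raw authority is still available for the underscore case.
        System.out.println(new URI(withUnderscore).getAuthority()); // my_bucket
    }
}
```

If that assumption holds, any code that relies on getHost() to find the bucket
would reject the first URI before contacting S3 at all, which matches the
behaviour described in the quoted message below.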

>
> 2011/10/13 Raimon Bosch <[EMAIL PROTECTED]>
>
>> Hi,
>>
>> I've been having some problems with one of our S3 buckets. I asked on the
>> Amazon support forums, with no luck so far:
>> https://forums.aws.amazon.com/thread.jspa?threadID=78001
>>
>> I'm getting this exception only with our oldest s3 bucket with this
>> command: "hadoop distcp s3://<MY_BUCKET_NAME>/logfile-20110815.gz
>> /tmp/logfile-20110815.gz"
>>
>> java.lang.IllegalArgumentException: Invalid hostname in URI
>> s3://<MY_BUCKET_NAME>/logfile-20110815.gz /tmp/logfile-20110815.gz
>> at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
>> at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:82)
>>
>> As you can see, Hadoop rejects my URL before it even starts the
>> authorization steps. Has anyone run into a similar issue? I have already
>> tested the same operation on newer S3 buckets, and the command works
>> correctly.
>>
>> Thanks in advance,
>> Raimon Bosch.