Flume >> mail # user >> S3 Sink in FlumeNG Configuration?


Matthew Moore 2013-03-29, 14:44
Brock Noland 2013-03-29, 14:49
Matthew Moore 2013-03-29, 15:47
Matthew Moore 2013-03-29, 19:15
Re: S3 Sink in FlumeNG Configuration?
I am awesome at answering my own questions =\

I was using jets3t 0.7.4 instead of the 0.6.1 version that ships with Hadoop
(jets3t itself isn't bundled with Flume).
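
For anyone landing here with the same question, a minimal sketch of the sink
section follows. The agent, sink and channel names are the ones used in this
thread; the credentials and bucket are placeholders. One assumption to check:
the settings quoted below show s3://, while the stack trace goes through
Hadoop's s3native classes, which back the s3n:// scheme, so the sketch uses
s3n://.

```properties
# Sketch only, not a verified setup. Names follow the thread;
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and BUCKET-NAME are placeholders.
agent.sinks = s3Sink
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.channel = recoverableMemoryChannel

# The S3 URI goes in the hdfs sink's hdfs.path property.
# s3n:// is assumed here because the stack trace shows the s3native classes.
agent.sinks.s3Sink.hdfs.path = s3n://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/

# Not a config setting: the jets3t jar on Flume's classpath should match
# the version Hadoop ships with (0.6.1 in this case, not 0.7.4).
```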

Best,
Matt
--
Matthew Moore
Co-Founder & CTO, CrowdMob Inc.
Mobile: (650) 888-5962

Need to schedule a meeting?  Invite me via Google Calendar!
[EMAIL PROTECTED]
On Fri, Mar 29, 2013 at 12:15 PM, Matthew Moore <[EMAIL PROTECTED]> wrote:

> Hey Guys,
>
> I've made a decent amount of progress, and now have the settings correct.
> For completeness, the settings look like this:
>
> agent.sinks.s3Sink.type = hdfs
> agent.sinks.s3Sink.hdfs.path = s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/
>
> You can see the full setup at this gist:
> https://gist.github.com/crowdmatt/5256881
>
>
> However, I've run into the following problem:
>
>
> 2013-03-29 19:05:28,954 (SinkRunner-PollingRunner-DefaultSinkProcessor)
> [ERROR -
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)]
> process failed
> org.apache.hadoop.fs.s3.S3Exception:
> org.jets3t.service.S3ServiceException: Request Error. HEAD
> '/FlumeData.1364583927762.tmp' on Host 'mybucket.s3.amazonaws.com' @
> 'Fri, 29 Mar 2013 19:05:28 GMT' -- ResponseCode: 404, ResponseStatus: Not
> Found, RequestId: 00864FE1DCD5AD95, HostId:
> 68AuSUe/XsP9zUiwe4yqhhDjETjVEnXVuTdZjYKQfj6VBKyACLH++MD1i8xgrEE4
>  at
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:122)
>
>
> Does anyone have any pointers on how I can start debugging?
>
> Best,
> Matt
> --
> Matthew Moore
> Co-Founder & CTO, CrowdMob Inc.
> Mobile: (650) 888-5962
>
> Need to schedule a meeting?  Invite me via Google Calendar!
> [EMAIL PROTECTED]
>
>
> On Fri, Mar 29, 2013 at 8:47 AM, Matthew Moore <[EMAIL PROTECTED]> wrote:
>
>> Hey,
>>
>> Thanks for the links to the JIRAs.  It seems like someone implemented
>> an S3BufferedWriter which might be helpful in the future.
>>
>> However, I'm still not sure how to set up the configuration (flume.conf)
>> to use s3 as a sink.  Has anyone done that?
>>
>> Best,
>> Matt
>> --
>> Matthew Moore
>> Co-Founder & CTO, CrowdMob Inc.
>> Mobile: (650) 888-5962
>>
>> Need to schedule a meeting?  Invite me via Google Calendar!
>> [EMAIL PROTECTED]
>>
>>
>> On Fri, Mar 29, 2013 at 7:49 AM, Brock Noland <[EMAIL PROTECTED]> wrote:
>>
>>> Sorry, I don't know much about this, but here are two relevant JIRAs:
>>>
>>> https://issues.apache.org/jira/browse/FLUME-1228
>>> https://issues.apache.org/jira/browse/FLUME-951
>>>
>>>
>>> On Fri, Mar 29, 2013 at 9:44 AM, Matthew Moore <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hey there,
>>>>
>>>> I know this is a really newbish question, but I'm hoping to get a
>>>> little assistance here so I'm not stuck guess-and-checking.
>>>>
>>>> I'm trying to configure FlumeNG (1.3.1), but I can't figure out
>>>> how to set up the hdfs sink to use the s3 implementations.
>>>>
>>>> I'm keeping track of my progress on this gist I made:
>>>> https://gist.github.com/crowdmatt/5256881
>>>>
>>>> From what I've gathered, I should be using the hdfs type, which I'm
>>>> setting up as such:
>>>>
>>>> agent.sinks = s3Sink
>>>> agent.sinks.s3Sink.type = hdfs
>>>> agent.sinks.s3Sink.channel = recoverableMemoryChannel
>>>>
>>>> ... but that's where I end up hitting my head against the wall.  I know
>>>> I should be specifying my s3 access key, secret, and bucket in this format:
>>>> s3n://ACCESS_KEY_ID:SECRET_ACCESS_KEY@my-hdfs/
>>>>
>>>> However, I don't know where to specify that, or what dot notation to
>>>> use.
>>>>
>>>> Can anyone point me in the right direction?
>>>>
>>>> Best,
>>>> Matt
>>>> --
>>>> Matthew Moore
>>>> Co-Founder & CTO, CrowdMob Inc.
>>>> Mobile: (650) 888-5962
>>>>
>>>> Need to schedule a meeting?  Invite me via Google Calendar!
>>>> [EMAIL PROTECTED]
>>>>
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>>
>>
>>
>