Re: Generic JDBC Sink
I suppose that really depends on the usage scenario.  There are a hundred
things that may affect the ability of the Flume chain to keep up with
incoming data, only one of which is the sink being a JDBC connection.  I
think for cases like mine where the data is structured and of a reasonable
volume, a JDBC connection makes sense.

I guess what I'm saying is that if someone uses it without thinking or
testing what they're doing with it...  That's not a problem with JDBC, the
sink, or Flume.  It's a problem with the operator.  :-P

-- Jeremy

On Thu, Nov 28, 2013 at 8:33 AM, Steve Morin <[EMAIL PROTECTED]> wrote:

> I think the biggest problem is not that people wouldn't want to use it, but
> that data wouldn't be written to databases fast enough to drain the
> channels, even at moderate volumes.
>
> I'll follow the ticket, thanks.
>
>
> On Thu, Nov 28, 2013 at 8:17 AM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>
>> Hi Steve,
>>
>> I’ve submitted the sink for review here:
>>
>> http://issues.apache.org/jira/browse/FLUME-2256
>>
>> If it’s something that interests you, I encourage you to apply the patch
>> and let me know if it meets your needs or if you find problems.
>>
>> So far, no movement on it…  But it’s only been a couple of days.  If
>> Flume doesn’t want it (for whatever reason) I’ll just take off all of the
>> Apache headers and put it up on GitHub with a similar license.  It’ll get
>> open sourced one way or another, but I think folding it into Flume makes
>> the most sense.
>>
>> -- Jeremy
>>
>>
>> On Nov 28, 2013, at 7:39, Steve Morin <[EMAIL PROTECTED]> wrote:
>>
>> Jeremy,
>>   I am interested in a JDBC Flume sink.  Are you open sourcing it?
>> -Steve
>>
>>
>> On Tue, Nov 26, 2013 at 8:52 PM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>>
>>> Is there any interest in a generic JDBC sink?
>>>
>>> Over the last few days I decided to try and write one.  I have something
>>> that requires more testing, but seems to be working.
>>>
>>> Since the config file is how you’d interact with it, here’s a working
>>> example from my source tree:
>>>
>>> a.sinks.k.type=jdbc
>>> a.sinks.k.channel=c
>>> a.sinks.k.driver=com.mysql.jdbc.Driver
>>> a.sinks.k.url=jdbc:mysql://localhost:8889/flume
>>> a.sinks.k.user=username
>>> a.sinks.k.password=password
>>> a.sinks.k.batchSize=100
>>> a.sinks.k.sql=insert into twitter (body, timestamp) values (${body:string}, ${header.timestamp:long})
>>>
>>> The interesting part is the SQL statement.  You can put anything you
>>> want in there - it will get converted to a prepared statement on execution.
>>> The Ant-ish tokens get parsed and replaced with parameters at startup.
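
A minimal sketch of how the token-to-parameter step described above could work, assuming a regular expression over the `${source:type(config)}` syntax. This is not the code from the FLUME-2256 patch: the class `SqlTokenParser`, its nested `Token` and `Parsed` types, and the `parse` method are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not the parser from the FLUME-2256 patch. */
final class SqlTokenParser {

    /** One parsed ${source:type(config)} token. */
    static final class Token {
        final String source; // e.g. "body", "header.timestamp", "custom"
        final String type;   // e.g. "string", "long", or a converter class name
        final String config; // e.g. "UTF-8"; null when there is no (config) part

        Token(String source, String type, String config) {
            this.source = source;
            this.type = type;
            this.config = config;
        }
    }

    /** The rewritten SQL plus its tokens, in the order the ? placeholders appear. */
    static final class Parsed {
        final String sql;
        final List<Token> tokens;

        Parsed(String sql, List<Token> tokens) {
            this.sql = sql;
            this.tokens = tokens;
        }
    }

    // Matches ${source:type} or ${source:type(config)}.
    private static final Pattern TOKEN =
            Pattern.compile("\\$\\{([^:}]+):([^(}]+)(?:\\(([^)]*)\\))?\\}");

    /** Replaces every token with a JDBC ? placeholder and records what it stood for. */
    static Parsed parse(String template) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer sql = new StringBuffer();
        List<Token> tokens = new ArrayList<>();
        while (m.find()) {
            tokens.add(new Token(m.group(1), m.group(2), m.group(3)));
            m.appendReplacement(sql, "?");
        }
        m.appendTail(sql);
        return new Parsed(sql.toString(), tokens);
    }
}
```

Under these assumptions, parsing the `sql` value from the example configuration yields the statement text `insert into twitter (body, timestamp) values (?, ?)` and two tokens, which a sink could then bind in order on a `java.sql.PreparedStatement` for each event.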
>>>
>>> The tokens have three parts.  For example, in:
>>>
>>> ${body:string(UTF-8)}
>>>
>>> The first part is a place in the event to get the value from (“body”,
>>> “header.foo”, or “custom”).  The second part (“string”) is a type
>>> identifier that converts into an appropriate JDBC parameter.  The third
>>> part (“UTF-8”) is a configuration string for that type, if needed.  As for
>>> types, so far I’ve defined:
>>>
>>> body: string (with optional charset encoding), bytearray
>>> header: string, long, int, float, double, date (with mandatory date
>>> format and optional timezone)
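
As a rough illustration of the header types listed above, the following sketch converts a header's string value into a Java object suitable for binding to a JDBC parameter. It is an assumption about how such conversion might look, not the patch's implementation: the class `HeaderValues` and its `convert` method are hypothetical, and the date format and timezone are passed as separate arguments because the message does not say how they are written inside the token's configuration string.

```java
import java.sql.Timestamp;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

/** Hypothetical helper; the real sink may convert header values differently. */
final class HeaderValues {

    /**
     * Converts a raw header string according to one of the listed type names.
     * dateFormat and timeZone are only used for "date"; timeZone may be null.
     */
    static Object convert(String type, String raw, String dateFormat, String timeZone) {
        switch (type) {
            case "string": return raw;
            case "long":   return Long.valueOf(raw);
            case "int":    return Integer.valueOf(raw);
            case "float":  return Float.valueOf(raw);
            case "double": return Double.valueOf(raw);
            case "date":   return parseDate(raw, dateFormat, timeZone);
            default:
                throw new IllegalArgumentException("Unknown header type: " + type);
        }
    }

    private static Timestamp parseDate(String raw, String dateFormat, String timeZone) {
        if (dateFormat == null) {
            // The message describes the date format as mandatory.
            throw new IllegalArgumentException("The date type requires a format");
        }
        SimpleDateFormat format = new SimpleDateFormat(dateFormat);
        format.setLenient(false);
        if (timeZone != null) {
            format.setTimeZone(TimeZone.getTimeZone(timeZone));
        }
        try {
            return new Timestamp(format.parse(raw).getTime());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse date: " + raw, e);
        }
    }
}
```

The returned objects map onto the usual `PreparedStatement` setters (`setString`, `setLong`, `setInt`, `setFloat`, `setDouble`, `setTimestamp`), or can be passed to `setObject`.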
>>>
>>> Additionally, if none of those make you happy you can define your own
>>> parameter converters:
>>>
>>> ${custom:com.company.foo.MyConverter(optionaltextconfig)}
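
The message does not show what a custom converter looks like, so the following is only a guess at one plausible shape. The `ParameterConverter` interface, its `configure` and `convert` methods, the example `UpperCaseHeaderConverter`, and the `ConverterLoader` helper are all hypothetical; the real contract in the patch may differ, and a real converter would likely receive a Flume event rather than a plain map and byte array.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical converter contract; the interface in the patch may differ. */
interface ParameterConverter {
    /** Receives the optional text inside the parentheses, or null if absent. */
    void configure(String config);

    /** Produces the value to bind to the statement parameter for one event. */
    Object convert(Map<String, String> headers, byte[] body);
}

/** Example converter: binds the upper-cased value of the header named by its config. */
final class UpperCaseHeaderConverter implements ParameterConverter {
    private String headerName;

    @Override
    public void configure(String config) {
        this.headerName = config;
    }

    @Override
    public Object convert(Map<String, String> headers, byte[] body) {
        String value = headers.get(headerName);
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }
}

/** Instantiates a converter from the class name given in a ${custom:...} token. */
final class ConverterLoader {
    static ParameterConverter load(String className, String config) {
        try {
            // Requires a no-argument constructor on the converter class.
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            ParameterConverter converter = (ParameterConverter) instance;
            converter.configure(config);
            return converter;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load converter " + className, e);
        }
    }
}
```

With this shape, a token such as `${custom:UpperCaseHeaderConverter(host)}` would load the class once at startup, pass it `host` as its configuration, and call `convert` for each event.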
>>>
>>> I know there is still improvement to be made, but I’d like to get some
>>> feedback, bug fixes, and maybe get it included before I do a bunch of
>>> useless work.  If there is interest, how would you like it for review or
>>> inclusion?
>>>
>>>  -- Jeremy
>>>
>>>
>>>
>>
>>
>