Flume, mail # dev - Generic JDBC Sink


Re: Generic JDBC Sink
Jeremy Karlson 2013-11-28, 16:17
Hi Steve,

I’ve submitted the sink for review here:

http://issues.apache.org/jira/browse/FLUME-2256

If it’s something that interests you, I encourage you to apply the patch and let me know if it meets your needs or if you find problems.

So far, no movement on it…  But it’s only been a couple of days.  If Flume doesn’t want it (for whatever reason) I’ll just take off all of the Apache headers and put it up on GitHub with a similar license.  It’ll get open sourced one way or another, but I think folding it into Flume makes the most sense.

-- Jeremy
On Nov 28, 2013, at 7:39, Steve Morin <[EMAIL PROTECTED]> wrote:

> Jeremy,
>   I am interested in a JDBC Flume sink. Are you open sourcing it?
> -Steve
>
>
> On Tue, Nov 26, 2013 at 8:52 PM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
> Is there any interest in a generic JDBC sink?
>
> Over the last few days I decided to try to write one.  I have something that requires more testing, but seems to be working.
>
> Since the config file is how you’d interact with it, here’s a working example from my source tree:
>
> a.sinks.k.type=jdbc
> a.sinks.k.channel=c
> a.sinks.k.driver=com.mysql.jdbc.Driver
> a.sinks.k.url=jdbc:mysql://localhost:8889/flume
> a.sinks.k.user=username
> a.sinks.k.password=password
> a.sinks.k.batchSize=100
> a.sinks.k.sql=insert into twitter (body, timestamp) values (${body:string}, ${header.timestamp:long})
>
> The interesting part is the SQL statement.  You can put anything you want in there - it will get converted to a prepared statement on execution.  The Ant-ish tokens get parsed and replaced with parameters at startup.
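
A minimal Java sketch of that startup step, assuming a regex-based parser. It rewrites each token to a "?" placeholder and records the tokens in parameter order. The names SqlTemplate and Token are hypothetical and are not taken from the FLUME-2256 patch, whose parser may work differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the code from the FLUME-2256 patch. */
final class SqlTemplate {

    /** One parsed token, e.g. source "header.timestamp", type "long", config null. */
    static final class Token {
        final String source;
        final String type;
        final String config; // null when the token has no "(...)" part

        Token(String source, String type, String config) {
            this.source = source;
            this.type = type;
            this.config = config;
        }
    }

    // Matches ${source:type} and ${source:type(config)}.
    private static final Pattern TOKEN =
            Pattern.compile("\\$\\{([^:}]+):([^(}]+)(?:\\(([^)]*)\\))?\\}");

    final String sql;         // the SQL with every token replaced by "?"
    final List<Token> tokens; // tokens in JDBC parameter order

    private SqlTemplate(String sql, List<Token> tokens) {
        this.sql = sql;
        this.tokens = tokens;
    }

    /** Replaces every token with "?" and collects the tokens left to right. */
    static SqlTemplate parse(String template) {
        Matcher m = TOKEN.matcher(template);
        StringBuilder out = new StringBuilder();
        List<Token> found = new ArrayList<>();
        while (m.find()) {
            found.add(new Token(m.group(1).trim(), m.group(2).trim(), m.group(3)));
            m.appendReplacement(out, "?");
        }
        m.appendTail(out);
        return new SqlTemplate(out.toString(), Collections.unmodifiableList(found));
    }
}
```

Parsing the sql line from the config above this way would give "insert into twitter (body, timestamp) values (?, ?)" plus two tokens. That string is the kind of input Connection.prepareStatement expects.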
>
> Each token has three parts.  For example, in:
>
> ${body:string(UTF-8)}
>
> The first part is the place in the event to get the value from (“body”, “header.foo”, or “custom”).  The second part (“string”) is a type identifier that converts into an appropriate JDBC parameter.  The third part (“UTF-8”) is a configuration string for that type, if needed.  As for types, so far I’ve defined:
>
> body: string (with optional charset encoding), bytearray
> header: string, long, int, float, double, date (with mandatory date format and optional timezone)
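
The type list above could map onto Java values roughly as follows. This is a sketch under stated assumptions, not the patch's code: ParameterValues is a hypothetical name, UTF-8 as the default charset is a guess, and the date case fixes the timezone to UTC because the message does not show how the optional timezone is written in the config string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.sql.Timestamp;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

/** Hypothetical sketch; not the code from the FLUME-2256 patch. */
final class ParameterValues {
    private ParameterValues() {}

    /** Converts the event body; config is an optional charset name for "string". */
    static Object fromBody(byte[] body, String type, String config) {
        switch (type) {
            case "string":
                // Assumption: fall back to UTF-8 when no charset is configured.
                Charset charset = (config == null || config.isEmpty())
                        ? StandardCharsets.UTF_8
                        : Charset.forName(config);
                return new String(body, charset);
            case "bytearray":
                return body;
            default:
                throw new IllegalArgumentException("Unsupported body type: " + type);
        }
    }

    /** Converts one header value; config is the date pattern for "date". */
    static Object fromHeader(String value, String type, String config) {
        switch (type) {
            case "string":
                return value;
            case "long":
                return Long.valueOf(value);
            case "int":
                return Integer.valueOf(value);
            case "float":
                return Float.valueOf(value);
            case "double":
                return Double.valueOf(value);
            case "date":
                if (config == null || config.isEmpty()) {
                    throw new IllegalArgumentException("date requires a format");
                }
                SimpleDateFormat format = new SimpleDateFormat(config);
                // Simplification: always UTC; timezone config is not modelled here.
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                try {
                    return new Timestamp(format.parse(value).getTime());
                } catch (ParseException e) {
                    throw new IllegalArgumentException("Bad date: " + value, e);
                }
            default:
                throw new IllegalArgumentException("Unsupported header type: " + type);
        }
    }
}
```

Each returned object could then be bound with PreparedStatement.setObject(index, value), leaving the final SQL type mapping to the JDBC driver.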
>
> Additionally, if none of those make you happy, you can define your own parameter converters:
>
> ${custom:com.company.foo.MyConverter(optionaltextconfig)}
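
One way a custom token like this might be resolved is to load the named class by reflection and hand it the text in parentheses. The sketch below assumes a converter contract with configure and convert methods. That interface, the CustomConverters holder and the UpperCaseBody example are all hypothetical; the interface in the actual patch may look different.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; not the code from the FLUME-2256 patch. */
final class CustomConverters {
    private CustomConverters() {}

    /** Assumed contract for a user-defined converter. */
    interface ParameterConverter {
        /** Receives the text in parentheses, or null when the token has none. */
        void configure(String config);

        /** Returns the value to bind as the JDBC parameter. */
        Object convert(byte[] body, Map<String, String> headers);
    }

    /** Example converter: binds the body as upper-cased UTF-8 text. */
    static final class UpperCaseBody implements ParameterConverter {
        @Override
        public void configure(String config) {
            // This example takes no options.
        }

        @Override
        public Object convert(byte[] body, Map<String, String> headers) {
            return new String(body, StandardCharsets.UTF_8).toUpperCase(Locale.ROOT);
        }
    }

    /** Instantiates the named class via its no-argument constructor and configures it. */
    static ParameterConverter load(String className, String config) {
        try {
            Class<? extends ParameterConverter> type =
                    Class.forName(className).asSubclass(ParameterConverter.class);
            ParameterConverter converter = type.getDeclaredConstructor().newInstance();
            converter.configure(config);
            return converter;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load converter " + className, e);
        }
    }
}
```

Loading once at startup and reusing the instance per event would match the "parsed at startup" behaviour described earlier in the message, though whether the patch does this is not stated.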
>
> I know there is still improvement to be made, but I’d like to get some feedback, bug fixes, and maybe get it included before I do a bunch of useless work.  If there is interest, how would you like it for review or inclusion?
>
>  -- Jeremy