Flume >> mail # dev >> Generic JDBC Sink


Re: Generic JDBC Sink
I think the biggest problem is not that people wouldn't want to use it, but
that data wouldn't be written to the database fast enough to clear the
channels in many moderate-volume setups.

I'll follow the ticket, thanks.
On Thu, Nov 28, 2013 at 8:17 AM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:

> Hi Steve,
>
> I’ve submitted the sink for review here:
>
> http://issues.apache.org/jira/browse/FLUME-2256
>
> If it’s something that interests you, I encourage you to apply the patch
> and let me know if it meets your needs or if you find problems.
>
> So far, no movement on it…  But it’s only been a couple of days.  If Flume
> doesn’t want it (for whatever reason) I’ll just take off all of the Apache
> headers and put it up on GitHub with a similar license.  It’ll get open
> sourced one way or another, but I think folding it into Flume makes the
> most sense.
>
> -- Jeremy
>
>
> On Nov 28, 2013, at 7:39, Steve Morin <[EMAIL PROTECTED]> wrote:
>
> Jeremy,
>   I am interested in a JDBC Flume sink.  Are you open sourcing it?
> -Steve
>
>
> On Tue, Nov 26, 2013 at 8:52 PM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>
>> Is there any interest in a generic JDBC sink?
>>
>> Over the last few days I decided to try to write one.  I have something that
>> requires more testing, but seems to be working.
>>
>> Since the config file is how you’d interact with it, here’s a working
>> example from my source tree:
>>
>> a.sinks.k.type=jdbc
>> a.sinks.k.channel=c
>> a.sinks.k.driver=com.mysql.jdbc.Driver
>> a.sinks.k.url=jdbc:mysql://localhost:8889/flume
>> a.sinks.k.user=username
>> a.sinks.k.password=password
>> a.sinks.k.batchSize=100
>> a.sinks.k.sql=insert into twitter (body, timestamp) values (${body:string}, ${header.timestamp:long})
>>
>> The interesting part is the SQL statement.  You can put anything you want
>> in there - it will get converted to a prepared statement on execution.  The
>> Ant-ish tokens get parsed and replaced with parameters at startup.
>>
>> The tokens have three parts.  For example, in:
>>
>> ${body:string(UTF-8)}
>>
>> The first part is the place in the event to get the value from ("body",
>> "header.foo", or "custom").  The second part ("string") is a type
>> identifier that converts into an appropriate JDBC parameter.  The third
>> part ("UTF-8") is a configuration string for that type, if needed.  As for
>> types, so far I’ve defined:
>>
>> body: string (with optional charset encoding), bytearray
>> header: string, long, int, float, double, date (with mandatory date
>> format and optional timezone)
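
The token handling described above could be sketched roughly as follows. This is not the code from the FLUME-2256 patch. The class and field names (`TokenParser`, `Token`, `Parsed`) and the regular expression are assumptions made for illustration only. The sketch replaces each `${source:type(config)}` token with a `?` placeholder and keeps the parsed tokens in order, so a sink could later bind one JDBC parameter per token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for ${source:type(config)} tokens (hypothetical, not the patch's code). */
final class TokenParser {

    /** One parsed token; config is null when the "(...)" part is absent. */
    static final class Token {
        final String source;
        final String type;
        final String config;

        Token(String source, String type, String config) {
            this.source = source;
            this.type = type;
            this.config = config;
        }
    }

    /** The SQL with "?" placeholders plus the tokens in placeholder order. */
    static final class Parsed {
        final String sql;
        final List<Token> tokens;

        Parsed(String sql, List<Token> tokens) {
            this.sql = sql;
            this.tokens = tokens;
        }
    }

    // Matches ${ source : type ( config ) } where "(config)" is optional.
    private static final Pattern TOKEN =
            Pattern.compile("\\$\\{([^:}]+):([^(}]+)(?:\\(([^)]*)\\))?\\}");

    static Parsed parse(String template) {
        Matcher m = TOKEN.matcher(template);
        List<Token> tokens = new ArrayList<>();
        StringBuffer sql = new StringBuffer();
        while (m.find()) {
            tokens.add(new Token(m.group(1).trim(), m.group(2).trim(), m.group(3)));
            // Each token becomes one positional JDBC parameter.
            m.appendReplacement(sql, "?");
        }
        m.appendTail(sql);
        return new Parsed(sql.toString(), tokens);
    }
}
```

The resulting `sql` string is in the form that `Connection.prepareStatement` accepts, and the position of each token in `tokens` (plus one) is the JDBC parameter index it would be bound to.
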
>>
>> Additionally, if none of those make you happy you can define your own
>> parameter converters:
>>
>> ${custom:com.company.foo.MyConverter(optionaltextconfig)}
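
What a custom converter might look like is sketched below. The message does not show the converter contract, so everything here is hypothetical: the `ParameterConverter` interface, its `configure` and `convert` methods, the `TruncatingBodyConverter` example, and the reflective `ConverterLoader`. The event is represented only by a body byte array and a header map, to keep the sketch free of Flume dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical converter contract; the interface in the actual patch may differ. */
interface ParameterConverter {
    /** Receives the optional text from the parentheses in the token (may be null). */
    void configure(String config);

    /** Returns the value to bind, e.g. with PreparedStatement.setObject(index, value). */
    Object convert(byte[] body, Map<String, String> headers);
}

/** Example converter: binds the body as UTF-8 text cut to a maximum length. */
final class TruncatingBodyConverter implements ParameterConverter {
    private int maxLength = 255; // assumed default when no config is given

    @Override
    public void configure(String config) {
        if (config != null && !config.trim().isEmpty()) {
            maxLength = Integer.parseInt(config.trim());
        }
    }

    @Override
    public Object convert(byte[] body, Map<String, String> headers) {
        String text = new String(body, StandardCharsets.UTF_8);
        return text.length() <= maxLength ? text : text.substring(0, maxLength);
    }
}

/** Instantiates a converter from the class name given in a ${custom:...} token. */
final class ConverterLoader {
    static ParameterConverter load(String className, String config) {
        try {
            ParameterConverter converter = (ParameterConverter)
                    Class.forName(className).getDeclaredConstructor().newInstance();
            converter.configure(config);
            return converter;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load converter " + className, e);
        }
    }
}
```

Under these assumptions, a token such as `${custom:TruncatingBodyConverter(100)}` would load the class by name once at startup and call `convert` for each event.
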
>>
>> I know there is still improvement to be made, but I’d like to get some
>> feedback, bug fixes, and maybe get it included before I do a bunch of
>> useless work.  If there is interest, how would you like it for review or
>> inclusion?
>>
>>  -- Jeremy
>>
>>
>>
>
>