Flume >> mail # dev >> Generic JDBC Sink


Re: Generic JDBC Sink
I suppose that really depends on the usage scenario.  There are a hundred
things that may affect the ability of the Flume chain to keep up with
incoming data, only one of which is the sink being a JDBC connection.  I
think for cases like mine where the data is structured and of a reasonable
volume, a JDBC connection makes sense.

I guess what I'm saying is that if someone uses it without thinking about or
testing what they're doing with it, that's not a problem with JDBC, the
sink, or Flume.  It's a problem with the operator.  :-P

-- Jeremy

On Thu, Nov 28, 2013 at 8:33 AM, Steve Morin <[EMAIL PROTECTED]> wrote:

> I think the biggest problem is not that people wouldn't want to use it, but
> that data wouldn't be written to the DB fast enough to drain the channels
> in many moderate-volume setups.
>
> I'll follow the ticket, thanks.
>
>
> On Thu, Nov 28, 2013 at 8:17 AM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>
>> Hi Steve,
>>
>> I’ve submitted the sink for review here:
>>
>> http://issues.apache.org/jira/browse/FLUME-2256
>>
>> If it’s something that interests you, I encourage you to apply the patch
>> and let me know if it meets your needs or if you find problems.
>>
>> So far, no movement on it…  But it’s only been a couple of days.  If
>> Flume doesn’t want it (for whatever reason) I’ll just take off all of the
>> Apache headers and put it up on GitHub with a similar license.  It’ll get
>> open sourced one way or another, but I think folding it into Flume makes
>> the most sense.
>>
>> -- Jeremy
>>
>>
>> On Nov 28, 2013, at 7:39, Steve Morin <[EMAIL PROTECTED]> wrote:
>>
>> Jeremy,
>>   I am interested in a JDBC Flume sink. Are you open sourcing it?
>> -Steve
>>
>>
>> On Tue, Nov 26, 2013 at 8:52 PM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>>
>>> Is there any interest in a generic JDBC sink?
>>>
>>> Over the last few days I decided to try to write one.  I have something
>>> that requires more testing, but seems to be working.
>>>
>>> Since the config file is how you’d interact with it, here’s a working
>>> example from my source tree:
>>>
>>> a.sinks.k.type=jdbc
>>> a.sinks.k.channel=c
>>> a.sinks.k.driver=com.mysql.jdbc.Driver
>>> a.sinks.k.url=jdbc:mysql://localhost:8889/flume
>>> a.sinks.k.user=username
>>> a.sinks.k.password=password
>>> a.sinks.k.batchSize=100
>>> a.sinks.k.sql=insert into twitter (body, timestamp) values (${body:string}, ${header.timestamp:long})
>>>
>>> The interesting part is the SQL statement.  You can put anything you
>>> want in there - it will get converted to a prepared statement on execution.
>>> The Ant-ish tokens get parsed and replaced with parameters at startup.
>>>
>>> The tokens have three parts.  For example, in:
>>>
>>> ${body:string(UTF-8)}
>>>
>>> The first part is the place in the event to get the value from ("body",
>>> "header.foo", or "custom").  The second part ("string") is a type
>>> identifier that converts into an appropriate JDBC parameter.  The third
>>> part ("UTF-8") is a configuration string for that type, if needed.  As for
>>> types, so far I've defined:
>>>
>>> body: string (with optional charset encoding), bytearray
>>> header: string, long, int, float, double, date (with mandatory date
>>> format and optional timezone)
>>>
>>> Additionally, if none of those make you happy, you can define your own
>>> parameter converters:
>>>
>>> ${custom:com.company.foo.MyConverter(optionaltextconfig)}
>>>
>>> I know there is still improvement to be made, but I’d like to get some
>>> feedback, bug fixes, and maybe get it included before I do a bunch of
>>> useless work.  If there is interest, how would you like it for review or
>>> inclusion?
>>>
>>>  -- Jeremy
>>>
>>>
>>>
>>
>>
>
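
For readers following the token discussion in the original message, here is a
rough sketch of how tokens of the form ${source:type(config)} could be turned
into "?" placeholders plus an ordered parameter list. It is written in Python
purely for illustration and is not the code from the FLUME-2256 patch. The
regular expression, the names Param, parse_sql and convert, and the UTF-8
default charset are all assumptions inferred from the examples in the thread.
Date and custom converters are left out.

```python
import re
from typing import NamedTuple, Optional

# Assumed token grammar, inferred from the thread's examples:
#   ${source:type}  or  ${source:type(config)}
TOKEN = re.compile(r"\$\{([^:}]+):([^(}]+)(?:\(([^)]*)\))?\}")


class Param(NamedTuple):
    """One parsed token (hypothetical structure, for illustration only)."""
    source: str            # "body", "header.<name>", or "custom"
    type_name: str         # e.g. "string", "long", or a converter class name
    config: Optional[str]  # optional text in parentheses, e.g. "UTF-8"


def parse_sql(template):
    """Replace each token with '?' and collect the tokens in order."""
    params = []

    def to_placeholder(match):
        source, type_name, config = match.groups()
        params.append(Param(source, type_name, config))
        return "?"

    return TOKEN.sub(to_placeholder, template), params


def convert(param, headers, body):
    """Turn one event field into a value suitable for binding.

    headers is a dict of strings and body is bytes, mirroring a Flume event.
    Only a subset of the types listed in the thread is handled here.
    """
    if param.source == "body":
        if param.type_name == "string":
            # Assumption: fall back to UTF-8 when no charset is configured.
            return body.decode(param.config or "utf-8")
        if param.type_name == "bytearray":
            return body
    elif param.source.startswith("header."):
        raw = headers[param.source[len("header."):]]
        casts = {"string": str, "long": int, "int": int,
                 "float": float, "double": float}
        if param.type_name in casts:
            return casts[param.type_name](raw)
    raise ValueError(f"unsupported token: {param}")


if __name__ == "__main__":
    sql, params = parse_sql(
        "insert into twitter (body, timestamp) values "
        "(${body:string}, ${header.timestamp:long})"
    )
    print(sql)
    print([convert(p, {"timestamp": "1385484720000"}, b"hello") for p in params])
```

Parsing once and binding per event matches the description in the thread,
where tokens are resolved at startup and the statement is executed as a
prepared statement. How the actual sink handles batching, missing headers, or
custom converter classes is not shown in the thread and is not modeled here.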