Flume, mail # dev - Regarding the adding of additional sinks/sources for various DB's


Regarding the adding of additional sinks/sources for various DB's
Juhani Connolly 2013-11-25, 08:14
Hey guys,

What I write here is just my personal opinion, and I'm writing in
hopes of starting a discussion and/or getting feedback. I know I've not
been very active on the project recently (due to other engagements), but
I do still want it to succeed and hope to find more time for it eventually.

Right now I see new/active issues for the addition of Redis and Kafka
sinks, and while they're nice features, I'm personally concerned about
feature bloat in the project. There are dozens of interceptors, sinks
and sources one could think of, but most of them are tied to one
particular use case.

Every time we add a new component we're also committing to maintaining
it over future releases, even if the original contributor gets too busy
for it. The more such components get added, the more we will get
distracted from improving core features and getting rid of issues
affecting them.

For these reasons I generally haven't submitted components we developed
for internal use (because they are too specific to our use cases), and
have only passed back changes that fix bugs or apply to the core project.

With that in mind, I think we may want to consider either a) being more
selective regarding additional component submissions or b) adding a
contrib directory to the project which includes the components but
doesn't guarantee ongoing maintenance or compatibility.

On the flip side, of course, approaches like this may discourage new
contributors and could thus be considered a negative. If many people
feel this way, they should definitely share their thoughts.

I'd be curious to know what others think, and what direction they hope
to take the project in the future.