NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Flume >> mail # dev >> Questions about Morphline Solr Sink structure


Re: Questions about Morphline Solr Sink structure
Hello,

One more "proactive" question.

Isn't all the code under the .... solr/morphline package really about a
*Morphline* Sink rather than a Morphline *Solr* Sink?
In other words, if the destination Morphline writes to is dictated by the
Morphline command in the Morphline config (e.g. loadSolr()), then as far
as Flume is concerned, isn't that really just a *Morphline* Sink?

For example, if I wanted to get Flume to pass events through Morphline
and have Morphline output to Elasticsearch, I wouldn't really want to
add a whole new Elasticsearch Morphline Sink.  I should really just be
able to use the existing (misnamed?) Morphline Solr Sink and just
point it to a Morphline config that has loadElasticsearch() instead of
loadSolr().

(please ignore the fact Morphline doesn't actually have
loadElasticsearch() yet - I think this is a Morphline issue, not a
Flume issue)
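
To make the idea concrete, here is a minimal Java sketch of what I mean.
It is not the Flume or Morphline API: MorphlineSink, Command and the
registry map are hypothetical names made up for illustration, event bodies
are plain strings, and loadElasticsearch is the command that does not exist
yet. The only point is that the sink never names a store; the command name
taken from the config decides where events end up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MorphlineSinkSketch {

    /** Hypothetical stand-in for a final morphline command such as loadSolr. */
    interface Command {
        void accept(String eventBody);
    }

    /** Store-agnostic sink: it only knows the command name from its config. */
    static final class MorphlineSink {
        private final Command finalCommand;

        MorphlineSink(String commandName, Map<String, Command> registry) {
            Command command = registry.get(commandName);
            if (command == null) {
                throw new IllegalArgumentException("Unknown command: " + commandName);
            }
            this.finalCommand = command;
        }

        /** Hands the event to whatever command the config selected. */
        void process(String eventBody) {
            finalCommand.accept(eventBody);
        }
    }

    public static void main(String[] args) {
        // In-memory lists stand in for the two search backends.
        List<String> solrDocs = new ArrayList<>();
        List<String> esDocs = new ArrayList<>();

        // Hypothetical command registry; "loadElasticsearch" is assumed, not real.
        Map<String, Command> registry = new HashMap<>();
        registry.put("loadSolr", body -> solrDocs.add(body));
        registry.put("loadElasticsearch", body -> esDocs.add(body));

        // Same sink class both times; only the configured command name differs.
        new MorphlineSink("loadSolr", registry).process("event-1");
        new MorphlineSink("loadElasticsearch", registry).process("event-2");

        System.out.println("solr=" + solrDocs + " es=" + esDocs);
    }
}
```

If the real sink works roughly like this, swapping the backend would be a
config change rather than a new sink class.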

Is the above correct?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
On Sun, Nov 10, 2013 at 7:29 PM, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> Warning: I'm a Flume NG and Morphlines newbie.
>
> I was looking at Morphline Solr Sink to see how one could write an
> equivalent Morphline Elasticsearch Sink, but after looking at the
> code, I'm a bit confused.  Here are my Qs:
>
> 1)  interface MorphlineHandler mentions Solr in N places, but it
> doesn't seem to be Solr-specific.  Couldn't one reuse this interface
> for a Morphline ES Sink?
>
> 2) In general, couldn't/shouldn't a few classes from the
> org.apache.flume.sink.solr.morphline package really live outside
> anything Solr-specific, e.g. in org.apache.flume.sink.morphline for
> those that are Morphline-specific?
>
> 3) Similarly, BlobDeserializer and BlobHandler don't seem to be even
> Morphline-specific.  Shouldn't they be elsewhere?
>
> 4) I was expecting to see SolrJ (Solr Java client library) being used
> in MorphlineHandlerImpl or MorphlineSolrSink to send events to Solr,
> but there is no trace of SolrJ there.  How exactly does this load
> Flume events into Solr then?
> Ooooh, is that because when using this sink one is supposed to provide
> a Morphline config and this config has a hard-coded loadSolr()
> command?
>
> 5) Would it make sense to refactor any of the current Morphline Solr
> Sink code to make it easier to add things like a Morphline Elasticsearch
> Sink?  If so, any guidance you could provide would be very helpful.
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/